microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License
20.1k stars 2.55k forks

How to use LayoutLMv3 in an industry environment? #765

Closed matthew-wei closed 2 years ago

matthew-wei commented 2 years ago

Describe Model I am using (UniLM, MiniLM, LayoutLM ...):

I want to use LayoutLMv3 for the Document Layout Detection task (https://github.com/microsoft/unilm/tree/master/layoutlmv3), but I cannot find a way to deploy the model with acceleration. There is no information about deployment, for example converting the model to ONNX or TensorRT.

Could you please explain how to accelerate LayoutLMv3? @HYPJUDY

wolfshow commented 2 years ago

@matthew-wei Exporting LayoutLMv3 models to ONNX is not difficult, because LayoutLMv3 only uses standard operators available in Transformers.

matthew-wei commented 2 years ago

> @matthew-wei Exporting LayoutLMv3 models to ONNX is not difficult, because LayoutLMv3 only uses standard operators available in Transformers.

However, I find it difficult to export LayoutLMv3 for Document Layout Detection, because that task uses code from DiT.

I found this in the paper (https://arxiv.org/abs/2204.08387): "We integrate the LayoutLMv3 as feature backbone in the Cascade R-CNN detector [4] with FPN [31] implemented using the Detectron2 [46]. We adopt the standard practice to extract single-scale features from different Transformer layers, such as layers 4, 6, 8, and 12 of the LayoutLMv3 base model, and use resolution-modifying modules to convert the single-scale features into the multiscale FPN features [1, 27, 30]."

I can find LayoutLMv3 in Transformers, but I cannot find an example or source code for Document Layout Detection there.
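For readers trying to picture the paper passage quoted above, here is a minimal shape-bookkeeping sketch. The layer indices (4, 6, 8, 12) come from the quote, and the 16-pixel patch size and hidden size of 768 match the LayoutLMv3 base model. The 4x / 2x / 1x / 0.5x rescaling factors are an assumption borrowed from the common recipe for using a plain ViT as a detection backbone; nothing in this thread confirms them. The function name `fpn_input_shapes` is hypothetical.

```python
# Sketch: spatial size of each feature map handed to the FPN.
# Layer indices are from the paper quote. SCALE_FACTORS is an ASSUMPTION
# (the usual ViT detection recipe), not something stated in this thread.

PATCH_SIZE = 16                        # LayoutLMv3 splits the image into 16x16 patches
TAPPED_LAYERS = (4, 6, 8, 12)          # 1-based Transformer layers named in the paper
SCALE_FACTORS = (4.0, 2.0, 1.0, 0.5)   # assumed rescaling applied to each tapped layer


def fpn_input_shapes(height, width, hidden=768):
    """Return the stride and (C, H, W) shape of each rescaled feature map."""
    grid_h, grid_w = height // PATCH_SIZE, width // PATCH_SIZE
    shapes = {}
    for layer, scale in zip(TAPPED_LAYERS, SCALE_FACTORS):
        h, w = int(grid_h * scale), int(grid_w * scale)
        stride = int(PATCH_SIZE / scale)   # pixels per feature cell after rescaling
        shapes[f"layer{layer}"] = {"stride": stride, "shape": (hidden, h, w)}
    return shapes


if __name__ == "__main__":
    for name, info in fpn_input_shapes(224, 224).items():
        print(name, info)
```

Under these assumptions, a 224x224 input gives a 14x14 patch grid, and the four tapped layers become feature maps with strides 4, 8, 16, and 32. The rescaling modules (for example transposed convolutions and pooling) live in the detection code rather than in the Transformers model class, which would explain why a plain model export does not cover the detection pipeline.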

wolfshow commented 2 years ago

You may find information at https://github.com/microsoft/unilm/tree/master/layoutlmv3

regisss commented 2 years ago

You can easily deploy LayoutLMv3 to the ONNX format using the Hugging Face Transformers library, see here.
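The linked page is not reproduced in this thread. As a rough sketch of one possible export route, the code below wraps `LayoutLMv3Model` from Transformers and calls `torch.onnx.export` directly. The checkpoint name `microsoft/layoutlmv3-base`, the opset version, the dummy input shapes, the output file name, and the helper names `dynamic_axes` and `export_layoutlmv3` are all illustrative assumptions. The export has not been verified against any particular library version. It covers only the plain encoder, not the Detectron2-based layout detection pipeline discussed earlier in the thread.

```python
# Sketch only: export the LayoutLMv3 encoder to ONNX.
# Assumes torch, transformers, and onnx are installed. Heavy imports are
# deferred into the export function so the helper can be used on its own.

INPUT_NAMES = ["input_ids", "bbox", "attention_mask", "pixel_values"]


def dynamic_axes(input_names=INPUT_NAMES):
    """Mark batch (and, for token inputs, sequence) dimensions as dynamic."""
    axes = {}
    for name in input_names:
        if name == "pixel_values":
            axes[name] = {0: "batch"}                 # image size stays fixed
        else:
            axes[name] = {0: "batch", 1: "sequence"}
    # The output concatenates text tokens and image patch tokens on axis 1.
    axes["last_hidden_state"] = {0: "batch", 1: "tokens"}
    return axes


def export_layoutlmv3(checkpoint="microsoft/layoutlmv3-base",
                      path="layoutlmv3.onnx", seq_len=16, opset=14):
    import torch
    from transformers import LayoutLMv3Model

    model = LayoutLMv3Model.from_pretrained(checkpoint).eval()

    class Wrapper(torch.nn.Module):
        """Fixes the positional argument order and returns a plain tensor."""

        def __init__(self, inner):
            super().__init__()
            self.inner = inner

        def forward(self, input_ids, bbox, attention_mask, pixel_values):
            out = self.inner(input_ids=input_ids, bbox=bbox,
                             attention_mask=attention_mask,
                             pixel_values=pixel_values, return_dict=False)
            return out[0]  # last_hidden_state

    dummy = (
        torch.randint(3, 1000, (1, seq_len)),            # input_ids (arbitrary ids)
        torch.zeros(1, seq_len, 4, dtype=torch.long),    # bbox, coordinates in 0..1000
        torch.ones(1, seq_len, dtype=torch.long),        # attention_mask
        torch.zeros(1, 3, 224, 224),                     # pixel_values
    )
    torch.onnx.export(Wrapper(model), dummy, path,
                      input_names=INPUT_NAMES,
                      output_names=["last_hidden_state"],
                      dynamic_axes=dynamic_axes(),
                      opset_version=opset)
    return path
```

Calling `export_layoutlmv3()` downloads the checkpoint and writes the ONNX file. Comparing the ONNX Runtime output against the PyTorch output on a real sample is a sensible check before relying on the exported model.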

moyans commented 2 years ago

> You can easily deploy LayoutLMv3 to the ONNX format using the Hugging Face Transformers library, see here.

@regisss tokenization_layoutlmv3 does not support the "layoutlmv3-base-chinese" model (see here). Will this be fixed later?

regisss commented 2 years ago

> > You can easily deploy LayoutLMv3 to the ONNX format using the Hugging Face Transformers library, see here.
>
> @regisss tokenization_layoutlmv3 does not support the "layoutlmv3-base-chinese" model (see here). Will this be fixed later?

I'm not sure about this. Could you ask @NielsRogge in the issue you pointed to?

iweirman commented 1 year ago

@matthew-wei I am also grappling with this problem and was curious if you've come across any novel solutions.

qrsssh commented 2 months ago

> @matthew-wei I am also grappling with this problem and was curious if you've come across any novel solutions.

How did you solve it, please?

murilosimao commented 2 months ago

> @matthew-wei I am also grappling with this problem and was curious if you've come across any novel solutions.

https://github.com/microsoft/unilm/issues/1274