Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

Use ONNX models to avoid installing Detectron2 #57

Open BobLd opened 3 years ago

BobLd commented 3 years ago

Motivation To ease installation for Windows users (i.e. to avoid installing Detectron2 just to use pre-trained models), why not convert the Detectron2 models to ONNX? This would also allow your trained models to be used from other languages, e.g. C#/.NET. The converted model was also half the size (408MB for the .onnx versus 816MB for the .pth).

Related resources I've created a repo here with a simple PoC notebook explaining how to convert the Detectron2 model to ONNX and how to use the ONNX model (the model used was mask_rcnn_X_101_32x8d_FPN_3x).

It uses the export_model.py tool available in the detectron2 repo here.

I managed to convert the model using the following command:

python export_model.py --sample-image ...\layout-parser\data\foo.0_raw.png --config-file .../layout-parser/models/PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/config.yaml --output ./output --export-method caffe2_tracing --format onnx MODEL.WEIGHTS .../layout-parser/models/PubLayNet/mask_rcnn_X_101_32x8d_FPN_3x/model_final.pth MODEL.DEVICE cpu

Additional context The differences between the original model and the exported model would need to be understood, as the conversion might not implement every post-processing step.

lolipopshock commented 3 years ago

Thanks, that's a good point. I tried converting Detectron2 to ONNX or TorchScript earlier this year, and the conversion was not great: as you said, Detectron2 requires a lot of post-processing steps, and this broke the code when I tried. My current work is focused on supporting multiple backends for the models, some of which are much easier to install. Nevertheless, I think it's still worth working a bit more in this direction (ONNX models) in the long term. I will probably get back to this issue in Q4 this year? By then, we can figure out a conversion strategy that works for multiple backends and further simplifies the installation?

(PS: unfortunately I could not check your repo as it's private...)

BobLd commented 3 years ago

Sounds good! I've made the repo public, let me know if you have any questions

EDIT: It seems that the output differences between the two models come from the fact that the image used in the ONNX model is resized. If the image passed to Detectron2LayoutModel is also resized, the outputs seem to be the same.
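For anyone comparing the two outputs, here is a minimal sketch of the arithmetic involved. It assumes the export resizes the sample image with a shortest-edge rule (shortest side scaled to 800, longest side capped at 1333); those two values are assumptions, not read from this model's config.yaml, and should be checked there. The helper names `resized_shape` and `boxes_to_original` are hypothetical and not part of layout-parser or Detectron2.

```python
def resized_shape(height, width, short_edge=800, max_size=1333):
    """Return (new_height, new_width) for a shortest-edge resize.

    short_edge and max_size are assumed values; check the model's config.yaml.
    """
    scale = short_edge / min(height, width)
    if height < width:
        new_h, new_w = float(short_edge), scale * width
    else:
        new_h, new_w = scale * height, float(short_edge)
    # Shrink again if the longer side exceeds the cap.
    if max(new_h, new_w) > max_size:
        shrink = max_size / max(new_h, new_w)
        new_h, new_w = new_h * shrink, new_w * shrink
    return int(new_h + 0.5), int(new_w + 0.5)


def boxes_to_original(boxes, original_shape, resized):
    """Map (x1, y1, x2, y2) boxes from resized-image to original-image pixels."""
    orig_h, orig_w = original_shape
    new_h, new_w = resized
    sx, sy = orig_w / new_w, orig_h / new_h
    return [(x1 * sx, y1 * sy, x2 * sx, y2 * sy) for x1, y1, x2, y2 in boxes]


# Example: a 600x400 (height x width) page is resized to 1200x800.
shape = resized_shape(600, 400)
boxes = boxes_to_original([(100, 200, 300, 400)], (600, 400), shape)
```

Boxes predicted on the resized image can be mapped back this way before comparing them with boxes predicted on the original image.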

trand2k commented 4 months ago

Hi author, could you share the Detectron2 ONNX model with me?

mhrrs commented 1 month ago

@trand2k Any update on this?