Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0
4.75k stars, 456 forks

Question: Do you have plans to train a more lightweight model? #82

Closed: de-code closed this issue 2 years ago

de-code commented 2 years ago

Motivation

(Apologies if this is not the right place to ask questions.) The faster_rcnn_R_50_FPN_3x (PubLayNet) model seems to be quite slow on a CPU: around 3 seconds per image locally, and more than 6 seconds in Google Colab (it's around 350 ms with a GPU, though). A model that runs at a more reasonable speed on a CPU could make the toolkit more "accessible". It would also make the downloads smaller, both for the model and for PyTorch itself. I was wondering whether you have any plans to train a smaller model on one of the related datasets?

Related resources

Something like YOLOv5s perhaps?

Additional context

n/a

lolipopshock commented 2 years ago

Thanks for bringing this up. Yes, we do have plans to extend to more models. In the latest updates we've added models based on EfficientDet, which I've found to be 90% faster on CPUs, with smaller model weights and a much easier installation. Would you like to give it a try? Use lp.AutoLayoutModel("lp://efficientdet/PubLayNet").

PS: this requires upgrading layoutparser to the latest version: pip install -U layoutparser[effdet].
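For anyone trying this suggestion, here is a minimal sketch of how the EfficientDet model might be loaded and its output narrowed to one block type. The helper names `load_layout_model` and `blocks_of_type` are hypothetical, and the "Figure" label is an assumption about the PubLayNet label map; check the labels your model actually returns.

```python
def load_layout_model(config_path="lp://efficientdet/PubLayNet"):
    """Load a layout model from a layoutparser config path.

    The import is done lazily so that the filtering helper below can be
    used (and tested) without layoutparser installed.
    Assumes: pip install -U "layoutparser[effdet]"
    """
    import layoutparser as lp

    return lp.AutoLayoutModel(config_path)


def blocks_of_type(layout, wanted="Figure"):
    """Keep only detected blocks whose .type equals `wanted`.

    "Figure" is an assumed label name, not confirmed in this thread.
    """
    return [block for block in layout if getattr(block, "type", None) == wanted]


# Possible usage (not run here; needs layoutparser and a model download):
# model = load_layout_model()
# layout = model.detect(image)      # image: a PIL image or a numpy array
# figures = blocks_of_type(layout)
```

Keeping the import inside the loader means a script can still be imported on a machine where the optional effdet extras are missing.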

de-code commented 2 years ago

That is perfect, thank you. I should have explored the available models more. It is indeed much faster and usable (I got around 220 ms). I haven't seen any noticeable reduction in detection quality for my current use case (detecting figures), although I have only looked at one document so far.
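To reproduce per-image timings like the ones quoted in this thread, a small helper along these lines could be used. It is only a sketch: `median_latency_ms` is a hypothetical name, and `detect` stands for any callable that takes one image, for example a layoutparser model's `detect` method.

```python
import time
from statistics import median


def median_latency_ms(detect, images, warmup=1):
    """Return the median wall-clock time of detect(image) in milliseconds.

    The first `warmup` images are run once untimed, so one-off costs such
    as lazy initialisation do not skew the measurement.
    """
    images = list(images)
    if not images:
        raise ValueError("need at least one image to time")

    for image in images[:warmup]:
        detect(image)

    timings = []
    for image in images:
        start = time.perf_counter()
        detect(image)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


# Possible usage (model and pages are placeholders, not defined here):
# print(median_latency_ms(model.detect, pages))
```

The median is used rather than the mean so that a single slow run has less influence on the reported number.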

vikhil0609 commented 1 year ago

Can AutoLayoutModel("lp://efficientdet/PubLayNet") also be used with lp.Detectron2LayoutModel? I have trained a custom model and I am trying to speed up layout detection at prediction time. I would be glad if someone who knows this better could help.