robertknight / ocrs-models

PyTorch models for the ocrs OCR engine

Is layout analysis trainable and working? #32

Open josef821 opened 1 month ago

josef821 commented 1 month ago

Hi. I want to train and use your layout analysis model. Does it work? Do you have any sample output from it? Thanks.

robertknight commented 1 month ago

The layout analysis model is a non-functional experiment at present. The rough idea will be to implement something like https://arxiv.org/abs/2203.09638.

Depending on your needs, an interim solution may be to treat layout analysis as an object detection problem. Various pre-trained models exist for this, along with resources for training your own. See https://huggingface.co/Oblix/yolov10m-doclaynet_ONNX_document-layout-analysis for example. The underlying inference engine that Ocrs uses can also run many of these models (including YOLO). There are some object detection examples here.
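
To make the object detection approach a bit more concrete, here is a minimal sketch of turning raw detector output into labelled layout regions. It assumes the model yields rows of `(x1, y1, x2, y2, score, class_id)`, which is a common format but needs checking against whichever model you use. The `LABELS` list (the 11 DocLayNet categories in alphabetical order), the `Region` type, the `to_regions` helper and the sample detections are all hypothetical and for illustration only.

```python
from dataclasses import dataclass

# Hypothetical label list: the 11 DocLayNet categories in alphabetical order.
# The real index-to-label mapping depends on how the model was exported.
LABELS = [
    "Caption", "Footnote", "Formula", "List-item", "Page-footer",
    "Page-header", "Picture", "Section-header", "Table", "Text", "Title",
]


@dataclass
class Region:
    label: str
    score: float
    box: tuple  # (x1, y1, x2, y2) in pixels


def to_regions(rows, min_score=0.5):
    """Convert raw detection rows into labelled layout regions.

    Assumes each row is (x1, y1, x2, y2, score, class_id); verify this
    against the output format of the model you actually run.
    """
    regions = []
    for x1, y1, x2, y2, score, class_id in rows:
        if score < min_score:
            continue  # drop low-confidence detections
        regions.append(
            Region(LABELS[int(class_id)], float(score), (x1, y1, x2, y2))
        )
    # Naive reading order: top-to-bottom, then left-to-right.
    # Multi-column pages would need something smarter.
    regions.sort(key=lambda r: (r.box[1], r.box[0]))
    return regions


# Made-up detections, not real model output.
rows = [
    (50, 400, 550, 700, 0.91, 9),   # Text
    (50, 40, 550, 90, 0.88, 10),    # Title
    (50, 120, 550, 380, 0.30, 6),   # Picture, below the default threshold
]

for region in to_regions(rows):
    print(region.label, region.box)
```

Each resulting region could then be cropped from the page image and passed to text detection and recognition separately, with non-text regions such as pictures skipped.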