Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

inconsistent num classes in PrimaLayout #176

Open bertsky opened 1 year ago

bertsky commented 1 year ago

The configuration for the PrimaLayout MRCNN Detectron2 model …

https://github.com/Layout-Parser/layout-parser/blob/04e28168d820eea3a1ff1e098078323e7b48648b/src/layoutparser/models/detectron2/catalog.py#L57

… contains a setting MODEL.ROI_HEADS.NUM_CLASSES: 7.

But the label map that LP actually uses at runtime for the region types contains only 6 classes:

https://github.com/Layout-Parser/layout-parser/blob/04e28168d820eea3a1ff1e098078323e7b48648b/src/layoutparser/models/detectron2/catalog.py#L88-L95

According to the converter for training on PRImA … https://github.com/Layout-Parser/layout-model-training/blob/b9fad076596272e427612d5e848da1ba8ea06b97/tools/convert_prima_to_coco.py#L36-L64

… the difference is in the background class (0).

IMO that's an error (either in the Detectron2 config file or in the LP catalog's label map). It does not raise an error at runtime, but for the additional/missing class 0 you'll just get the class number instead of a class string.
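
To illustrate the mismatch, here is a minimal sketch in plain Python. It does not call LP or Detectron2; `NUM_CLASSES`, `LABEL_MAP`, and both helper functions are hypothetical stand-ins. The region names are placeholders for the 6 catalog entries, and the fallback-to-number lookup is an assumption based on the behaviour described above.

```python
# Hypothetical stand-ins for the two settings discussed in this issue.
NUM_CLASSES = 7  # as in MODEL.ROI_HEADS.NUM_CLASSES
LABEL_MAP = {    # 6 entries, ids 1..6; names are illustrative placeholders
    1: "TextRegion",
    2: "ImageRegion",
    3: "TableRegion",
    4: "MathsRegion",
    5: "SeparatorRegion",
    6: "OtherRegion",
}


def unlabelled_class_ids(num_classes, label_map):
    """Return class ids in range(num_classes) that have no label string."""
    return sorted(set(range(num_classes)) - set(label_map))


def class_name(class_id, label_map):
    """Look up a label, falling back to the raw id (assumed behaviour)."""
    return label_map.get(class_id, class_id)


missing = unlabelled_class_ids(NUM_CLASSES, LABEL_MAP)
print("class ids without a label:", missing)
print("class 0 resolves to:", repr(class_name(0, LABEL_MAP)))
print("class 1 resolves to:", repr(class_name(1, LABEL_MAP)))
```

A check like `unlabelled_class_ids` could run when the model is loaded, so that the inconsistency is reported up front rather than surfacing as a bare number in the results.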