layoutparser doesn't work well for a very well-structured CV

ttbuffey commented 2 years ago

Describe the bug

layoutparser doesn't work well for a very well-structured CV. Am I using layoutparser in the wrong way? Could you please help me check? Thanks very much.

To Reproduce

import layoutparser as lp
import cv2
import ssl
import warnings
ssl._create_default_https_context = ssl._create_unverified_context
warnings.filterwarnings('ignore')

# Load the page and convert OpenCV's BGR channel order to RGB
image = cv2.imread("data/25.png")
image = image[..., ::-1]

# PubLayNet Mask R-CNN model; keep only detections scoring at least 0.8
model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"})

# Detect the layout of the input image
layout = model.detect(image)
print(layout)

# Draw the detected boxes on the page
lp.draw_box(image, layout, box_width=3).show()

Environment

- macOS
- Python 3.9.1
- layoutparser installed with the command below:
  pip install layoutparser torchvision && pip install "detectron2@git+https://github.com/facebookresearch/detectron2.git@v0.5#egg=detectron2"


ruben-as-teixeira commented 2 years ago

I'm facing the same kind of difficulties. When applied to CVs, the results are very poor.

Bergrebell commented 2 years ago

Have you tried working with different models? PrimaLayout, for example, gives me noticeably better results on a similar set of documents.

model = lp.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})

But the results are still not perfect (that's why I came here ;)). Are there any options to tweak the text detection?
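One knob that is already visible in both snippets above is MODEL.ROI_HEADS.SCORE_THRESH_TEST, set to 0.8. A possible experiment is to lower it so the model returns lower-confidence regions, then filter the result yourself by score and block type. The sketch below shows only that filtering step. The Detection class and the scores are made up for illustration and are not real model output; with layoutparser you would loop over the returned layout instead, assuming each block exposes comparable `type` and `score` attributes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Detection:
    """Hypothetical stand-in for one detected block (not a layoutparser class)."""
    type: str
    score: float


def filter_detections(detections: Iterable[Detection],
                      min_score: float,
                      keep_types: Optional[Set[str]] = None) -> List[Detection]:
    """Keep detections scoring at least min_score, optionally limited to some types."""
    return [
        d for d in detections
        if d.score >= min_score and (keep_types is None or d.type in keep_types)
    ]


# Made-up scores for illustration only.
sample = [
    Detection("Text", 0.95),
    Detection("Title", 0.72),
    Detection("Text", 0.55),
    Detection("Figure", 0.40),
]

# Compare how many blocks survive at a strict and a relaxed threshold.
for threshold in (0.8, 0.5):
    kept = filter_detections(sample, threshold)
    print(threshold, [d.type for d in kept])
```

Comparing the kept blocks at a few thresholds can show whether missing regions on a CV are undetected or merely scored below the cutoff. It does not change what the model was trained on, so it may not help if the model simply does not recognise CV layouts.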

Layout-Parser / layout-parser

layoutparser doesn't work well for a very well-structured CV #103