Improper results on scanned pdfs

I have been using Layout Parser to analyze different types of documents. I get the expected results on true PDFs, but not on scanned PDFs: the model detects the contents of the scanned page image as a Figure, or otherwise does not return the expected layout.

This issue occurs only with scanned PDFs.

Checklist

I have searched related issues but cannot get the expected help.
The bug has not been fixed in the latest version, see the Layout Parser Releases

To Reproduce

import layoutparser as lp
import cv2

# OpenCV loads images as BGR; reverse the channel axis to get RGB
image = cv2.imread("test.png")
image = image[..., ::-1]

# PubLayNet Faster R-CNN model; keep only detections with score >= 0.8
model = lp.models.Detectron2LayoutModel(
    'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
    extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
    label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
)

color_map = {'Text': 'red', 'Title': 'blue', 'List': 'green', 'Table': 'purple', 'Figure': 'pink'}

layout = model.detect(image)

# Draw the detected blocks on the page, colored by block type
lp.draw_box(image, layout, box_width=3, color_map=color_map)
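
To make the symptom easier to report, the detected blocks can be summarized instead of only drawn. The following is a minimal sketch, not part of the original reproduction: `summarize_layout` and its `page_fraction` parameter are hypothetical names, and it assumes each detected block exposes `type`, `score`, and `coordinates` (x1, y1, x2, y2) attributes, as Layout Parser's text blocks do. It counts the blocks per label and lists any block that covers most of the page, which is what a scanned page detected as a single Figure would look like.

```python
from collections import Counter


def summarize_layout(blocks, image_shape, page_fraction=0.8):
    """Count blocks per label and list blocks that cover most of the page.

    Hypothetical helper: `blocks` is any iterable of objects with `type`,
    `score` and `coordinates` (x1, y1, x2, y2) attributes; `image_shape`
    is the (height, width, ...) shape of the page image.
    """
    page_h, page_w = image_shape[:2]
    page_area = float(page_h * page_w)

    blocks = list(blocks)
    counts = Counter(b.type for b in blocks)

    oversized = []
    for b in blocks:
        x1, y1, x2, y2 = b.coordinates
        area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        # Flag blocks whose box covers at least `page_fraction` of the page
        if area / page_area >= page_fraction:
            oversized.append((b.type, b.score))
    return counts, oversized
```

With the reproduction above, this could be called as `counts, oversized = summarize_layout(layout, image.shape)`. A non-empty `oversized` list containing a Figure entry would suggest the model is treating the whole scanned page as one figure.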

Environment

OS: Windows
Layout Parser version: latest

Attached images:

1. Scanned PDF image result
2. True PDF image result

Layout-Parser / layout-parser

Improper results on scanned pdfs #193