Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

Detecting graphs and figures in the PDF images #102

Open qwertynik opened 2 years ago

qwertynik commented 2 years ago

Thanks for building this library.

I used the following code to detect whether a page image contains graphs or charts.

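import layoutparser as lp

# `image` is the page image (PIL image or RGB numpy array), loaded beforehand.
layout = 'lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config'  # model config path
model = lp.Detectron2LayoutModel(layout,
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"})

# Run detection; this produces the blocks that are drawn below.
text_blocks = model.detect(image)
image = lp.draw_box(image, text_blocks, box_width=3, show_element_id=True, show_element_type=True)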

In most cases, the charts and graphs are marked as Figures:

(images: fileoutpart12, fileoutpart13)

However, there are some anomalies.

  1. Multiple sections of the same chart are marked as separate Figures. (image: fileoutpart18)

  2. Only part of the chart is marked as a Figure. (image: fileoutpart11)
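One possible workaround for the first anomaly, independent of which model is used, is to post-process the detections and merge Figure boxes that overlap or sit very close together. The sketch below is only an illustration under stated assumptions: boxes are plain `(x1, y1, x2, y2)` tuples, and `boxes_touch`, `union`, `merge_boxes`, and the `gap` tolerance are hypothetical names, not part of layoutparser. It does not help with the second anomaly, where part of the chart is never detected at all.

```python
from typing import List, Tuple

# Assumption: each box is (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def boxes_touch(a: Box, b: Box, gap: float = 0.0) -> bool:
    """Return True if a and b overlap or lie within `gap` pixels of each other."""
    return (a[0] - gap <= b[2] and b[0] - gap <= a[2]
            and a[1] - gap <= b[3] and b[1] - gap <= a[3])


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_boxes(boxes: List[Box], gap: float = 0.0) -> List[Box]:
    """Repeatedly merge touching boxes until no further merge is possible."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        result: List[Box] = []
        for box in merged:
            for i, kept in enumerate(result):
                if boxes_touch(box, kept, gap):
                    result[i] = union(box, kept)
                    changed = True
                    break
            else:
                # No existing box touches this one, so keep it separately.
                result.append(box)
        merged = result
    return merged


if __name__ == "__main__":
    fragments = [(0, 0, 10, 10), (8, 8, 20, 20), (100, 100, 120, 120)]
    print(merge_boxes(fragments))
```

With layoutparser, the input boxes could presumably be collected from the detected blocks, for example `[b.coordinates for b in text_blocks if b.type == "Figure"]`. A larger `gap` merges more aggressively, so two distinct charts placed side by side may end up in one box; the value would need tuning per document set.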

Are there any other models that detect charts and graphs more reliably? If not, any ideas on how to create and train a custom model for improved detection? The more detailed the steps, the better; I'm a Python beginner.

Thanks!