Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

Layout Parser does not work well on PDFs with diverse layouts #148

Open mmuzammil45 opened 1 year ago

mmuzammil45 commented 1 year ago

I'm working on a set of PDFs with varied page layouts: multi-column and single-column pages containing images, figures, and tables. The results are below average, even though I have tried the different provided models.

  1. Can anyone suggest how to achieve better results on such diverse document pages?
  2. Most importantly, I need the reading order in the returned Layout object. How can I get this? An example result is given below:

Layout(_blocks=[
    TextBlock(block=Rectangle(x_1=201.03945922851562, y_1=413.36480712890625, x_2=1500.326904296875, y_2=1290.304931640625), text=None, id=None, type=Figure, parent=None, next=None, score=0.9502648115158081),
    TextBlock(block=Rectangle(x_1=174.8553466796875, y_1=270.81329345703125, x_2=1229.29443359375, y_2=416.44305419921875), text=None, id=None, type=Title, parent=None, next=None, score=0.9470152258872986),
    TextBlock(block=Rectangle(x_1=200.6973419189453, y_1=489.5100402832031, x_2=560.0352172851562, y_2=518.6799926757812), text=None, id=None, type=Text, parent=None, next=None, score=0.8652349710464478),
    TextBlock(block=Rectangle(x_1=260.79986572265625, y_1=1346.680419921875, x_2=1495.7305908203125, y_2=1452.4434814453125), text=None, id=None, type=Text, parent=None, next=None, score=0.8538650870323181)
], page_data={})

Let me know if anyone has any suggestion/solution to improve it. Thanks a lot.

Environment

  1. Linux
  2. layout parser version: 0.3.4
  3. I followed the official documentation.

Screenshots (attached images): sample14, page63
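
For the reading-order question (point 2), the example output shows blocks ordered by detection score rather than by position, and `id`, `parent`, and `next` are all unset. One possible workaround is to sort the blocks geometrically after detection. The following is a minimal sketch of such a heuristic, not part of the layoutparser API: `Box`, `reading_order`, the 1700-pixel page width, and the 0.6 full-width ratio are all assumptions made for illustration. It treats wide blocks as separators read top to bottom, and between them reads the left column before the right column.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical stand-in for a detected block (not a layoutparser class)."""
    x_1: float
    y_1: float
    x_2: float
    y_2: float
    type: str


def reading_order(boxes: List[Box], page_width: float,
                  full_width_ratio: float = 0.6) -> List[Box]:
    """Return boxes in an approximate reading order for 1- or 2-column pages.

    Boxes at least `full_width_ratio * page_width` wide are treated as
    full-width and emitted top to bottom. Narrower boxes collected between
    two full-width boxes are emitted left column first, then right column.
    """
    mid = page_width / 2
    ordered: List[Box] = []
    band: List[Box] = []  # narrow boxes seen since the last full-width box

    def flush() -> None:
        # Split the pending band by horizontal center, sort each side by top edge.
        left = sorted((b for b in band if (b.x_1 + b.x_2) / 2 < mid),
                      key=lambda b: b.y_1)
        right = sorted((b for b in band if (b.x_1 + b.x_2) / 2 >= mid),
                       key=lambda b: b.y_1)
        ordered.extend(left + right)
        band.clear()

    for b in sorted(boxes, key=lambda b: b.y_1):
        if (b.x_2 - b.x_1) >= full_width_ratio * page_width:
            flush()
            ordered.append(b)
        else:
            band.append(b)
    flush()
    return ordered


# Coordinates rounded from the example above; the page width is assumed.
blocks = [
    Box(201.0, 413.4, 1500.3, 1290.3, "Figure"),
    Box(174.9, 270.8, 1229.3, 416.4, "Title"),
    Box(200.7, 489.5, 560.0, 518.7, "Text"),
    Box(260.8, 1346.7, 1495.7, 1452.4, "Text"),
]
print([b.type for b in reading_order(blocks, page_width=1700)])
# → ['Title', 'Figure', 'Text', 'Text']
```

The same ordering could presumably be applied to real `TextBlock` objects by reading their coordinates instead of `Box` fields, but that has not been checked here against version 0.3.4. The heuristic also ignores nesting (the small Text block above lies inside the Figure) and pages with three or more columns, so it is only a starting point for diverse layouts.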

henrivanoost1 commented 1 year ago

Hi there,

I don't know how to fix this, but have you already tried the Google Cloud Vision API? It might be an interesting solution for you.