axa-group / Parsr

Transforms PDF, Documents and Images into Enriched Structured Data
Apache License 2.0

Cannot detect table #637

Open NgoDuyVu1993 opened 2 years ago

NgoDuyVu1993 commented 2 years ago

Summary

Hi, thanks so much for the great tool. I tried Parsr to detect headings and tables within a document. The last page contains a big table, but Parsr cannot detect it. Instead, the JSON result only contains the text of each column as paragraphs, while tables on other pages are detected. I examined the table in a PDF editor and found nothing different about it. Is there a way I can fix this or work around it?

Thanks, developer team.

Here is the file that I ran: 835-EGM-00-SWI-120.pdf

Steps To Reproduce

  1. Upload the document to the Parsr server
  2. Get the JSON result
  3. Check each element in the JSON to see what Parsr detected
  4. See that some tables are not detected
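Step 3 can be scripted rather than done by eye. The sketch below is only an illustration and not part of Parsr: it assumes the JSON result has a top-level `pages` list whose entries carry a `pageNumber` and an `elements` list, and that each element has a `type` such as `"table"`, `"paragraph"`, or `"heading"`. The function names and the `sample` data are made up for this example.

```python
from collections import Counter


def element_types_by_page(doc):
    """Map each page number to a Counter of the element types found on it."""
    summary = {}
    for index, page in enumerate(doc.get("pages", []), start=1):
        # Assumption: pages carry "pageNumber"; fall back to position if not.
        page_number = page.get("pageNumber", index)
        summary[page_number] = Counter(
            element.get("type", "unknown")
            for element in page.get("elements", [])
        )
    return summary


def pages_without_tables(doc):
    """List the page numbers on which no element of type "table" appears."""
    return [
        number
        for number, counts in element_types_by_page(doc).items()
        if counts["table"] == 0
    ]


# Hand-made stand-in for a Parsr result (not real output).
sample = {
    "pages": [
        {"pageNumber": 1, "elements": [{"type": "heading"}, {"type": "table"}]},
        {"pageNumber": 2, "elements": [{"type": "paragraph"}, {"type": "paragraph"}]},
    ]
}

if __name__ == "__main__":
    for number, counts in element_types_by_page(sample).items():
        print(number, dict(counts))
    print("pages without tables:", pages_without_tables(sample))
```

To run it against a real result, load the file with `json.load` and pass the resulting dictionary in place of `sample`. A page that is known to contain a table but shows up in `pages_without_tables` is one where detection failed.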

Screenshots

Original document: image

Result after detection: image

Additional context

Here is a table that Parsr can detect.

Original: image

Result: image