nlmatics / nlm-ingestor

This repo provides the server-side code that the llmsherpa API connects to. It includes parsers for various file formats.
https://www.nlmatics.com
Apache License 2.0
923 stars, 112 forks

Issue with finding tables and sections #45

Open Aviral-tech opened 3 months ago

Aviral-tech commented 3 months ago

EOL Notice (11).pdf

I have attached a PDF file above. The parser is not able to recognize the tables inside it. It also reports that the document has only two sections, named: Product Discontinuance Notice - PDN 23_0061 Rev. - PDN Title:

I have also tried this with OCR=yes, and I have also used NewIndentParser=True. If anyone knows why this is happening, please help me out.
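For anyone trying to reproduce this, here is a minimal sketch of how the two options could be passed and how the detected sections and tables could be listed. It makes several assumptions that are not stated in the report: that the server runs locally on port 5010, that the options are spelled `applyOcr=yes` and `useNewIndentParser=true` on the `parseDocument` endpoint, and that the llmsherpa client is used to read the file. The helper names `build_parser_url` and `summarize` are hypothetical.

```python
from urllib.parse import urlencode

# Assumption: a local nlm-ingestor server listening on port 5010.
BASE_URL = "http://localhost:5010/api/parseDocument"


def build_parser_url(apply_ocr=False, use_new_indent_parser=False, base_url=BASE_URL):
    """Build the parser URL (hypothetical helper).

    The query parameter names are assumptions about what the report
    means by OCR=yes and NewIndentParser=True.
    """
    params = {"renderFormat": "all"}
    if use_new_indent_parser:
        params["useNewIndentParser"] = "true"
    if apply_ocr:
        params["applyOcr"] = "yes"
    return f"{base_url}?{urlencode(params)}"


def summarize(pdf_path, parser_url):
    """Return the section titles and table count the parser reports (hypothetical helper)."""
    # Imported here so the URL helper can be used without llmsherpa installed.
    from llmsherpa.readers import LayoutPDFReader

    doc = LayoutPDFReader(parser_url).read_pdf(pdf_path)
    return {
        "sections": [section.title for section in doc.sections()],
        "table_count": len(doc.tables()),
    }
```

Calling `summarize("EOL Notice (11).pdf", build_parser_url(apply_ocr=True, use_new_indent_parser=True))` and comparing the result against a run with both options off would show whether either option changes what is detected.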