No text detected in pdf

rafaeldepablo commented 2 years ago

SABADELL_GOBIERNO_CORPORATIVO_2022.pdf

Summary

Great software.

I'm running into strange behavior on some PDFs: apparently it's not finding any text except on the first page.

The PDF files are normal; it is possible to copy and search the text.

However, it does find the tables, even though their text is blank.
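One way to confirm which pages come back without text is to count word elements per page in the JSON output. The following is only a minimal sketch: it assumes the output has a top-level `pages` list, that each page has a `pageNumber` and an `elements` list, and that nested nodes carry a `type` and a `content` field (a string for words, a list for containers). That layout is an assumption about the output, and the helper names `count_words`, `words_per_page`, and `pages_without_text` are hypothetical.

```python
from typing import Any, Dict, List


def count_words(node: Any) -> int:
    """Recursively count nodes of type 'word' that hold non-blank text."""
    if isinstance(node, list):
        return sum(count_words(child) for child in node)
    if not isinstance(node, dict):
        return 0
    content = node.get("content")
    if node.get("type") == "word" and isinstance(content, str):
        return 1 if content.strip() else 0
    # Assumed layout: containers (paragraphs, lines, tables, rows, cells)
    # keep their children under 'content'; pages keep them under 'elements'.
    return count_words(content) + count_words(node.get("elements"))


def words_per_page(document: Dict[str, Any]) -> Dict[int, int]:
    """Map each page number to the number of words found on that page."""
    result: Dict[int, int] = {}
    for index, page in enumerate(document.get("pages", []), start=1):
        number = page.get("pageNumber", index)
        result[number] = count_words(page.get("elements", []))
    return result


def pages_without_text(document: Dict[str, Any]) -> List[int]:
    """Return the page numbers on which no words were detected."""
    return [page for page, count in words_per_page(document).items() if count == 0]


# Hand-written sample mimicking the assumed layout: page 1 has text,
# page 2 has a table whose cells are empty.
sample = {
    "pages": [
        {
            "pageNumber": 1,
            "elements": [
                {
                    "type": "paragraph",
                    "content": [
                        {
                            "type": "line",
                            "content": [
                                {"type": "word", "content": "Hello"},
                                {"type": "word", "content": "world"},
                            ],
                        }
                    ],
                }
            ],
        },
        {
            "pageNumber": 2,
            "elements": [
                {
                    "type": "table",
                    "content": [
                        {
                            "type": "table-row",
                            "content": [{"type": "table-cell", "content": []}],
                        }
                    ],
                }
            ],
        },
    ]
}

print(words_per_page(sample))
print(pages_without_text(sample))
```

Running this on the real output (loaded with `json.load`) would show whether the pages after the first contain tables but no words, which is the behavior described above.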

Steps To Reproduce

Load the PDF and try to process it.

Expected behavior

The text is processed.

Actual behavior

No text is identified.

Screenshots

Environment

sudo docker run -p 3001:3001 axarev/parsr:latest

Thanks in advance

NgoDuyVu1993 commented 2 years ago

Hi @rafaeldepablo, ignore my comment if you find it irrelevant. I am not on the Parsr team; I have had some problems with table detection, so I looked around to see if anyone else has the same issue. I tried running your document, and Parsr detects its content fine.

I think you may have missed something in the settings when you uploaded the document. Here is how I configured it:

rafaeldepablo commented 2 years ago

Thanks

I tried again and it crashed, but when I retried it worked.

Regards

rafa

axa-group / Parsr

No text detected in pdf #620