Open jervispinto opened 12 years ago
Hi @jervispinto -
What do you mean by a verifiable error in bounding boxes? I'm going to be looking at the bounding box code soon, as I noticed the Y coordinates are measured from the bottom of the document and not the top. The X coordinates are correct.
I'm attempting to recreate the error, as it's been a while since I looked at this. I checked the XML again, and the missing text seems to be there with (as far as I can tell) correct bounding boxes, but the text disappears during parsing. This may be an issue in my parsing logic, so I'll double-check over the weekend.
The Y coordinates certainly decrease going down the page.
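For anyone who needs top-down coordinates in the meantime, here is a minimal sketch of flipping a box measured from the bottom of the page into one measured from the top. It assumes a bbox is written as a comma-separated "x0,y0,x1,y1" string and that you know the page height in the same units; the helper names `parse_bbox` and `flip_bbox` are made up for illustration and are not part of pdf2txt.py.

```python
def parse_bbox(s):
    """Parse an assumed 'x0,y0,x1,y1' string into a tuple of floats."""
    return tuple(float(v) for v in s.split(","))


def flip_bbox(bbox, page_height):
    """Convert a bottom-origin box to a top-origin box.

    X values are unchanged. The old top edge (y1) becomes the new
    top edge measured downward from the top of the page.
    """
    x0, y0, x1, y1 = bbox
    return (x0, page_height - y1, x1, page_height - y0)


# Hypothetical values: a box on a page assumed to be 792 units tall.
box = parse_bbox("72.0,700.0,200.0,712.0")
print(flip_bbox(box, 792.0))
```

With the flipped values, boxes nearer the top of the page have smaller Y, which is usually what layout code expects.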
Using the command: python pdf2txt.py -t xml -A
produces a verifiable error in bounding boxes.
(Please email me for the pdf)