I'm trying to round-trip text positions extracted server-side with pdfminer into pdf.js. Is this something you'd expect to be possible? Using the compressed.tracemonkey-pldi-09.pdf file, the first block of text, "Trace-based Just-in-Time Type Specialization for Dynamic", has the correct x1, x2, width and height, but y1 and y2 are offset slightly.
I am not sure which library is 'correct' (or indeed whether both can be correct and this is subject to interpretation).
If this isn't reliable across text blocks, are there other ways to safely round-trip references to text segments in the PDF?
pdfjs:
pdfminer:
Attach (recommended) or Link to PDF file here
https://github.com/mozilla/pdf.js/blob/master/web/compressed.tracemonkey-pldi-09.pdf
Configuration:
Steps to reproduce the problem:
What is the expected behavior? (add screenshot)
x0,y0,x1,y1 match across libraries on same text block.
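To make the comparison concrete, here is a minimal sketch of the mapping I'd expect to hold. It assumes pdfminer reports boxes in PDF user space with the origin at the bottom-left, that the viewer side uses a top-left origin, and that the page is unrotated with its MediaBox starting at (0, 0). The names `Box`, `pdfminer_to_viewport` and `boxes_match`, and the numbers in the usage lines, are made up for illustration and are not taken from either library or from the attached PDF.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned box as (x0, y0, x1, y1). Hypothetical helper type."""
    x0: float
    y0: float
    x1: float
    y1: float


def pdfminer_to_viewport(bbox: Box, page_height: float, scale: float = 1.0) -> Box:
    """Flip a bottom-left-origin box into a top-left-origin box.

    Assumes an unrotated page whose MediaBox starts at (0, 0).
    The result is (left, top, right, bottom) in scaled viewport units.
    """
    left = bbox.x0 * scale
    right = bbox.x1 * scale
    # The y axis is inverted: the PDF top edge (y1) becomes the smaller value.
    top = (page_height - bbox.y1) * scale
    bottom = (page_height - bbox.y0) * scale
    return Box(left, top, right, bottom)


def boxes_match(a: Box, b: Box, tol: float = 0.5) -> bool:
    """Compare two boxes coordinate by coordinate within a tolerance."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


# Illustrative values only: a 792 pt tall page and an invented text box.
converted = pdfminer_to_viewport(Box(72.0, 700.0, 300.0, 720.0), page_height=792.0)
print(converted)
```

This only covers the axis flip and the scale factor, so it would not by itself explain a small residual offset in y.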
What went wrong? (add screenshot)
Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):