Closed matthopson closed 3 years ago
Just adding some additional thoughts here: In viewers like pdf.js, CropBox is commonly used to display the contents of a PDF - so in our case, those are the dimensions I expect to work from for the page.
I'm wondering if we should consider introducing a new method like page.getCropSize() - or perhaps a flag on page.getSize() - for when the intention is to get the visible working dimensions of a given page... just a thought, and again, this may be handled in some other way that I'm missing.
Thanks!
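To make the suggestion above concrete, here is a rough sketch of what such a helper could look like in user code. The name getWorkingSize and its useCropBox flag are hypothetical and not part of pdf-lib. The PageLike interface is a stand-in for the two PDFPage methods the sketch relies on (getMediaBox() and getCropBox(), each returning x, y, width and height), so it can be run without a real document. The CropBox height and x offset in the example are assumed values, since only the CropBox width was reported.

```typescript
// Shape of the rectangles returned by the box getters.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Stand-in for the parts of a pdf-lib PDFPage that this sketch needs.
interface PageLike {
  getMediaBox(): Box;
  getCropBox(): Box;
}

// Hypothetical helper (not a pdf-lib API): returns the visible size by
// default, or the full MediaBox size when useCropBox is false.
function getWorkingSize(
  page: PageLike,
  useCropBox: boolean = true,
): { width: number; height: number } {
  const box = useCropBox ? page.getCropBox() : page.getMediaBox();
  return { width: box.width, height: box.height };
}

// Fake page using the numbers reported in this issue. The CropBox x and
// height are assumptions made for illustration only.
const reportedPage: PageLike = {
  getMediaBox: () => ({ x: 0, y: 0, width: 1217.01, height: 790.729 }),
  getCropBox: () => ({ x: 557.889, y: 0, width: 659.121, height: 790.729 }),
};

const visibleSize = getWorkingSize(reportedPage);
const fullSize = getWorkingSize(reportedPage, false);
console.log(visibleSize.width, fullSize.width);
```

With a real document, a PDFPage could be passed in directly, assuming its getters keep the shape described above.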
In addition to https://pdf-lib.js.org/docs/api/classes/pdfpage#getsize the following methods are available:
The https://pdf-lib.js.org/docs/api/classes/pdfpage#setsize API docs explain a bit about the various boxes. It also has some internal logic that attempts to set the correct boxes automagically: https://github.com/Hopding/pdf-lib/blob/e10290ac8503fd5b21f2d895252835e00a1f16b6/src/api/PDFPage.ts#L189-L218
Perhaps additional logic could be added to https://pdf-lib.js.org/docs/api/classes/pdfpage#getsize to make it do the expected thing automatically (in most cases) and provide better docs for when it does not. Added it to the roadmap: https://github.com/Hopding/pdf-lib/discussions/998.
I know this issue was closed a long time ago, but I want to bring up a case of inconsistency when retrieving MediaBox and CropBox information from a PDF page.
I have a PDF whose first page has different box information than the other pages. However, when I retrieve this information using pdfinfo, I get values that differ from the ones pdf-lib gives me.
I drew some circles using the CropBox information and this is how the two tests turned out. The first printout used the pdfinfo information. The second printout used the information that pdf-lib gives me through the getCropBox() method.
How is this MediaBox and CropBox information obtained in pdf-lib?
The example PDF is below:
We encountered a situation where pdf dimensions didn't seem to be accurately represented. It only seemed to happen with a couple of specific PDFs which, unfortunately, I can't share here - but I can tell you how to recreate it.
In our situation, we are applying content onto a pdf and saving it, using some coordinates given by a frontend UI (think drag and drop).
The PDFs we were seeing issues with were putting content far offscreen (we thought at first it wasn't being applied at all). When looking at the PDF dimensions in Mac's file info util, we'd see dimensions of 659 × 790. When using an Apache tool to inspect the document, we found that the document showed a MediaBox value of 1217.01 x 790.729, and a CropBox width of 659.121 - which lined up with what Mac's file inspector was saying. Now, of course, we know the width that was being computed in our script was the 1217 width, but it was being displayed in that 659 width.
What we ended up doing to address this was using the page.getCropBox() method to see if it had a width, then subtracting that from the MediaBox width (page.getSize()) to get an offset to apply to our xPos for our content overlay.
To write tests for this, we're using the page.setMediaBox() and page.setCropBox() methods to reproduce this condition in the pdf metadata.
This is working, but it definitely makes me nervous. So my question is, have you encountered this issue, and if so, what's the best way to handle it?
Thanks again!
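For comparison with the width-subtraction workaround described above, here is a minimal sketch of another way to compute the offset: read the CropBox origin (x, y) directly and add it to the UI coordinates. It assumes the frontend reports positions measured from the bottom-left corner of the visible area. The helper visibleToPage and the CropAwarePage interface are hypothetical names, not pdf-lib APIs, and the CropBox x of 557.889 is an assumed value (1217.01 minus 659.121) chosen so that both approaches can be compared.

```typescript
// Rectangle shape returned by the box getters.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Stand-in for the single PDFPage method this sketch relies on.
interface CropAwarePage {
  getCropBox(): Rect;
}

interface Point {
  x: number;
  y: number;
}

// Hypothetical helper: shifts a point given relative to the visible
// (cropped) area into the page's own coordinate space.
function visibleToPage(page: CropAwarePage, point: Point): Point {
  const crop = page.getCropBox();
  return { x: crop.x + point.x, y: crop.y + point.y };
}

// Fake page for illustration. The x offset is an assumption; only the
// CropBox width (659.121) was reported in this issue.
const croppedPage: CropAwarePage = {
  getCropBox: () => ({ x: 557.889, y: 0, width: 659.121, height: 790.729 }),
};

// A point 100 units right and 50 units up from the visible corner.
const pagePoint = visibleToPage(croppedPage, { x: 100, y: 50 });
console.log(pagePoint);
```

Subtracting the CropBox width from the MediaBox width gives the same x offset only when the visible area sits flush against the right edge of the MediaBox. Using the CropBox origin avoids that assumption and also covers a vertical offset.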