pdfcpu / pdfcpu

A PDF processor written in Go.
http://pdfcpu.io/
Apache License 2.0

Crop page to contents. #958

Open juanpmarin opened 4 days ago

juanpmarin commented 4 days ago

Hi there!

I'm trying to crop a page so that it fits its visible contents. I've been exploring the current API extensively, but I haven't found a way to achieve this yet.

For example, if I have a PDF with a page like this, I would expect the page to be cropped to fit only the area within the red square.

Thank you for your help!

[image: example page with the desired crop area marked by a red square]

hhrutter commented 4 days ago

This can't be done without knowing the bounding box of your visible content. Remember that there may also be invisible content. Once you know the bounding box, you can set your cropBox accordingly using the box command on the CLI.
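
As an illustration of turning a known bounding box into a crop rectangle, here is a minimal sketch in Go that uses only the standard library. The `Rect` type, the `Pad` and `Describe` helpers, and the `[llx lly urx ury]` output notation are assumptions made for this example and are not pdfcpu API; the exact box description syntax pdfcpu accepts should be checked against its documentation.

```go
package main

import (
	"fmt"
	"math"
)

// Rect is a hypothetical axis-aligned rectangle in PDF user space (points),
// given by its lower-left and upper-right corners.
type Rect struct {
	LLX, LLY, URX, URY float64
}

// Pad grows r by margin on every side and clamps the result to the media box,
// so the resulting crop rectangle never extends beyond the page.
func Pad(r, media Rect, margin float64) Rect {
	return Rect{
		LLX: math.Max(r.LLX-margin, media.LLX),
		LLY: math.Max(r.LLY-margin, media.LLY),
		URX: math.Min(r.URX+margin, media.URX),
		URY: math.Min(r.URY+margin, media.URY),
	}
}

// Describe formats r as "[llx lly urx ury]" (assumed rectangle notation).
func Describe(r Rect) string {
	return fmt.Sprintf("[%g %g %g %g]", r.LLX, r.LLY, r.URX, r.URY)
}

func main() {
	media := Rect{0, 0, 595, 842}      // A4 page in points
	content := Rect{72, 500, 300, 770} // content bounding box, measured by hand
	fmt.Println(Describe(Pad(content, media, 10)))
}
```

Clamping to the media box is a design choice here: a crop rectangle that reaches outside the page adds nothing, so the margin is simply cut off at the page edge.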

juanpmarin commented 3 days ago

@hhrutter thanks for your answer. I've debugged the PDF for my use case and all the content is visible. Can you give me some clues about how to measure the bounding box of the content, please?

hhrutter commented 3 days ago

If I had that, there would already be a crop-to-content option.

It's tricky because it encompasses all drawing operations for primitives and XObjects, plus you have to figure out the visibility of each one and whether its bounding box contributes to the overall page content bounding box.
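
To make the last step concrete, here is a rough sketch in Go of how per-operation bounding boxes could be combined once they are known. It assumes the hard part is already done: every drawing operation has been interpreted into an `Item` with a transformed bounding box and a visibility flag. `Box`, `Item`, `Union`, and `ContentBounds` are hypothetical names for this example, not part of pdfcpu.

```go
package main

import (
	"fmt"
	"math"
)

// Box is an axis-aligned rectangle in page coordinates
// (lower-left and upper-right corners).
type Box struct {
	LLX, LLY, URX, URY float64
}

// Item is a hypothetical, already-interpreted drawing operation: its bounding
// box in page coordinates plus a flag saying whether it is visible.
type Item struct {
	Bounds  Box
	Visible bool
}

// Union returns the smallest box containing both a and b.
func Union(a, b Box) Box {
	return Box{
		LLX: math.Min(a.LLX, b.LLX),
		LLY: math.Min(a.LLY, b.LLY),
		URX: math.Max(a.URX, b.URX),
		URY: math.Max(a.URY, b.URY),
	}
}

// ContentBounds returns the union of the bounds of all visible items.
// ok is false when no visible item exists.
func ContentBounds(items []Item) (bounds Box, ok bool) {
	for _, it := range items {
		if !it.Visible {
			continue // invisible content must not enlarge the result
		}
		if !ok {
			bounds, ok = it.Bounds, true
			continue
		}
		bounds = Union(bounds, it.Bounds)
	}
	return bounds, ok
}

func main() {
	items := []Item{
		{Box{100, 600, 300, 650}, true}, // e.g. a line of text
		{Box{150, 400, 250, 500}, true}, // e.g. an image XObject
		{Box{0, 0, 595, 842}, false},    // e.g. content that is not visible
	}
	if b, ok := ContentBounds(items); ok {
		fmt.Printf("[%g %g %g %g]\n", b.LLX, b.LLY, b.URX, b.URY)
	}
}
```

The union itself is trivial; the difficulty described above lies in producing the `Item` values, which requires interpreting the content stream, tracking the transformation matrix and clipping path, and deciding visibility for each operation.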