internetarchive / archive-pdf-tools

Fast PDF generation and compression. Deals with millions of pages daily.
https://archive-pdf-tools.readthedocs.io/en/latest/
GNU Affero General Public License v3.0

Wrong resolution of mask image when foreground image is downsampled #59

Open JoeLoginIsAlreadyTaken opened 1 year ago

JoeLoginIsAlreadyTaken commented 1 year ago

I tried to use recode_pdf from imagestack together with the option to downsample the foreground ("--fg-downsample 4"). The resulting PDF was unreadable.

I found that the foreground (the color layer) was resampled as expected. However, when the PDF is written, the resolution of the mask layer (which should stay at the original size) is taken from the foreground and is therefore wrong.

As a solution, I changed mrc.py to return the size of the mask and used those values in recode.py.
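
The idea can be illustrated with a minimal sketch. This is not the attached patch and not the actual code in mrc.py or recode.py: the names `LayerSizes`, `layer_sizes`, `downsampled_size`, and `mask_bytes` are hypothetical, and the round-up behaviour when downsampling is an assumption. It only shows why the mask size has to be tracked separately from the foreground size once the foreground is downsampled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerSizes:
    """Pixel dimensions (width, height) of each layer; hypothetical container."""
    mask: tuple[int, int]
    foreground: tuple[int, int]


def downsampled_size(size, factor):
    """Size after integer downsampling (assumed: round up, never below 1 px)."""
    width, height = size
    return (max(1, -(-width // factor)), max(1, -(-height // factor)))


def layer_sizes(page_size, fg_downsample=1):
    """Return both sizes: the mask keeps the page size, the foreground shrinks."""
    return LayerSizes(
        mask=page_size,
        foreground=downsampled_size(page_size, fg_downsample),
    )


def mask_bytes(size):
    """Bytes in an uncompressed 1-bit mask whose rows are padded to whole bytes."""
    width, height = size
    return ((width + 7) // 8) * height


if __name__ == "__main__":
    sizes = layer_sizes((2480, 3508), fg_downsample=4)
    # Declaring the mask with the foreground's dimensions describes far fewer
    # bytes than the mask data actually contains, so a reader decodes garbage.
    print("mask size:      ", sizes.mask, mask_bytes(sizes.mask), "bytes")
    print("foreground size:", sizes.foreground, mask_bytes(sizes.foreground), "bytes")
```

With a downsample factor of 1 both sizes coincide, which would explain why the problem only shows up when the foreground is downsampled.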

This works fine when encoding images to PDF. I did not test it with other modes.

Attached are patches for mrc.py and recode.py: patches.tar.gz

MerlijnWajer commented 1 year ago

This seems like a good find, thank you. I will try to review and merge this either this week or next.