pts / pdfsizeopt

PDF file size optimizer
GNU General Public License v2.0

Add ability to disable metadata #83

Open rbrito opened 6 years ago

rbrito commented 6 years ago

Hi, @pts.

I see that, in its current form, pdfsizeopt doesn't discard metadata that is embedded in a PDF document.

For instance, if I start with the file minimal.pdf and run Ghostscript's ps2pdf14 on it, I get the following file: minimal.pdf.pdf.

If I postprocess the resulting file above with pdfsizeopt, I get something like: minimal.pdf.pso.pdf which, when uncompressed with pdftk, gives the following: minimal.pdf.pso.unc.pdf.

If you inspect this last file, you can see that the XML-formatted metadata is still there. It would be great to have this removed from the file with a run of pdfsizeopt.
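
The inspection step described above can be sketched in Python. This is only an illustration and not part of pdfsizeopt or pdftk: the helper name `find_metadata_hints` and the list of markers are assumptions, and the byte search only works on streams that are not compressed, which is why the file is uncompressed with pdftk first.

```python
import re
import sys
from typing import List

# Hypothetical helper for illustration; not part of pdfsizeopt or pdftk.
# Markers that typically appear when an XMP metadata stream is present
# in a PDF whose streams are stored uncompressed.
XMP_MARKERS = (b"<x:xmpmeta", b"<?xpacket begin")
METADATA_TYPE_RE = re.compile(rb"/Type\s*/Metadata\b")


def find_metadata_hints(data: bytes) -> List[str]:
    """Return a list of metadata markers found in the raw PDF bytes."""
    hints = []
    if METADATA_TYPE_RE.search(data):
        hints.append("/Type /Metadata object")
    for marker in XMP_MARKERS:
        if marker in data:
            hints.append(marker.decode("ascii"))
    return hints


if __name__ == "__main__":
    # Usage: python find_metadata.py minimal.pdf.pso.unc.pdf
    if len(sys.argv) == 2:
        with open(sys.argv[1], "rb") as f:
            hints = find_metadata_hints(f.read())
        print(hints if hints else "no metadata markers found")
```

An empty result does not prove that the file is free of metadata: a compressed stream or an Info dictionary would not be detected by this search.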

Thanks,

Rogério Brito.

pts commented 6 years ago

Thank you for proposing this! I'll keep the issue open to track development in this area. (Don't expect fast progress, though.)

rbrito commented 6 years ago

Hi, Péter.

Nice that you like this suggestion of removing useless metadata... I use the following script to also remove some other information:

remove-pdf-metadata.sh.txt

Sometimes I run that before I run pdfsizeopt...

Cheers!

pts commented 6 years ago

Thank you for sharing your shell script!

I've extracted the links within it for easier access: