ArtifexSoftware / Ghostscript.NET

Ghostscript.NET - managed wrapper around the Ghostscript library (32-bit & 64-bit)
https://ghostscript.com
GNU Affero General Public License v3.0

Ghostscript - completely REMOVE METADATA from pdf files 2 #117

Open Geo-Van opened 11 months ago

Geo-Van commented 11 months ago

Hello. This post is similar to https://github.com/ArtifexSoftware/Ghostscript.NET/issues/114, but it has been revised based on what we learned after examining the problem and getting advice from several experts. We are posting it here to confirm that we understand the matter correctly, because we are new to Ghostscript. We have simplified the whole question to the following and would appreciate your comments:

I have a PDF file (named input.pdf) and I want to convert it to a new PDF file (named output.pdf) using Ghostscript. I have only one reason for converting it: to remove all of its old metadata (both the classic document info and XMP). So I run the command:

gsc.exe -o output.pdf -sDEVICE=pdfwrite input.pdf pdfmark.txt

With the following pdfmark.txt:

[ /Title () /Author () /Subject () /Creator () /ModDate () /Producer () /Keywords () /CreationDate () /DOCINFO pdfmark
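For anyone who wants to script this step, here is a minimal Python sketch that writes the same pdfmark.txt and assembles the same argument list as the command above. The function names (`build_pdfmark`, `build_command`, `strip_metadata`) are made up for illustration, and the executable name is simply whatever you pass in (the post uses gsc.exe); nothing here is part of Ghostscript or Ghostscript.NET.

```python
import subprocess
from pathlib import Path

# Info-dictionary keys blanked by the pdfmark file quoted in the post.
DOCINFO_KEYS = ["Title", "Author", "Subject", "Creator",
                "ModDate", "Producer", "Keywords", "CreationDate"]


def build_pdfmark(keys=DOCINFO_KEYS):
    """Return pdfmark text that sets every listed key to an empty string."""
    entries = " ".join(f"/{key} ()" for key in keys)
    return f"[ {entries} /DOCINFO pdfmark\n"


def build_command(gs_exe, input_pdf, output_pdf, pdfmark_file):
    """Assemble the argument list used in the post; the pdfmark file comes
    after the input PDF, exactly as in the original command."""
    return [gs_exe, "-o", output_pdf, "-sDEVICE=pdfwrite",
            input_pdf, pdfmark_file]


def strip_metadata(gs_exe, input_pdf, output_pdf, pdfmark_file="pdfmark.txt"):
    """Write the pdfmark file, then run Ghostscript (raises if it fails)."""
    Path(pdfmark_file).write_text(build_pdfmark(), encoding="ascii")
    subprocess.run(build_command(gs_exe, input_pdf, output_pdf, pdfmark_file),
                   check=True)
```

Calling `strip_metadata("gsc.exe", "input.pdf", "output.pdf")` would then run the conversion, assuming the executable is on the PATH under that name.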

Could you please confirm that the above command will do what I want? That is: it will remove all of the old metadata (classic and XMP), and the newly created file will carry its own new metadata with no trace of the old file's metadata. (I know that the only exceptions are that the Producer name will be Ghostscript, and that the Creation Date is set when the converted file is created and cannot be changed. We are happy with this.)

We would very much appreciate your comments on whether the above command will do the job we want. Thank you very much!

jamie-lemon commented 11 months ago

Hi there,

I think you could run the command on the PDF you want, then use a service to quickly check the file metadata, e.g. https://app.pdf.co/request-tester (you'll have to create an account first, but it's easy and you get free credits). Just run the PDF info command on the file (see https://apidocs.pdf.co/02-pdf-info-reader) and it will tell you what it sees.

Hope this helps.

Geo-Van commented 11 months ago

Thank you. I tried it with ExifTool, and it seems to do the job: no metadata from the old PDF is visible. So I think the above Ghostscript command does what we want. Are there any other comments on the Ghostscript command used? Will the command do the job we want?
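In case it helps others repeat the check, below is a rough Python sketch of one way to compare ExifTool's view of the two files. It assumes `exiftool` is on the PATH and uses its `-j` (JSON output) and `-G` (prefix each tag with its group) options. The helper names and the list of ignored structural tags are my own assumptions, not anything from Ghostscript or ExifTool, and an empty result only means ExifTool found no matching values, not that the file is guaranteed clean.

```python
import json
import subprocess

# Groups (as printed by `exiftool -G`) that hold document metadata.
METADATA_GROUPS = ("PDF:", "XMP:")
# Structural tags that legitimately match between files (assumed names).
IGNORED_TAGS = ("PDFVersion", "PageCount", "Linearized")


def read_metadata(pdf_path, exiftool="exiftool"):
    """Return ExifTool's tags for one file as a dict like {'PDF:Title': ...}."""
    result = subprocess.run([exiftool, "-j", "-G", pdf_path],
                            check=True, capture_output=True, text=True)
    return json.loads(result.stdout)[0]


def _is_document_tag(key):
    """True for PDF/XMP tags that are not on the ignore list."""
    return (key.startswith(METADATA_GROUPS)
            and key.split(":", 1)[1] not in IGNORED_TAGS)


def leaked_values(old_meta, new_meta):
    """Return tags of the new file whose value also appeared, non-empty,
    in the old file's PDF/XMP metadata."""
    old_values = {str(value) for key, value in old_meta.items()
                  if _is_document_tag(key) and str(value).strip()}
    return {key: value for key, value in new_meta.items()
            if _is_document_tag(key) and str(value) in old_values}
```

Usage would be `leaked_values(read_metadata("input.pdf"), read_metadata("output.pdf"))`; any entries returned are worth inspecting by hand.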