femifrak closed this issue 9 years ago
No. PDFTK does not preserve PDF/A status. You can use Ghostscript with the -dPDFA switch to merge the files and produce PDF/A. The options needed to get a PDF/A are very fussy and Ghostscript's error messages are obtuse, so use the exact same command line that ocrmypdf does, with the options in the same order.
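As a rough illustration of the Ghostscript route, here is a minimal Python sketch that assembles a merge command. It is not the command line ocrmypdf uses; the option set follows the PDF/A example in the Ghostscript documentation, and it may need adjusting for your Ghostscript version. The function name `build_pdfa_merge_command` is made up for this example, and the `PDFA_def.ps` path is an assumption: it stands for a copy of the definition file shipped with Ghostscript, edited to point at an ICC profile.

```python
def build_pdfa_merge_command(inputs, output, pdfa_def="PDFA_def.ps", gs="gs"):
    """Return a Ghostscript argv list that merges `inputs` into one PDF/A candidate.

    Assumption: `pdfa_def` is a local, edited copy of Ghostscript's PDFA_def.ps.
    The definition file is listed before the input PDFs.
    """
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        gs,
        "-dPDFA",                         # request PDF/A output
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",  # PDF/A needs device-independent colour
        "-dPDFACompatibilityPolicy=1",    # drop non-conforming features
        f"-sOutputFile={output}",
        str(pdfa_def),
        *[str(path) for path in inputs],
    ]


# Example use (not run here):
#   import subprocess
#   from pathlib import Path
#   pdfs = sorted(Path(".").glob("*.pdf"))
#   subprocess.run(build_pdfa_merge_command(pdfs, "all.pdf"), check=True)
```

Building the argument list separately from running it makes it easy to print the command and compare it, option by option, with whatever ocrmypdf logs. Check the result with a PDF/A validator rather than relying on the blue bar alone.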
I believe that to get a PDF/A with embedded metadata you can use PDFTK to add the metadata and then Ghostscript to produce the PDF/A, but you might have to write a small PostScript stub file that contains the metadata segment and merge it with the PDF/A. The development version of ocrmypdf more or less does this right now.
On Sat, May 30, 2015 at 03:01 femifrak notifications@github.com wrote:
I like ocrmypdf very much, but I have one concern when merging several PDFs generated by ocrmypdf. Each single PDF is in PDF/A format, which Acrobat Reader indicates with a blue bar at the top.
For merging I use either
pdftk *pdf cat output all.pdf
or
gs -dBATCH -dNOPAUSE -sPAPERSIZE=A4 -sDEVICE=pdfwrite -sOutputFile="all.pdf" $(ls *.pdf)
Is the merged PDF still in PDF/A format? Acrobat does not show the blue bar any more.
The blue bar also disappears when I change the metadata:
pdftk in.pdf dump_data output info.txt
edit info.txt
pdftk in.pdf update_info info.txt output out.pdf
Although in.pdf was PDF/A, I don't know whether out.pdf follows the PDF/A convention.
Can someone give me a hint on that? And if it's not PDF/A, how can I transform it to PDF/A?
Thanks a lot,
Femi
This problem with metadata being dropped by OCRmyPDF has been fixed in v3.0-rc2.
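For anyone on an older version who wants to try the PostScript stub idea by hand, a minimal sketch follows. It only generates the text of a stub that sets document info entries through the `pdfmark` operator. The helper names `ps_string` and `docinfo_stub` are made up for this example, and it assumes ASCII-only values. This is not necessarily how ocrmypdf implements it.

```python
def ps_string(text):
    """Escape `text` as a PostScript literal string (ASCII only, by assumption)."""
    text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def docinfo_stub(info):
    """Build a pdfmark stub that sets document info entries such as Title."""
    entries = " ".join(f"/{key} {ps_string(value)}" for key, value in info.items())
    return f"[ {entries} /DOCINFO pdfmark\n"


# Example use (not run here): write the stub to a file and pass that file to
# Ghostscript alongside the input PDFs.
#   with open("metadata.ps", "w", encoding="ascii") as handle:
#       handle.write(docinfo_stub({"Title": "Merged scans", "Author": "Femi"}))
```

The backslash is escaped first so that the backslashes added for the parentheses are not doubled afterwards.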