OCR PDF Attachments?

Not currently, and it's not planned any time soon, but I think you're the second or third person to ask, so there is clearly some demand. (See also #197)

I made some notes on how to go about this, whether they're useful to you or to me as a reference when I implement it:

Ghostscript recently added PDF/A-3 support, so this is possible within Ghostscript. The approach that works today would be to modify the pdfmark file (pdfa.ps, generated by ocrmypdf/pdfa.py) to include a step that embeds the file according to the pdfmark specification: see page 30 for the /EMBED command, and this Ghostscript bug for a functioning example. Use absolute paths.
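
As a rough sketch of that extra step, something like the following could generate the pdfmark lines to append to pdfa.ps. The function names (ps_string, embed_pdfmark) and the {fstream} object name are made up for illustration and are not part of ocrmypdf/pdfa.py; the command sequence follows the /EMBED pattern in the pdfmark reference as I understand it, so check it against the spec and the Ghostscript example before relying on it.

```python
from pathlib import Path


def ps_string(text: str) -> str:
    """Quote text as a PostScript literal string, escaping backslash, ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def embed_pdfmark(attachment: Path) -> str:
    """Return pdfmark commands that embed `attachment` (hypothetical helper)."""
    path = attachment.resolve()  # absolute path, as the notes above require
    name = ps_string(path.name)
    lines = [
        # Create a named stream object to hold the attachment's bytes.
        "[ /_objdef {fstream} /type /stream /OBJ pdfmark",
        # Fill the stream from the file on disk.
        f"[ {{fstream}} {ps_string(path.as_posix())} (r) file /PUT pdfmark",
        # Register a file specification that points at the stream.
        f"[ /Name {name} /FS << /Type /Filespec /F {name} "
        "/EF << /F {fstream} >> >> /EMBED pdfmark",
        # Close the stream once it has been written.
        "[ {fstream} /CLOSE pdfmark",
    ]
    return "\n".join(lines) + "\n"
```

Calling embed_pdfmark(Path("notes.xml")) returns a block of four pdfmark commands that could be written after the existing PDF/A definitions. This only builds text; it does not run Ghostscript or check that the file exists.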

A better option would be to teach pikepdf how to embed files according to the PDF reference manual, section 7.11.4, since this would work without Ghostscript. OCRmyPDF will add pikepdf as a dependency soon (I maintain both).
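
For anyone picking this up, here is a minimal sketch of the object graph that section 7.11.4 describes, modelled with plain Python dicts rather than any pikepdf API (which does not have this yet). The function name attachment_objects and the "stream_data" key are placeholders for illustration; a real implementation would create indirect PDF objects, and PDF/A-3 adds further requirements (such as /AFRelationship) that are not shown.

```python
import hashlib
from typing import Any, Dict


def attachment_objects(filename: str, data: bytes, mime: str) -> Dict[str, Any]:
    """Model the PDF objects needed for one attachment (illustrative only)."""
    # Embedded file stream: holds the bytes plus optional metadata in /Params.
    embedded_stream = {
        "/Type": "/EmbeddedFile",
        # The MIME type is stored as a PDF name, so "/" is written as #2F.
        "/Subtype": "/" + mime.replace("/", "#2F"),
        "/Length": len(data),
        "/Params": {
            "/Size": len(data),
            "/CheckSum": hashlib.md5(data).digest(),  # 16-byte MD5 digest
        },
        "stream_data": data,  # placeholder for the stream body
    }
    # File specification dictionary: names the file and points at the stream.
    filespec = {
        "/Type": "/Filespec",
        "/F": filename,
        "/UF": filename,
        "/EF": {"/F": embedded_stream},
    }
    # Entry to merge into the document catalog: the EmbeddedFiles name tree
    # maps each attachment name to its file specification.
    return {"/Names": {"/EmbeddedFiles": {"/Names": [filename, filespec]}}}
```

The returned dict shows where the pieces hang off the catalog: /Names, then /EmbeddedFiles, then a flat array of name and file specification pairs.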

If you're able to do a PR for either I'd be happy to accept.

ocrmypdf / OCRmyPDF

OCR PDF Attachments? #259