metanorma / mn2pdf

Metanorma XML to PDF

Use File Attachment annotation for the link to the embedded file #267

Open Intelligent2013 opened 2 weeks ago

Intelligent2013 commented 2 weeks ago

Source issue: https://github.com/metanorma/metanorma/issues/407#issuecomment-2321962832

For TC4, the type of PDF annotation generated is incorrect.

It is currently a Link annotation (/Subtype /Link, supporting a URI via the /A (Action) key), whereas it should likely be a File Attachment annotation (/Subtype /FileAttachment with an /FS entry pointing to an embedded file object). Although an Embedded GoTo action is technically possible with the current Link annotation (by changing the /A entry), I'm not sure how widely that feature is supported in browser-based PDF viewers and in less widely used PDF viewers.
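To make the difference between the two annotation types concrete, here is a minimal Python sketch that only serializes the dictionaries as PDF syntax. It is not mn2pdf or Apache FOP code: the helper names, the rectangle, and the object reference `12 0 R` are hypothetical, and the `/Name /Paperclip` icon and the `/Filespec` layout (`/F`, `/UF`, `/EF`) are my reading of the PDF Reference rather than anything stated in this thread.

```python
def pdf_string(text):
    """Escape backslashes and parentheses for a PDF literal string."""
    for char in ("\\", "(", ")"):
        text = text.replace(char, "\\" + char)
    return f"({text})"


def pdf_dict(entries):
    """Serialize a {key: raw PDF value} mapping as a PDF dictionary."""
    body = " ".join(f"/{key} {value}" for key, value in entries.items())
    return f"<< {body} >>"


def link_annotation(rect, uri):
    """Link annotation whose /A entry is a URI action."""
    action = pdf_dict({"S": "/URI", "URI": pdf_string(uri)})
    return pdf_dict(
        {"Type": "/Annot", "Subtype": "/Link", "Rect": rect, "A": action}
    )


def file_specification(filename, stream_ref):
    """File specification whose /EF entry points at the embedded file stream."""
    embedded = pdf_dict({"F": stream_ref})
    return pdf_dict(
        {
            "Type": "/Filespec",
            "F": pdf_string(filename),
            "UF": pdf_string(filename),
            "EF": embedded,
        }
    )


def file_attachment_annotation(rect, filespec_ref, icon="Paperclip"):
    """File Attachment annotation: /FS references a file specification, no /A."""
    return pdf_dict(
        {
            "Type": "/Annot",
            "Subtype": "/FileAttachment",
            "Rect": rect,
            "FS": filespec_ref,
            "Name": f"/{icon}",
        }
    )


if __name__ == "__main__":
    # Hypothetical rectangle and indirect object reference, for illustration only.
    print(link_annotation("[100 700 120 720]", "https://example.org/file.mml"))
    print(file_attachment_annotation("[100 700 120 720]", "12 0 R"))
```

The point of the sketch is that the File Attachment annotation carries no action at all: the viewer resolves the attachment through /FS, so nothing depends on how a viewer handles URI, Embedded GoTo, or JavaScript actions.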

This is how Apache FOP currently works (source code: https://github.com/apache/xmlgraphics-fop/blob/c11f43c9dbf529b87820d77ef501aa10699fc9d8/fop-core/src/main/java/org/apache/fop/pdf/PDFFactory.java#L699): it adds a JavaScript action to open the embedded file:

PDFBox PDFDebugger: image

Adobe Acrobat: image

From PDF Reference 1.7: image

image

To do:

Intelligent2013 commented 3 days ago

The FileAttachment annotation works differently from the current approach (hyperlinked text with a JavaScript function):

Example from ISO_32000-2_sponsored-ec2.pdf with two 'paper clips' links: image

Another very simple PDF with /Subtype /FileAttachment: structure-attached.pdf (found somewhere on GitHub; it opens only in Adobe Acrobat and is not a well-formed PDF)

Additional tasks:

Intelligent2013 commented 1 day ago

Current progress, proof-of-concept example: test_attachments.tc4.presentation.pdf

  1. Adobe Acrobat shows the message: image

The double-click on the 'Paperclip' icon doesn't work until the user clicks the 'Enable editing' button.

  2. The 'Attachment' tab contains two files: the attached file and a second one that appears when the annotation is created. I'll investigate how to omit the second one.

image

Note: We can't generate PDF/A-3 for the BIPM brochure, because Apache FOP implements the link to the embedded .mml file via a JavaScript function, which PDF/A-3 doesn't allow.
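As a rough illustration of the constraint in the note above, the sketch below flags annotations whose action type PDF/A does not permit. Annotations are modelled as plain Python dicts, which is an assumption made for illustration and not how mn2pdf or Apache FOP represents them. The forbidden set is deliberately partial: JavaScript comes from the note above, and Launch is added from my recollection of the PDF/A rules.

```python
# Partial list, for illustration only; the PDF/A specification forbids more types.
FORBIDDEN_ACTION_TYPES = {"JavaScript", "Launch"}


def forbidden_actions(annotations):
    """Return (index, action type) for each annotation with a forbidden /A action.

    Only the top-level action is inspected; chained /Next actions are ignored.
    """
    hits = []
    for index, annotation in enumerate(annotations):
        action = annotation.get("A")
        if action and action.get("S") in FORBIDDEN_ACTION_TYPES:
            hits.append((index, action["S"]))
    return hits


if __name__ == "__main__":
    # Hypothetical page annotations: a JavaScript link and a File Attachment.
    annotations = [
        {"Subtype": "Link", "A": {"S": "JavaScript", "JS": "/* script */"}},
        {"Subtype": "FileAttachment", "FS": "12 0 R"},
    ]
    print(forbidden_actions(annotations))
```

A File Attachment annotation has no /A entry, so it passes this particular check, which is why switching annotation types would remove this blocker for PDF/A-3 output.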

petervwyatt commented 1 day ago

That last file is a special file from Ange (a file format expert who experiments with minimal files)...

Other PDF files with file attachments on the public web:

petervwyatt commented 16 hours ago

@Intelligent2013

The double-click on the 'Paperclip' icon doesn't work until the user clicks the 'Enable editing' button.

Correct. That is a PDF/A formal requirement to ensure that any user interaction (such as extracting an embedded file) cannot / does not mess with page content. This is not a bug :-)