boazsegev / combine_pdf

A pure Ruby library to merge PDF files, number pages and maybe more...
MIT License

Question - Can I save an Accessible PDF without losing the accessibility details such as Lang: #142

Closed by nacarp 6 years ago

nacarp commented 6 years ago

Can I save an Accessible PDF without losing the accessibility details such as Lang?

Since I can't create an Accessible PDF in Prawn, I hope to append a text- and table-only PDF to an existing empty PDF that was saved in Acrobat as Accessible, with a default language set.

With content options override on, will all the properties of the original PDF be carried over? As this is pure Ruby, as others have said, I will be interested in contributing to this project.

Thank you, Neil.

boazsegev commented 6 years ago

Hi @nacarp ,

Thank you for your question and your offer to contribute.

> Can I save an Accessible PDF without losing the accessibility details such as Lang?

The short answer to your question would be "probably not".

The "accessibility" features aren't really supported. Some of them are tied to page data and some to the file itself, so the file-level features can't be "copied" along with any specific page. This breaks the "combine" feature when combining individual pages.

> With content options override on, will all the properties of the original PDF be carried over?

The code is very flexible, especially when combining full PDF objects (vs. pages).

The :allow_optional_content option forces the code to make best-guess adjustments when loading and merging PDFs. However, data might still be dropped (in whole or in part) because of the difficulties with the "optional content" feature.

It's quite possible that the code's best attempt to preserve data will be good enough to carry over the "accessibility" features. This is more likely when the PDF containing these features is the target (the file being appended to) rather than the source (the file being appended).
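As a rough sketch of that target-first approach: load the Acrobat-saved accessible file with `allow_optional_content: true`, append the Prawn-generated file to it, and save. The method name and the file names in the usage line are hypothetical, and this assumes the `combine_pdf` gem is installed. It illustrates the loading order only and is no guarantee that `Lang` or the tagging survives.

```ruby
# Sketch only: append a plain PDF to an accessible "target" PDF.
# The method name is hypothetical; the library calls used are
# CombinePDF.load, PDF#<< and PDF#save.
def append_to_accessible_target(target_path, source_path, output_path)
  # Requires the combine_pdf gem (gem install combine_pdf).
  require 'combine_pdf'

  # Load the accessible file first so it is the object being extended.
  # allow_optional_content asks the parser for a best-guess load.
  target = CombinePDF.load(target_path, allow_optional_content: true)

  # Append every page of the source (for example, Prawn output).
  target << CombinePDF.load(source_path)

  target.save(output_path)
  output_path
end
```

Usage might look like `append_to_accessible_target('accessible_empty.pdf', 'prawn_table.pdf', 'combined.pdf')`, with the output then checked in Acrobat's accessibility tools.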

Having said that, it would be a "case by case" occurrence: it might work with some PDF files but not others.
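Because the result varies per file, it may help to check each saved output. The sketch below is a crude heuristic and not part of CombinePDF: it scans the raw bytes for a few keys commonly associated with tagged PDFs (`/Lang`, `/MarkInfo`, `/StructTreeRoot`). The helper names are hypothetical. A key stored inside a compressed object stream would be missed, and a match does not prove the file is still accessible, so treat it as a first filter before a real accessibility check.

```ruby
# Keys commonly associated with tagged (accessible) PDFs.
ACCESSIBILITY_KEYS = %w[/Lang /MarkInfo /StructTreeRoot].freeze

# Returns a hash of key => true/false depending on whether the key's
# name appears anywhere in the given PDF bytes (plain substring scan).
def accessibility_keys_present(pdf_bytes)
  ACCESSIBILITY_KEYS.to_h { |key| [key, pdf_bytes.include?(key)] }
end

# Convenience wrapper: run the same scan on a file on disk.
def accessibility_report(path)
  accessibility_keys_present(File.binread(path))
end
```

Running `accessibility_report` on both the original accessible file and the combined output shows which of these keys, if any, were dropped along the way.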

Again, officially CombinePDF doesn't support optional content PDFs.

Kindly, Bo.