LibrePDF / OpenPDF

OpenPDF is a free Java library for creating and editing PDF files, with an LGPL and MPL open source license. OpenPDF is based on a fork of iText. We welcome contributions from other developers. Please feel free to submit pull requests and bug reports to this GitHub repository.

Validation error is caused by the use of item /ITXT inside PDF file #303

Closed justinasbardauskas closed 4 years ago

justinasbardauskas commented 4 years ago

Hello,

We use OpenPDF (1.3.11) with JasperReports to create PDF files, which are then sent to a third-party document signing provider. The problem is that the PDF contains an invalid "/ITXT" item, which causes validation to fail because it is not whitelisted for use in Denmark. Currently I'm using PdfBox as a workaround to remove it.

The cause of this issue is:

Currently I see a few possible solutions:

  1. Remove it entirely. Please correct me if I'm wrong, but I do not see any benefit in having this item.
  2. Introduce mechanism or some parameters that would allow to control it.
  3. Move this item to a more appropriate place, e.g. into the document metadata, if PDF files have such an option.

Steps to reproduce:

  1. Create simple PDF file e.g. use hello world example.
  2. Validate PDF file using PDF validator provided here.

mkl-public commented 4 years ago

The problem is that PDF contains invalid "/ITXT"

ITXT is a valid name.

Nonetheless there was a recent discussion to replace it because it is linked to the original developer of the iText library, see https://github.com/LibrePDF/OpenPDF/issues/261 ... perhaps your issue pushes that discussion somewhat further

it is not whitelisted for use in Denmark

You might want to explain this. Who white-lists PDF names in which contexts and for which use cases in Denmark?

The iText library versions using this marker are used in many programs, surely also in Denmark, not caring about any "white-listing"...


Ok, I looked around on that https://www.nets.eu/dk-da/kundeservice/nemid-tjenesteudbyder/The-NemID-service-provider-package/ page. Their approach of restricting the PDF format of the files they accept for signing by white-listing PDF names without context is extremely weird.

Nonetheless, to allow OpenPDF to create NemID-supported PDFs, it does not suffice to replace the ITXT key with one registered to OpenPDF as discussed in https://github.com/LibrePDF/OpenPDF/issues/261 ; the entry must be fully removed or use a white-listed key, i.e. a key meant for something else.
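
To illustrate why renaming the key would not help, here is a minimal sketch of a context-free name check. It is an assumption about how such a validator might behave, not the NemID implementation: the class `NameWhitelistSketch`, the placeholder whitelist, and the sample dictionary text are all hypothetical, and the scan is deliberately naive (it ignores streams and string contents).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, naive name checker; not the actual NemID validator. */
final class NameWhitelistSketch {
    // A PDF name is a solidus followed by regular characters
    // (anything except whitespace and the delimiters ()<>[]{}/%).
    private static final Pattern NAME = Pattern.compile("/([^\\s()<>\\[\\]{}/%]+)");

    /** Returns every name in the text that is not on the whitelist, ignoring context. */
    static Set<String> rejectedNames(String pdfSyntax, Set<String> allowed) {
        Set<String> rejected = new LinkedHashSet<>();
        Matcher m = NAME.matcher(pdfSyntax);
        while (m.find()) {
            String name = m.group(1);
            if (!allowed.contains(name)) {
                rejected.add(name);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // Placeholder whitelist; the real list is not reproduced in this thread.
        Set<String> allowed = new HashSet<>(Arrays.asList("Type", "Pages", "Kids", "Count"));
        // Illustrative page-tree root carrying the version marker.
        String pagesDict = "<< /Type /Pages /Kids [3 0 R] /Count 1 /ITXT (1.3.11) >>";
        System.out.println(rejectedNames(pagesDict, allowed));
    }
}
```

Under such a check, any key that is not on the list is rejected regardless of where it appears, so a newly registered OpenPDF-specific key would fail in the same way as ITXT.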

justinasbardauskas commented 4 years ago

@mkl-public Thanks for your effort.
As I understand it, they whitelisted items to ensure the rule "What You See Is What You Sign".

the entry must be fully removed or use a white-listed key, i.e. a key meant for something else.

Removal is only a last-resort option, and there is always the backward-compatibility question. In my opinion, OpenPDF needs a new feature that would allow items to be excluded/disabled, meaning configured items won't be included in the PDF, e.g. openpdf.config.pdf.item.exclude=ITXT,SOME_OTHER_NAME,... OpenPDF would benefit greatly from a configuration layer, which could be used to implement PDF item management.

My proposal would be of two parts:

  1. Implement a configuration layer, which can be similar to the one in JasperReports, where a JasperReportsContext is passed as a mandatory parameter (e.g. to the constructor) all over the code and provides access to the configuration. It would make sense to have configuration resources for default configs - openpdf.default.properties (or openpdf.default.yml) - and for overridden configs - openpdf.properties (or openpdf.yml).
  2. Based on it, implement PDF item management. For example, a config property openpdf.config.pdf.item.exclude could be defined and used inside PdfPages like:
    // ContextConfig is the proposed (not yet existing) configuration accessor.
    ContextConfig config = ...
    ...
    else {
        List<String> excludedItems = config.getProperty("openpdf.config.pdf.item.exclude");
        // Write the version marker only if ITXT has not been excluded.
        if (!excludedItems.contains(PdfName.ITXT.getName())) {
            top.put(PdfName.ITXT, new PdfString(Document.getRelease()));
        }
        // Maybe it would make more sense to move the exclusion check into the
        // `PdfDictionary` class itself.
    }
    ...

    With this feature, everyone could configure OpenPDF to their needs.
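
The two parts of the proposal can be sketched together using only `java.util.Properties`. This is a rough illustration under stated assumptions, not existing OpenPDF code: the class `OpenPdfConfigSketch` and its methods are hypothetical, and the two properties resources are passed in as strings here instead of being read from openpdf.default.properties and openpdf.properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical configuration holder; OpenPDF has no such class. */
final class OpenPdfConfigSketch {
    /** Property name taken from the proposal. */
    static final String EXCLUDE_KEY = "openpdf.config.pdf.item.exclude";

    private final Properties properties;

    private OpenPdfConfigSketch(Properties properties) {
        this.properties = properties;
    }

    /** Layers user overrides on top of defaults; both are in .properties syntax. */
    static OpenPdfConfigSketch load(String defaultsText, String overridesText) {
        try {
            Properties defaults = new Properties();
            defaults.load(new StringReader(defaultsText));
            // Lookups that miss in 'merged' fall back to 'defaults'.
            Properties merged = new Properties(defaults);
            merged.load(new StringReader(overridesText));
            return new OpenPdfConfigSketch(merged);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Parses the comma-separated exclusion list into a set of item names. */
    Set<String> excludedItems() {
        String raw = properties.getProperty(EXCLUDE_KEY, "");
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    /** True if the named item should still be written to the PDF. */
    boolean isItemEnabled(String itemName) {
        return !excludedItems().contains(itemName);
    }

    public static void main(String[] args) {
        OpenPdfConfigSketch config = load(
                "openpdf.config.pdf.item.exclude=\n",
                "openpdf.config.pdf.item.exclude=ITXT, SOME_OTHER_NAME\n");
        System.out.println(config.excludedItems());
        System.out.println(config.isItemEnabled("ITXT"));
    }
}
```

With something like this in place, the check in PdfPages would reduce to a call such as `config.isItemEnabled("ITXT")` before writing the entry. Keeping the default list empty preserves today's output for users who configure nothing, which addresses the backward-compatibility concern.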

I would like to get more opinions (pros and cons) on this topic and maybe we will agree on this (or similar) feature.

mkl-public commented 4 years ago

As I understand it, they whitelisted items to ensure the rule "What You See Is What You Sign".

Yes, but by keeping it simple for themselves and ignoring the context in which a name is used, they forbade much too much.

Interestingly enough, they even effectively forbade PAdES signatures (if the provided white-listing is still current and complete), ensuring non-interoperability with standard European PDF signatures. I wonder whether this was actually done by design...

My proposal would be...

Adding configurable options is ok. I merely wouldn't tie the options to PDF names as closely as your proposal description (though not your proposed mechanism itself) implies. I'd rather think of abstract identifiers here.
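
One way to read "abstract identifiers" is an enum of optional output features, each of which maps internally to whatever key the library currently writes. The sketch below is only an illustration of that idea: `OptionalOutputFeature`, `OutputSettingsSketch`, and the constant name are hypothetical and do not exist in OpenPDF.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical feature identifiers; none of these exist in OpenPDF. */
enum OptionalOutputFeature {
    /** The library version marker, currently written under the ITXT key. */
    LIBRARY_VERSION_MARKER("ITXT");

    private final String pdfKey;

    OptionalOutputFeature(String pdfKey) {
        this.pdfKey = pdfKey;
    }

    /** The PDF key this feature currently maps to; callers need not know it. */
    String pdfKey() {
        return pdfKey;
    }
}

/** Hypothetical settings object that toggles features by identifier, not by name. */
final class OutputSettingsSketch {
    private final Set<OptionalOutputFeature> disabled =
            EnumSet.noneOf(OptionalOutputFeature.class);

    OutputSettingsSketch disable(OptionalOutputFeature feature) {
        disabled.add(feature);
        return this;
    }

    boolean isEnabled(OptionalOutputFeature feature) {
        return !disabled.contains(feature);
    }

    public static void main(String[] args) {
        OutputSettingsSketch settings = new OutputSettingsSketch()
                .disable(OptionalOutputFeature.LIBRARY_VERSION_MARKER);
        System.out.println(settings.isEnabled(OptionalOutputFeature.LIBRARY_VERSION_MARKER));
    }
}
```

If the key were later renamed, as discussed in issue #261, only the enum's mapping would change and existing user configuration would keep working.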