veraPDF / veraPDF-library

Industry supported, open source PDF/A validation library
http://verapdf.org/software
GNU General Public License v3.0

New version throws error because of QR code #1387

Closed cristalp closed 9 months ago

cristalp commented 9 months ago

We generate invoices with QR codes, which are used for payments in Switzerland. These QR codes are generated using SwissQRBill. However, since upgrading from veraPDF 1.20.1 to 1.24.1, we get errors when validating our PDFs.

Those PDFs must conform to PDF/A-2b within our company, and this is also what we use veraPDF for (and thank you a lot for the library!).

The error we get is: TestAssertion [ruleId=RuleId [specification=ISO 19005-2:2011, clause=6.1.7.1, testNumber=1], status=failed, message=The value of the Length key specified in the stream dictionary shall match the number of bytes in the file following the LINE FEED (0Ah) character after the stream keyword and preceding the EOL marker before the endstream keyword., location=Location [level=CosDocument, context=root/indirectObjects[0](16 0)/directObject[0]], locationContext=null, errorMessage=null]

I have no idea what this means :-) Perhaps you could explain what the problem is?

Since I can't upload actual invoices to GitHub, I created a fake invoice using https://www.codecrete.net/qrbill/ and then converted it to PDF/A-2b using https://tools.pdf24.org/en/pdf-to-pdfa. Note: In our software, we do the conversion using Spire.PDF for Java, but the error is also shown with the online PDF converter.

Strangely though, if I validate that test PDF using your REST Client, it passes as valid PDF/A-2b.

I would be very thankful for any hints regarding the QR codes: Are they faulty? Or is veraPDF reporting that error even though it shouldn't? How do the validations differ between the Java library and the online validator?

Thanks!

Attached: qrbill-2b.pdf

bdoubrov commented 9 months ago

@cristalp the attached file has no validation errors in 1.24.1 if we use any of the tools: CLI, GUI, Rest (same as demo.verapdf.org), code samples from https://docs.verapdf.org/develop/ .

So, I think the PDFs you generate are perfectly OK. I wonder which code you use for validation? Another possibility is that the files get corrupted by some communication protocol, for example if a PDF file is treated as an ASCII file instead of a binary one.
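
To illustrate how that kind of corruption could trigger the clause 6.1.7.1 failure quoted above, here is a minimal sketch. It is not veraPDF's implementation: the class StreamLengthDemo, its helper methods, and the tiny hand-built stream object are all hypothetical, and the parsing is deliberately naive. It simulates a text-mode transfer that rewrites every LF byte as CR LF and shows that the bytes between stream and endstream then no longer match the declared Length.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; nothing here is part of the veraPDF API.
public class StreamLengthDemo {

    // Builds a toy stream object whose 4-byte payload happens to contain one 0x0A byte.
    public static byte[] sampleObject() {
        byte[] head = "<< /Length 4 >>\nstream\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] payload = {0x01, 0x0A, 0x02, 0x03};
        byte[] tail = "\nendstream".getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(head, 0, head.length);
        out.write(payload, 0, payload.length);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    // Simulates a text-mode ("ASCII") transfer: every LF byte becomes CR LF,
    // including LF bytes that are really binary payload.
    public static byte[] lfToCrlf(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : data) {
            if (b == '\n') {
                out.write('\r');
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    // Reads the number that follows the /Length key in the stream dictionary.
    public static int declaredLength(byte[] obj) {
        // ISO-8859-1 maps every byte to exactly one char, so indices stay byte-accurate.
        String s = new String(obj, StandardCharsets.ISO_8859_1);
        Matcher m = Pattern.compile("/Length\\s+(\\d+)").matcher(s);
        if (!m.find()) {
            throw new IllegalArgumentException("no /Length key found");
        }
        return Integer.parseInt(m.group(1));
    }

    // Counts the bytes after the EOL that follows "stream" and before the EOL
    // that precedes "endstream". Assumes the payload never contains "stream".
    public static int actualLength(byte[] obj) {
        String s = new String(obj, StandardCharsets.ISO_8859_1);
        int start = s.indexOf("stream") + "stream".length();
        if (s.charAt(start) == '\r') start++;
        if (s.charAt(start) == '\n') start++;
        int end = s.indexOf("endstream", start);
        if (end > start && s.charAt(end - 1) == '\n') end--;
        if (end > start && s.charAt(end - 1) == '\r') end--;
        return end - start;
    }

    public static void main(String[] args) {
        byte[] original = sampleObject();
        byte[] corrupted = lfToCrlf(original);
        System.out.println("original:  declared=" + declaredLength(original)
                + " actual=" + actualLength(original));
        System.out.println("corrupted: declared=" + declaredLength(corrupted)
                + " actual=" + actualLength(corrupted));
    }
}
```

In the intact object the declared and counted lengths agree. After the simulated conversion the payload has grown by one byte while the dictionary still says 4, which is the kind of mismatch the rule reports. Comparing the byte size or checksum of the file on disk with the bytes actually handed to the validator is a quick way to rule this out.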

cristalp commented 9 months ago

Hmmm... interesting! So I rewrote my test to be as simple as possible:

public class QrCodeTest {

  static {
    VeraGreenfieldFoundryProvider.initialise();
  }

  @Test
  public void testValidation() throws IOException, ModelParsingException, EncryptedPdfException, ValidationException {
    final String pdf = "/your/path/to/test/files/qrbill-2b.pdf";
    try (VeraPDFFoundry foundriesInstance = Foundries.defaultInstance();
        InputStream inputStream = new FileInputStream(pdf)) {
      final PDFAFlavour flavour = PDFAFlavour.PDFA_2_B;
      try (PDFAParser parser = foundriesInstance.createParser(inputStream, flavour)) {
        try (PDFAValidator validator = foundriesInstance.createValidator(flavour, false)) {
          final ValidationResult result = validator.validate(parser);
          if (!result.isCompliant()) {
            for (final TestAssertion assertion : result.getTestAssertions()) {
              if (assertion.getStatus() == Status.FAILED) {
                final String message = assertion.getErrorMessage() + assertion.getMessage();
                fail(message);
              }
            }
          }
        }
      }
    }
  }

}

And this code passes! So obviously, the error must be in my software. We also have PDFBox for validation and an abstraction to choose between the validation engines. The error must be there.

Thanks for looking into this and sorry for wasting your time!

cristalp commented 9 months ago

I forgot to mention: Please close the issue! :-)