ZUGFeRD / mustangproject

Open Source Java e-Invoicing library, validator and tool (Factur-X/ZUGFeRD, UNCEFACT/CII XRechnung)
http://www.mustangproject.org
Apache License 2.0

validator fails on PDF/A-3 file that VeraPDF claims is valid #139

Open mbunkus opened 4 years ago

mbunkus commented 4 years ago

I'm looking into creating PDF/A files via LaTeX for use with Mustang's "combine PDF & XML" feature. Unfortunately the PDF/A file created by pdfLaTeX (via the pdfx package) cannot be used with Mustang as its validator rejects it:

[0 mbunkus@chai-latte ~/dl] java -jar ~/prog/mustangproject/target/mustang-1.7.5-SNAPSHOT.jar -c --source ~/dl/invoice-test.pdf --source-xml ~/dl/ZUGFeRD-invoice.xml --out ~/dl/zug.pdf --format zf --version 2 --profile W
Picked up _JAVA_OPTIONS: -Dawt.useSystemAAFontSettings=on -Dswing.aatext=true
Source PDF set to /home/mbunkus/dl/invoice-test.pdf
ZUGFeRD XML set to /home/mbunkus/dl/ZUGFeRD-invoice.xml
Ouput PDF set to /home/mbunkus/dl/zug.pdf
Format set to zf
Version set to 2
Profile set to W
Nov 05, 2019 11:30:18 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font ArialMT for base font Symbol
Nov 05, 2019 11:30:18 AM org.apache.pdfbox.pdmodel.font.PDType1Font <init>
WARNING: Using fallback font ArialMT for base font ZapfDingbats
Nov 05, 2019 11:30:18 AM org.mustangproject.toecount.Toecount performCombine
SEVERE: null
java.io.IOException: File is not a valid PDF/A input file
        at org.mustangproject.ZUGFeRD.ZUGFeRDExporterFromA3Factory.ensurePDFIsValidPDFA(ZUGFeRDExporterFromA3Factory.java:106)
        at org.mustangproject.ZUGFeRD.ZUGFeRDExporterFromA3Factory.load(ZUGFeRDExporterFromA3Factory.java:65)
        at org.mustangproject.toecount.Toecount.performCombine(Toecount.java:546)
        at org.mustangproject.toecount.Toecount.main(Toecount.java:327)

However, both VeraPDF & FoxIt PDF state that the file is indeed a valid PDF/A-3b file:

[0 mbunkus@chai-latte ~/dl] ~/opt/verapdf/verapdf invoice-test.pdf
Picked up _JAVA_OPTIONS: -Dawt.useSystemAAFontSettings=on -Dswing.aatext=true
<?xml version="1.0" encoding="utf-8"?>
<report>
  <buildInformation>
    <releaseDetails id="core" version="1.14.105" buildDate="2019-10-24T22:54:00+02:00"></releaseDetails>
    <releaseDetails id="validation-model" version="1.14.105" buildDate="2019-10-24T22:59:00+02:00"></releaseDetails>
    <releaseDetails id="gui" version="1.14.8" buildDate="2019-10-24T23:11:00+02:00"></releaseDetails>
  </buildInformation>
  <jobs>
    <job>
      <item size="45145">
        <name>/home/mbunkus/dl/invoice-test.pdf</name>
      </item>
      <validationReport profileName="PDF/A-3B validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
        <details passedRules="123" failedRules="0" passedChecks="2498" failedChecks="0"></details>
      </validationReport>
      <duration start="1572949343038" finish="1572949343486">00:00:00.448</duration>
    </job>
  </jobs>
  <batchSummary totalJobs="1" failedToParse="0" encrypted="0">
    <validationReports compliant="1" nonCompliant="0" failedJobs="0">1</validationReports>
    <featureReports failedJobs="0">0</featureReports>
    <repairReports failedJobs="0">0</repairReports>
    <duration start="1572949342961" finish="1572949343506">00:00:00.545</duration>
  </batchSummary>
</report>
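For anyone who wants to check such a report programmatically rather than by eye, here is a minimal sketch using only the JDK's XML parser. It assumes the report layout shown above (`validationReport` elements carrying an `isCompliant` attribute); the class and method names are made up for illustration, and other veraPDF versions may lay out their reports differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of veraPDF or Mustang.
final class VeraPdfReportCheck {

    /** True only if the report contains at least one validationReport and all of them are compliant. */
    static boolean allCompliant(String reportXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(reportXml.getBytes(StandardCharsets.UTF_8)));
            NodeList reports = doc.getElementsByTagName("validationReport");
            if (reports.getLength() == 0) {
                return false; // nothing was validated, so do not report success
            }
            for (int i = 0; i < reports.getLength(); i++) {
                Element report = (Element) reports.item(i);
                if (!"true".equals(report.getAttribute("isCompliant"))) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable XML report", e);
        }
    }

    public static void main(String[] args) {
        // Trimmed-down version of the report above.
        String sample = "<report><jobs><job>"
                + "<validationReport profileName=\"PDF/A-3B validation profile\" isCompliant=\"true\"/>"
                + "</job></jobs></report>";
        System.out.println(allCompliant(sample)); // true
    }
}
```

Note that the input must be the XML report only; the "Picked up _JAVA_OPTIONS" line in the console output above is not part of it.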

I'm using Mustang revision c0b199a1bd76fbaa33539d8b189114b73e5ba991, but release 1.7.4 shows the same problem.

I've uploaded the PDF here.

jstaerk commented 4 years ago

Hi, my VeraPDF 1.10.6 GUI seems to have an issue with the file as well :-/ Besides, please note that the Mustang command line expects an A1 file (and converts it to A3); with Mustang as a library you can also read A3 and write A3. ZUV of course only checks A3.

kind regards Jochen
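Since the command line expects PDF/A-1 input while the library can also read PDF/A-3, it can help to see which part a file claims before handing it over. The following is a rough sketch, assuming the XMP metadata packet is stored uncompressed and in UTF-8 (usual for PDF/A files, but not guaranteed for arbitrary PDFs). The class and method names are made up for illustration. It only reads the `pdfaid:part` claim from the metadata and does not validate anything.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mustang, PDFBox or veraPDF.
final class PdfaPartSniffer {

    // XMP may store the part as an element (<pdfaid:part>3</pdfaid:part>)
    // or as an attribute (pdfaid:part="3"); this pattern accepts both forms.
    private static final Pattern PART =
            Pattern.compile("pdfaid:part\\s*(?:=\\s*[\"']|>)\\s*(\\d)");

    /** Returns the PDF/A part the file claims in its XMP metadata, if one is found. */
    static OptionalInt claimedPart(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so binary content cannot break decoding.
        String text = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher m = PART.matcher(text);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        OptionalInt part = claimedPart(bytes);
        System.out.println(part.isPresent()
                ? "File claims PDF/A-" + part.getAsInt()
                : "No pdfaid:part found in plain text");
    }
}
```

A file that claims part 3 here is the case where the command line's A1 expectation would not be met; whether the claim is actually true still has to be checked with a real validator.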

andrm commented 4 years ago

I'm having the same problem. VeraPDF 1.12.1 says everything is OK, but the Mustang library says the file is not compliant. I'm using Apache FOP 2.4. How can I find out what is wrong?

java.io.IOException: File is not a valid PDF/A input file
        at org.mustangproject.ZUGFeRD.ZUGFeRDExporterFromA3Factory.ensurePDFIsValidPDFA(ZUGFeRDExporterFromA3Factory.java:106)
        at org.mustangproject.ZUGFeRD.ZUGFeRDExporterFromA3Factory.load(ZUGFeRDExporterFromA3Factory.java:79)

andrm commented 4 years ago

I used PDFBox to run the verification and output the errors. Apache FOP adds a trailer with XREFs in it, which seems to be forbidden in PDF/A-3a. I changed the generation in FOP to PDF/A-1a and then it works in Mustang.

heisej commented 4 years ago

Same problem for me. I generated a PDF/A-3b document with Ghostscript which is reported as valid by ZUV / VeraPDF and pdf-online.com but rejected by the library. As for andrm, it works like a charm if I generate a PDF/A-1b file instead.

linkdermink commented 4 years ago

Hi, I have a question about the following command line: mustang-1.7.5.jar -e --source " + Datei + " --out " + XML

This is executed after I open a ZUGFeRD XML.

Would that also work with AdoptOpenJDK? I couldn't find any detailed information.

Thank you, and sorry for my English :D
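As a side note on building that command line by string concatenation: a minimal sketch of passing the same arguments as a list, so that paths containing spaces need no manual quoting. Only the `-e`, `--source` and `--out` flags are taken from the command above; the class name, method name and file names are made up for illustration, and the command runs on whichever `java` is on the PATH.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mustang.
final class MustangExtractCommand {

    /** Builds the argument list for the extract call, one list element per argument. */
    static List<String> build(String jarPath, String sourcePdf, String outXml) {
        return Arrays.asList("java", "-jar", jarPath, "-e", "--source", sourcePdf, "--out", outXml);
    }

    public static void main(String[] args) {
        List<String> cmd = build("mustang-1.7.5.jar", "my invoice.pdf", "invoice.xml");
        System.out.println(String.join(" | ", cmd));
        // To actually run it (requires the jar and the input file to exist):
        // int exitCode = new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Because each argument is its own list element, `ProcessBuilder` hands "my invoice.pdf" to the JVM as a single argument without any escaping.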

jstaerk commented 4 years ago

Hi

Would that also work with AdoptOpenJDK? I couldn't find any detailed information.

If it doesn't, please file a separate bug report.

kind regards, Jochen

dularion commented 4 years ago

We have the same issue with the attached PDF, using version 1.7.8 of Mustang.

VeraPDF says it's valid:

<?xml version="1.0" encoding="utf-8"?>
<report>
  <buildInformation>
    <releaseDetails id="core" version="1.16.1" buildDate="2020-05-12T00:43:00+02:00"></releaseDetails>
    <releaseDetails id="validation-model" version="1.16.1" buildDate="2020-05-12T00:46:00+02:00"></releaseDetails>
    <releaseDetails id="gui" version="1.16.1" buildDate="2020-05-12T00:59:00+02:00"></releaseDetails>
  </buildInformation>
  <jobs>
    <job>
      <item size="163575">
        <name>/dev/projects/kcenter/src/main/java/MustangGnuaccountingBeispielRE-20170509_505.pdf</name>
      </item>
      <validationReport profileName="PDF/A-3U validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
        <details passedRules="126" failedRules="0" passedChecks="11201" failedChecks="0"></details>
      </validationReport>
      <duration start="1600778551182" finish="1600778551694">00:00:00.512</duration>
    </job>
  </jobs>
  <batchSummary totalJobs="1" failedToParse="0" encrypted="0">
    <validationReports compliant="1" nonCompliant="0" failedJobs="0">1</validationReports>
    <featureReports failedJobs="0">0</featureReports>
    <repairReports failedJobs="0">0</repairReports>
    <duration start="1600778551119" finish="1600778551711">00:00:00.592</duration>
  </batchSummary>
</report>

MustangGnuaccountingBeispielRE-20170509_505.pdf

svanteschubert commented 1 year ago

VeraPDF should be used for validation instead of PDFBox, whose validator only implements a parser compliant with the ISO 19005-1 specification (PDF/A-1) and only checks compliance with PDF/A-1b: https://pdfbox.apache.org/1.8/cookbook/pdfavalidation.html

jstaerk commented 1 year ago

VeraPDF should be used for validation instead of PDFBox, whose validator only implements a parser compliant with the ISO 19005-1 specification (PDF/A-1) and only checks compliance with PDF/A-1b: https://pdfbox.apache.org/1.8/cookbook/pdfavalidation.html

As already mentioned bilaterally, I use PDFBox as the validator a) for historical reasons and b) because VeraPDF is big and therefore not included in the mustang-library, only in the all-in-one bundle mustang-validator. Please feel free to send me a PR which replaces PDFBox with VeraPDF in the validator version, leaving it intact for the library, and/or upstream your errors to PDFBox.