pdf-association / arlington-pdf-model

A vendor- and implementation-independent specification-derived, machine-readable model of PDF.
Apache License 2.0
77 stars 6 forks source link

Minor rule issues #36

Closed faceless2 closed 2 years ago

faceless2 commented 2 years ago

I have now run a check over about 800 files - first observation is there are a lot of faults in a lot of PDFs! And I agree with the point you made a few weeks back - the version number in the file is basically fictional.

I did find a few faults in the model, but not looking too bad overall. There are undoubtedly more, but there is just so much noise from actual failures they're hard to spot.

petervwyatt commented 2 years ago

Signature - the SubFilter field makes "adbe.x509.rsa_sha1" allowed after 1.6, should always allowed (been there since 1.3)

ISO 32000-2:2020 Table 255 has an italic "(PDF 1.6)" indicator which is therefore incorrect. ISO 32000-1:2008 is the same presumably because Adobe PDF 1.7 said "_PDF 1.6 defines the following values for public-key cryptographic signatures: adbe.x509.rsa_sha1, adbe.pkcs7.detached, and adbe.pkcs7.sha1_" . Adobe PDF 1.3 and PDF 1.4 fail to mention any values for SubFilter (even searching for just "x509" returns zero results) - PDF 1.3 Table 7.53 and PDF 1.4 Table 8.60 do not list any values. Adobe PDF 1.5 Table 8.93 was the first PDF reference that explicitly mentioned the 3 values incl. "adbe.x509.rsa_sha". So from a specification-centric PoV the first doco for the value was in PDF 1.5, not PDF 1.6.

I also note that the sentence starting "Other values may be ..." should really be a new para in Table 255 in ISO 32000-2:2020 - it is not PDF 2.0 only.

petervwyatt commented 2 years ago

See Errata #219

petervwyatt commented 2 years ago

For VRI, I don't think purely optional is quite the same. I think it means that if the parent DSS dictionary have non-empty OSCPs or CRLs arrays then those keys are required in the VRI dictionary. I've captured this as fn:IsRequired(fn:ArrayLength(parent::CRLs)>0)

petervwyatt commented 2 years ago

See Errata #191 for BorderThickness in StandardLayoutAttribute*. I agree it should be a number, but the spec is ambiguous if it is >0 or >=0!

petervwyatt commented 2 years ago

For XObjectFormType1, Resources: clause 7.8.3 was reworded in PDF 2.0 (3rd bullet): "For other content streams, a PDF writer shall include a Resources entry in the stream's dictionary specifying the resource dictionary which contains all the resources used by that content stream. This shall apply to content streams that define form XObjects, patterns, and annotation appearances.". And, yes, that somewhat contradicts / amends Table 93.

I know there are ongoing discussions regarding this wording but, for now (such as Errata #128), I'd like to leave it to reflect the words as currently stated in ISO 32000-2:2020 and not yet corrected by any errata.

petervwyatt commented 2 years ago

For Collection I didn't quite do as you suggest because I always try to keep the wording as per the spec so if the predicates were read aloud they would be an obvious match (rather than an alternate boolean expression). But I think it is correct now...

petervwyatt commented 2 years ago

Everything addressed. Thanks!

faceless2 commented 2 years ago

Sorry to come back to some of these, but:

I'm starting to worry I've misunderstood the grammar again, but I can see in the current version of the code that fn:IsPresent will always return a boolean, never null. Same for fn:IsEncryptedWrapper.


It feels like you're reaching for something more like (@Matte && (@Width==parent::@Width)) for the first - if that's where you end up going, a note in INTERNALGRAMMAR.md like (eg) "when evalatued in a boolean context , non-boolean item are equivalent to true" (or whatever you decide the rule is) might be warranted.

petervwyatt commented 2 years ago

I'm after a more logic-predicate-like method for expressing things than a functional programmatic (@Matte && (@Width==parent::@Width)) statement. I will investigate a 2-parameter version of fn:IsPresent(condition1,condition2) so it would become fn:Eval(fn:IsPresent(Matte,(@Width==parent::@Width))) which reads aloud nicely as "Only when Matte is present, then the value of Width shall be equal to ..."

And please don't rely on the code for any significant meaning. It is very much just PoC and thinking out loud...

petervwyatt commented 2 years ago

Collection/View corrected for PDF 2.0 special case