pdf-association / arlington-pdf-model

A vendor- and implementation-independent specification-derived, machine-readable model of PDF.
Apache License 2.0

Undocumented use of @<key name> for an array of objects #57

Open bdoubrov opened 1 year ago

bdoubrov commented 1 year ago

The A key for Target object uses the expression fn:Eval(@A==fn:PageProperty(@P,Annots::@NM)) for possible values. It looks like Annots::@NM tries to reference a key in the array of objects, which is not defined by the internal grammar.

petervwyatt commented 1 year ago

Yes it is - see https://github.com/pdf-association/arlington-pdf-model/blob/master/INTERNAL_GRAMMAR.md#validation-of-predicates-declarative-functions, 7th bullet point.

I may not have been consistent with implementation or usage, but the intention has always been there and documented.

MaximPlusov commented 1 year ago

In this case, the problem is that a part of the path is missing between Annots and @NM.

petervwyatt commented 1 year ago

Ahh! OK - I misunderstood.

The issue is for A in Target.tsv (Table 205) trying to encode the last sentence specific to string-text: "If the value is a text string, it specifies the value of NM in the annotation dictionary (see "Table 166 — Entries common to all annotation dictionaries")."

The core issue is that Annots is an array on page objects, but the Target dictionary could be "deeper" in a page's DOM tree (which is why fn:PageProperty exists). Some element of the page's Annots array, at an unknown index, needs to be an annotation dictionary whose NM entry has a value equal to @A of the Target dictionary. Semantically this might be more accurate: Annots[*]::@NM, but I do NOT want to use [*] as it makes parsing much harder. So I'm left with introducing a new predicate.
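To make the missing path step concrete, here is a minimal Python sketch. It is an illustration only, not Arlington's implementation: PDF dictionaries are modelled as plain `dict`s, arrays as `list`s, and `resolve` is a hypothetical helper that walks a `Key::@Leaf` style path one dictionary at a time.

```python
# Hypothetical, simplified model: PDF dictionaries as dicts, arrays as lists.
page = {
    "Annots": [
        {"Subtype": "Text", "NM": "note-1"},
        {"Subtype": "FileAttachment", "NM": "attach-7"},
    ]
}


def resolve(obj, path):
    """Walk a 'Key::@Leaf' style path; every step must be looked up in a dictionary."""
    for step in path.split("::"):
        key = step.lstrip("@")
        if not isinstance(obj, dict):
            # This is the gap discussed in the thread: the previous step
            # produced an array, and nothing says which element to use.
            raise TypeError(f"cannot look up {key!r} in a {type(obj).__name__}")
        obj = obj[key]
    return obj


try:
    resolve(page, "Annots::@NM")
except TypeError as exc:
    print(exc)
```

Under these assumptions the lookup stops at the array: `Annots` resolves, but `@NM` has no dictionary to be looked up in until some element of the array is selected.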

For the integer case of A, the current predicate reads OK:

which also implies fn:PageProperty(@P,Annots) returns an array. Predicates are also limited to only 2 arguments so if the new predicate is fn:FindNMValueInArray(array,value) then the "SpecialCase" for string-text becomes:

which reads aloud quite similar to the spec wording...
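As a rough sketch of the semantics such a two-argument predicate might have, assuming the array has already been obtained (for example from fn:PageProperty(@P,Annots)) and that PDF dictionaries are modelled as Python `dict`s. The function name `find_nm_value_in_array` and the sample data are hypothetical and only mirror the proposal above; this is not code from the Arlington tooling.

```python
def find_nm_value_in_array(array, value):
    """Return True if any dictionary element of `array` has an NM entry equal to `value`.

    Hypothetical stand-in for the proposed fn:FindNMValueInArray(array,value).
    Non-dictionary elements are skipped rather than treated as errors.
    """
    return any(
        isinstance(element, dict) and element.get("NM") == value
        for element in array
    )


# Assumed sample data: a page's Annots array and a Target dictionary
# whose A entry is a text string naming an annotation.
annots = [
    {"Subtype": "Text", "NM": "note-1"},
    {"Subtype": "FileAttachment", "NM": "attach-7"},
]
target = {"A": "attach-7"}

print(find_nm_value_in_array(annots, target["A"]))
```

The search is a linear scan because the matching annotation can sit at any index of the array, which is the "unknown element" problem described earlier in the thread.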