pdf-association / pdf-issues

Industry-based resolutions for issues and errata reported against any PDF-related specification
https://pdf-issues.pdfa.org/

Tables 357 and 358, Pg wording clarifications #431

Closed: petervwyatt closed this issue 5 months ago

petervwyatt commented 6 months ago

Found while researching #343.

In Table 357 (marked-content reference dictionary) and Table 358 (Object reference dictionary), the Pg entry is described as "optional". In keeping with usage elsewhere in the spec, it should be "Sometimes required": the very last portion of the description says "it shall be used if the structure element has no such entry", which can only apply if the entry is always present under that condition.

myang-apryse commented 6 months ago

Posting and expanding here as well: we might want to simplify (and/or aggregate) the conditions under which the Pg entry is required into something like "the leafmost dictionary dominates, and the ancestry chain must have at least one Pg entry".

Currently (implicitly, I think), if you aggregate all the "sometimes required" statements for the Pg entry, the rule is technically stricter: the leafmost structure element node, i.e. the parent of the MCID, MCR, or OBJR, must have a Pg entry if the content references do not have one or cannot have one (a non-dictionary MCID). Put another way, if the immediate parent does not have it, then the MCR/OBJR must have it, even if an ancestor has it.

If the immediate parent requirement is intentional (e.g. for efficiency of checking), then perhaps it's worth explicitly ruling out the ancestors? The aggregate condition description might still be useful with changed wording.
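To make the immediate-parent reading concrete, here is a minimal Python sketch of how a strict checker might resolve the page for a structure element kid. It is only an illustration of the rule as discussed in this thread, not spec text: the function name `resolve_page_strict`, the `MissingPgError` exception, and the use of plain `dict` objects and page-label strings in place of real PDF objects are all assumptions made for the example.

```python
class MissingPgError(ValueError):
    """Hypothetical error: neither the kid nor its immediate parent has Pg."""


def resolve_page_strict(kid, parent):
    """Resolve the page for one kid of a structure element.

    kid    -- an int (bare MCID) or a dict (MCR or OBJR dictionary)
    parent -- dict for the immediate parent structure element

    Only the kid itself and its immediate parent are consulted;
    deeper ancestors (reachable through the parent's P entry) are
    deliberately ignored.
    """
    # A Pg entry on the MCR/OBJR dictionary dominates the parent's Pg.
    if isinstance(kid, dict) and "Pg" in kid:
        return kid["Pg"]
    # A bare integer MCID cannot carry Pg, so the parent must supply it.
    if "Pg" in parent:
        return parent["Pg"]
    raise MissingPgError("no Pg on content reference or immediate parent")


# Example: the grandparent has Pg, the parent does not.
grandparent = {"S": "Sect", "Pg": "page1"}
parent = {"S": "P", "P": grandparent}
try:
    resolve_page_strict(0, parent)
except MissingPgError:
    print("rejected: an ancestor's Pg is not consulted")
```

Under this reading, a file whose only Pg entry sits on a deeper ancestor is treated as malformed, which is the stricter behaviour described above.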

petervwyatt commented 6 months ago

@myang-apryse My interpretation of the current wording does not imply any "ancestry chain" but always just the immediate ancestor (parent/containing) structure element - which is a more restrictive file format requirement as you note. I think my proposed new wording also carries this explicitly too... or did you interpret the new words to include deeper ancestors?

My intention wasn't to change anything or explain further - just align language usage with elsewhere in the spec.

myang-apryse commented 6 months ago

> or did you interpret the new words to include deeper ancestors?

No, I believe you maintained the original semantics just fine. It's just that I think I've seen behavior/logic that traverses up the ancestors, and I was curious if it was explicitly forbidden.

I also believe it would be simpler if we included the ancestors. That particular opinion need not be followed, though explicitly ruling ancestors out might be prudent, since traversing them seems natural.

If I were to nitpick, I guess you can consider the immediate parent as having a Pg entry if it were inherited from an ancestor? I actually forget the exact rules of such inheritance, so I might be totally off here.

petervwyatt commented 6 months ago

It may well be that some implementations add additional functionality to support malformed files (such as those missing Pg)... and the spec doesn't prohibit that, but also won't define error handling behaviours.
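As an illustration of the kind of tolerant behaviour mentioned here, the following Python sketch walks up the P chain when Pg is missing from both the content reference and its immediate parent. This is a hypothetical recovery strategy for malformed files, not something the spec defines: the name `resolve_page_lenient` and the `dict`-based stand-ins for PDF objects are assumptions made for the example.

```python
def resolve_page_lenient(kid, parent):
    """Best-effort page lookup for a kid of a structure element.

    kid    -- an int (bare MCID) or a dict (MCR or OBJR dictionary)
    parent -- dict for the immediate parent structure element

    Returns the first Pg found on the kid, the parent, or any ancestor
    reached through P entries, or None if there is none.
    """
    # A Pg on the MCR/OBJR dictionary still takes priority.
    if isinstance(kid, dict) and "Pg" in kid:
        return kid["Pg"]
    node = parent
    seen = set()  # guard against cyclic P chains in broken files
    while node is not None and id(node) not in seen:
        seen.add(id(node))
        if "Pg" in node:
            return node["Pg"]
        node = node.get("P")  # the structure tree root has no P entry
    return None


root = {"Type": "StructTreeRoot"}
sect = {"S": "Sect", "P": root, "Pg": "page1"}
para = {"S": "P", "P": sect}  # malformed: no Pg on the immediate parent
print(resolve_page_lenient(0, para))  # falls back to the Sect's Pg
```

A reader doing this is recovering from a malformed file rather than following a defined rule, so a writer should not rely on it.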

petervwyatt commented 5 months ago

PDF TWG agree