Closed: seehuhn closed this 6 months ago
A few notes:
Annex H is very old and was not maintained for PDF 2.0 - it probably still uses some outdated and deprecated features.
XMP is only ever metadata, nothing more. And metadata is always optional for "general-purpose PDF" - but it is required for ISO subsets such as PDF/A, PDF/UA and PDF/X as per their specific standards. The trailer ID entry is "real" PDF data and used with encryption (see Table 15).
Understood. (But note that the XMP metadata stream was not shown in the examples in the PDF 1.7 spec. It seems to have been added for the 2.0 spec.)
I think the best solution may be to simply remove the following lines from all XMP examples:
<rdf:Description rdf:about="" xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/">
<xmpMM:DocumentID>… unique GUID of document …</xmpMM:DocumentID>
<xmpMM:InstanceID>… GUID changed for each save …</xmpMM:InstanceID>
</rdf:Description>
The PDF spec seems like an odd place to explain the xmpMM:DocumentID and xmpMM:InstanceID properties, and there seems to be little benefit in showing these entries in the examples at all.
I agree. ISO 32K-2 doesn't need to spell out anything to do with the internals of XMP for "general PDF" - that's the job of the XMP spec or of the PDF ISO subsets, where lots of specific things are required.
@petervwyatt What is the proposed change here?
In Annex H, remove all the gory XMP micro-details (since that is the job of the XMP spec) and just leave block comments describing what the XMP needs to represent - and do NOT explain things like which xmpMM properties are to be preserved or updated. Search for "xmpMM:" to find the 2 examples in Annex H.
PDF TWG agree
PDF/A TWG doesn't see any immediate need for any notes on how to align XMP-based IDs with trailer IDs. The use of this data is very different in various implementations.
The "minimal PDF file" in appendix H.2 uses the xmpMM:DocumentID and xmpMM:InstanceID properties in its XMP metadata stream, and explains that these properties are a "unique GUID of document" and a "GUID changed for each save", respectively. The purpose of these fields seems very similar to that of the two elements of the ID array in the file trailer dictionary, as introduced in Section 14.4 (File identifiers).
It would be nice if the PDF spec explained the relation between these two pairs of identifiers: Are writers meant to generate two sets of independent identifiers for each document, or can/should/shall the XMP identifiers be somehow derived from the PDF file identifiers?
Also, are the XMP identifiers required or optional? (If optional, maybe don't show them in the "minimal file" example?)
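To make the question concrete, here is a minimal sketch of a writer that treats the two pairs of identifiers as fully independent. It is only an illustration of one possible reading, not something the spec mandates. The function names are hypothetical, the MD5-over-seed-strings approach is an assumption loosely modelled on the kind of digest suggested for file identifiers, and the "uuid:" prefix on the XMP values is likewise an assumption.

```python
import hashlib
import time
import uuid
from typing import Tuple


def make_trailer_id_part(*seed_parts: str) -> bytes:
    """Hash the given seed strings with MD5 and return the 16-byte digest.

    Assumption: the seeds (time, file name, size, ...) are chosen by the
    caller; nothing here is prescribed by the PDF spec.
    """
    digest = hashlib.md5()
    for part in seed_parts:
        digest.update(part.encode("utf-8"))
    return digest.digest()


def format_trailer_id(permanent: bytes, changing: bytes) -> str:
    """Format two byte strings as a trailer /ID array of hex strings."""
    return f"/ID [<{permanent.hex().upper()}> <{changing.hex().upper()}>]"


def make_xmp_ids() -> Tuple[str, str]:
    """Return (DocumentID, InstanceID) as two unrelated random UUIDs.

    Assumption: the values are generated independently of the trailer ID.
    """
    return f"uuid:{uuid.uuid4()}", f"uuid:{uuid.uuid4()}"


if __name__ == "__main__":
    # Hypothetical seed values for a newly created file.
    first = make_trailer_id_part(str(time.time()), "example.pdf", "1234")
    # On first save both elements of the ID array are usually the same.
    print(format_trailer_id(first, first))
    document_id, instance_id = make_xmp_ids()
    print(document_id, instance_id)
```

With this approach nothing ties the XMP values to the trailer ID, so a reader cannot infer one pair from the other. Deriving the XMP identifiers from the trailer bytes would be the alternative the question asks about.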