pdf-association / pdf-issues

Industry-based resolutions for issues and errata reported against any PDF-related specification
https://pdf-issues.pdfa.org/

How should a processor deal with a file referencing a deprecated character collection? #231

Open MPBailey opened 1 year ago

MPBailey commented 1 year ago

ISO 32000-2:2020 3.15 says:

deprecated
a part of ISO 32000 that should not be written into a PDF 2.0 document, and should be ignored by a PDF processor (3.49)
Note 1 to entry: In some cases variations on these restrictions on continued use of a deprecated feature are explicitly stated in this document.

ISO 15930-9:2020 (PDF/X-6) says:

Any feature in ISO 32000-2 that is marked as deprecated should not appear in a PDF/X-6 file, but a conforming processor shall process the file as if the features were not present.

I was trawling 32000-2 to find any specific deprecated features that might cause problems as a result of that shall requirement.

32000-2 says that the Adobe-Korea1-2 character collection for CID-keyed fonts is deprecated (1st para below Table 117 and elsewhere). Note that “Adobe-Korea1” is used as a synonym for “Adobe-Korea1-2”, as described in 4.2.

So that means that a PDF/X-6 processor reading a file that references Adobe-Korea1-2 “shall process the file as if that [reference] were not present”; I don’t understand how developers can build on that statement to generate correct and consistent output. Even a baseline 32000-2 processor “should” ignore the reference.

All of this means that we need:

  1. A clear statement of required or recommended behaviour when processing a 32000-2 file that, against recommendations, references Adobe-Korea1-2. I believe that it should be done as normative text in 32000-2 9.10.2. Leonard tells me that it’s not appropriate to simply use Adobe-KR-9 instead, but that a processor should use some form of mapping table from Korea1-2 to KR-9. I’m not enough of a font expert to formulate a reasonable statement.

  2. A statement for consideration in the event that a dated revision of 15930-9 is ever developed, possibly as a new section 6.8.6 to state that the Adobe-Korea1-2 character map shall not be used in a PDF/X-6 file.

I suspect that all of the same arguments and responses would also be appropriate for the deprecated Adobe-Japan2-0 character collection.
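For illustration only, detecting the case under discussion could be sketched as below. This assumes a processor has already read the Registry, Ordering and Supplement entries from a font's CIDSystemInfo dictionary; the class and function names are hypothetical, and matching on Registry and Ordering alone (so that every supplement of Adobe-Korea1 and Adobe-Japan2 is flagged) is an assumption rather than anything the standards state.

```python
from dataclasses import dataclass
from typing import Optional

# Deprecated character collections named in this issue, keyed by
# (Registry, Ordering). Assumption: the Supplement number is ignored,
# so Adobe-Korea1-0, -1 and -2 are all treated alike.
DEPRECATED_COLLECTIONS = {
    ("Adobe", "Korea1"),
    ("Adobe", "Japan2"),
}


@dataclass(frozen=True)
class CIDSystemInfo:
    """Hypothetical holder for the three CIDSystemInfo entries."""
    registry: str
    ordering: str
    supplement: int

    @property
    def name(self) -> str:
        # Conventional Registry-Ordering-Supplement spelling.
        return f"{self.registry}-{self.ordering}-{self.supplement}"


def is_deprecated_collection(info: CIDSystemInfo) -> bool:
    """Return True if the collection is one of those flagged in this issue."""
    return (info.registry, info.ordering) in DEPRECATED_COLLECTIONS


def check_font(info: CIDSystemInfo) -> Optional[str]:
    """Return a warning string for a deprecated collection, else None."""
    if is_deprecated_collection(info):
        return f"{info.name} is a deprecated character collection in ISO 32000-2"
    return None
```

This only identifies the reference. What a processor should do after detecting it (ignore it, map to Adobe-KR-9, or reject the file) is exactly the open question in this issue.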

petervwyatt commented 8 months ago

See also #77 and #342 - linking the 3 font-related errata with potential impact for PDF/X-6 and PDF/A-4

bdoubrov commented 8 months ago

Also adding the PDF/A-4 label. Another place in ISO 19005-4 affected by this issue is clause 6.2.10.7 "Unicode character maps":

  • Type 0 fonts whose descendant CIDFont uses the Adobe-GB1, Adobe-CNS1, Adobe-Japan1 or Adobe-KR-9 character collections.

It no longer mentions Adobe-Korea1, which implies either that fonts using Adobe-Korea1 shall not be used in PDF/A-4 (PDF/X-6) documents, or that all such fonts shall have a /ToUnicode mapping.

Personally, for dated revisions of PDF/A-4 and PDF/X-6, I'm in favor of not permitting documents conforming to these standards to use Adobe-Korea1.

petervwyatt commented 8 months ago

PDF TWG agree that deprecated features are not to be used in PDF/A-4 and PDF/X-6 - this can be done editorially with a note since all deprecated features are not to be included.

DietrichSeggern commented 8 months ago

PDF/A-4 only says that deprecated features "shall not be used to render". In PDF/X-6 that is only a should. That is because it is difficult to validate.

This would have to be taken into account for the note.

bdoubrov commented 7 months ago

To minimize the impact of the changes, PDF/A TWG suggests making an exception for CMaps deprecated in ISO 32000-2.

petervwyatt commented 5 months ago

Same as for errata #77 - parking until PDF/A TWG complete their article/whitepaper. Assigned to Boris as chair of PDF/A TWG.