pdf-association / pdf-issues

Industry-based resolutions for issues and errata reported against any PDF-related specification
https://pdf-issues.pdfa.org/

How should a processor deal with a file referencing a deprecated character collection? #231

Open MPBailey opened 2 years ago

MPBailey commented 2 years ago

ISO 32000-2:2020 3.15 says:

deprecated: a part of ISO 32000 that should not be written into a PDF 2.0 document, and should be ignored by a PDF processor (3.49)

Note 1 to entry: In some cases variations on these restrictions on continued use of a deprecated feature are explicitly stated in this document.

ISO 15930-9:2020 (PDF/X-6) says:

Any feature in ISO 32000-2 that is marked as deprecated should not appear in a PDF/X-6 file, but a conforming processor shall process the file as if the features were not present.

I was trawling 32000-2 to find any specific deprecated features that might cause problems as a result of that “shall” requirement.

32000-2 says that the Adobe-Korea1-2 character collection for CID-keyed fonts is deprecated (1st para below Table 117 and elsewhere). Note that “Adobe-Korea1” is used as a synonym for “Adobe-Korea1-2”, as described in 4.2.

So that means that a PDF/X-6 processor reading a file that references Adobe-Korea1-2 “shall process the file as if that [reference] were not present”; I don’t understand how developers can build on that statement to generate correct and consistent output. Even a baseline 32000-2 processor “should” ignore the reference.

All of this means that we need:

  1. A clear statement of required or recommended behaviour when processing a 32000-2 file that, against recommendations, references Adobe-Korea1-2. I believe that it should be done as normative text in 32000-2 9.10.2. Leonard tells me that it’s not appropriate to simply use Adobe-KR-9 instead, but that a processor should use some form of mapping table from Korea1-2 to KR-9. I’m not enough of a font expert to formulate a reasonable statement.

  2. A statement for consideration in the event that a dated revision of 15930-9 is ever developed, possibly as a new section 6.8.6 to state that the Adobe-Korea1-2 character map shall not be used in a PDF/X-6 file.

I suspect that all of the same arguments and responses would also be appropriate for the deprecated Adobe-Japan2-0 character collection.
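For illustration only, here is a minimal sketch of how a processor might flag the two collections named above before deciding what to do with them. It assumes the font's CIDSystemInfo dictionary has already been parsed into a plain Python dict with `Registry`, `Ordering` and `Supplement` keys; the function name and the dict representation are hypothetical and not taken from any PDF library. It ignores the supplement number, on the assumption that “Adobe-Korea1” and “Adobe-Korea1-2” are to be treated alike. It deliberately does not say what the processor should do next, since that is the open question here.

```python
# Orderings of the character collections this issue describes as deprecated
# in ISO 32000-2 (Adobe-Korea1-2 and Adobe-Japan2-0).
DEPRECATED_ORDERINGS = {"Korea1", "Japan2"}


def is_deprecated_collection(cid_system_info: dict) -> bool:
    """Return True if a CIDSystemInfo-like dict names a deprecated collection.

    `cid_system_info` is a hypothetical, already-parsed stand-in for a PDF
    CIDSystemInfo dictionary. The Supplement value is ignored (assumption).
    """
    return (
        cid_system_info.get("Registry") == "Adobe"
        and cid_system_info.get("Ordering") in DEPRECATED_ORDERINGS
    )


if __name__ == "__main__":
    korea1 = {"Registry": "Adobe", "Ordering": "Korea1", "Supplement": 2}
    kr9 = {"Registry": "Adobe", "Ordering": "KR", "Supplement": 9}
    print(is_deprecated_collection(korea1))
    print(is_deprecated_collection(kr9))
```

A mapping from Korea1-2 to KR-9, as Leonard suggests, would hook in wherever this check returns true.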

petervwyatt commented 1 year ago

See also #77 and #342 - linking the 3 font-related errata with potential impact for PDF/X-6 and PDF/A-4

bdoubrov commented 1 year ago

Also adding the PDF/A-4 label. Another place in ISO 19005-4 affected by this issue is 6.2.10.7 "Unicode character maps":

  • Type 0 fonts whose descendant CIDFont uses the Adobe-GB1, Adobe-CNS1, Adobe-Japan1 or Adobe-KR-9 character collections.

It no longer mentions Adobe-Korea1, which implies either that Adobe-Korea1 fonts shall not be used in PDF/A-4 (PDF/X-6) documents, or that all such fonts shall have a /ToUnicode mapping.

Personally, for dated revisions of PDF/A-4 and PDF/X-6, I'm in favor of not permitting documents conforming to these standards to use Adobe-Korea1.
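As a rough sketch of the second reading, a validator could flag Type 0 fonts that fall outside the listed collections and carry no /ToUnicode entry. This assumes the bullet above exempts the listed collections from needing /ToUnicode, and it ignores any other exemptions 6.2.10.7 may contain. The `font` dict is a hypothetical flattened view of the Type 0 font together with its descendant CIDFont's CIDSystemInfo; the names are illustrative, and the supplement number is not checked.

```python
# Orderings of the collections named in the bullet from ISO 19005-4, 6.2.10.7.
LISTED_ORDERINGS = {"GB1", "CNS1", "Japan1", "KR"}


def missing_to_unicode(font: dict) -> bool:
    """Return True if the font is outside the listed collections and has no ToUnicode.

    `font` is a hypothetical dict with an optional "ToUnicode" key and a
    "CIDSystemInfo" dict holding Registry and Ordering (assumed layout).
    """
    info = font.get("CIDSystemInfo", {})
    listed = (
        info.get("Registry") == "Adobe"
        and info.get("Ordering") in LISTED_ORDERINGS
    )
    return not listed and "ToUnicode" not in font
```

Under this sketch, an Adobe-Korea1 font without /ToUnicode is reported, while the same font with a /ToUnicode entry, or an Adobe-KR-9 font, is not.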

petervwyatt commented 1 year ago

PDF TWG agree that deprecated features are not to be used in PDF/A-4 and PDF/X-6 - this can be done editorially with a note since all deprecated features are not to be included.

DietrichSeggern commented 1 year ago

PDF/A-4 only says that deprecated features "shall not be used to render". In PDF/X-6 that is only a should. That is because it is difficult to validate.

This would have to be taken into account for the note.

bdoubrov commented 11 months ago

To minimize the impact of the changes, PDF/A TWG suggests making an exception for CMaps deprecated in ISO 32000-2.

petervwyatt commented 9 months ago

Same as for errata #77 - parking until PDF/A TWG complete their article/whitepaper. Assigned to Boris as chair of PDF/A TWG.