com_google_fonts_check_074: Why must name ID 0 (copyright) be ASCII only?

m4rc1e commented 6 years ago

In the Microsoft name table spec (https://docs.microsoft.com/en-gb/typography/opentype/spec/name#enc1), I see no mention of nameID 0 needing to be ASCII.

Imo, there's nothing wrong with having genuine copyright, trademark and registered symbols. As long as the characters are within the Mac Roman encoding (https://en.wikipedia.org/wiki/Mac_OS_Roman) for Mac entries and within the BMP for Win entries, I don't think we're violating anything.

The reason I'm bringing this up is because Plex has the following copyright.

© 2017 IBM Corp. All rights reserved.

Am I missing something?
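
The encoding claim above can be checked with a short Python sketch. The helper names `fits_mac_roman` and `fits_bmp` are made up for illustration and are not part of fontbakery; the sketch only assumes Python's built-in `mac_roman` codec.

```python
def fits_mac_roman(text: str) -> bool:
    """True if every character can be stored in a Mac Roman name record."""
    try:
        text.encode("mac_roman")
        return True
    except UnicodeEncodeError:
        return False


def fits_bmp(text: str) -> bool:
    """True if every character lies in the Basic Multilingual Plane (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


copyright_notice = "© 2017 IBM Corp. All rights reserved."

# Mac Roman and BMP both accept the string, although it is not pure ASCII.
print(fits_mac_roman(copyright_notice), fits_bmp(copyright_notice), copyright_notice.isascii())
# prints: True True False
```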

m4rc1e commented 6 years ago

This issue also applies to com_google_fonts_check_019. These symbols shouldn't cause any issues.

anthrotype commented 6 years ago

FYI there have been discussions on the OpenType mailing list about Macintosh names no longer being needed. Adobe apparently has been shipping fonts without them for some time. http://www.indx.co.uk/biglistarchive/?mode=showpost&id=vbTjEt29gdc9ypGikFbivSkAP

davelab6 commented 6 years ago

The issue is that in the CFF table there is a copyright notice string that must be ASCII, and I want the same string everywhere.

felipesanches commented 6 years ago

Is there a practical reason why you want it that way, @davelab6?

m4rc1e commented 6 years ago

Thanks @anthrotype

@davelab6 just checked the .otfs of IBM Plex and they're the same as the ttfs.

© 2017 IBM Corp. All rights reserved.

After reading the spec, I believe that TTFs and CFFs have the same name table and many other tables for that matter. The only real diffs between the formats are the 'CFF ' table in CFF fonts and 'glyf' in TTF fonts.

I agree that if we release CFF fonts, their name tables should match those of their TTF siblings.

felipesanches commented 6 years ago

Should we keep the ASCII-only requirement then? If so, would you like to add some explanation on the rationale metadata field before closing this issue?

davelab6 commented 6 years ago

Are the cff with the non ascii character valid?

m4rc1e commented 6 years ago

Yep, they're fine.

davelab6 commented 6 years ago

Oh. Then this issue should be to

[ ] remove the copyright nameID from the com_google_fonts_check_074 check

anthrotype commented 6 years ago

> Are the cff with the non ascii character valid?

I couldn't find anything in the CFF specification, nor in the Type 1, nor in the PostScript one, about what is the encoding of these Copyright and Notice strings in the CFF TopDict.

In fontTools.cffLib they are read/written as "latin1" (which is a super-set of ascii, and does include "©" COPYRIGHT symbol).

https://github.com/fonttools/fonttools/blob/d46d40dc9e1811d47dab07920ced6aeac9068586/Lib/fontTools/cffLib/__init__.py#L1787-L1788

However in ufo2ft (originally ufo2fdk) we drop all non-ascii characters plus some special characters which are invalid in the context of a postscript string like [](){}<>/%

https://github.com/googlei18n/ufo2ft/blob/0cf1aa3d8bff4ef062fb4c2261106957bffa77dd/Lib/ufo2ft/fontInfoData.py#L196-L215
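
The two behaviours described above can be contrasted in a small sketch. This is not the fontTools or ufo2ft code; `as_latin1_bytes` and `strip_for_postscript` are hypothetical helpers that only mimic the behaviour as described in this comment (Latin-1 round-tripping versus dropping non-ASCII and PostScript-special characters).

```python
# Characters named above as invalid inside a PostScript string.
POSTSCRIPT_SPECIALS = set("[](){}<>/%")


def as_latin1_bytes(text: str) -> bytes:
    """Encode as Latin-1, the encoding fontTools.cffLib is described as using."""
    return text.encode("latin1")


def strip_for_postscript(text: str) -> str:
    """Drop non-ASCII and PostScript-special characters (rough approximation)."""
    return "".join(
        ch for ch in text if ch.isascii() and ch not in POSTSCRIPT_SPECIALS
    )


notice = "© 2017 IBM Corp. All rights reserved."

# Latin-1 keeps the copyright sign as the single byte 0xA9.
print(as_latin1_bytes(notice)[:1])

# The stripping approach silently loses it, leaving a leading space.
print(repr(strip_for_postscript(notice)))
```

With the second approach the CFF string and name ID 0 can no longer be identical once the name table contains "©", which is the mismatch this issue is about.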

Maybe @readroberts could tell us more?

m4rc1e commented 6 years ago

@anthrotype Thank you for taking the time to explain this.

I'm not sure we'll ever release type 1 fonts. However, it may be best to stick with what we've got.

I see we've already had a heated discussion on this test, https://github.com/googlefonts/fontbakery/issues/1663

anthrotype commented 6 years ago

> I'm not sure we'll ever release type 1 fonts

Of course not. I mentioned Type 1 because the CFF spec refers to it for the meaning of those TopDict operators. The text encoding is still unclear, but ASCII is indeed the safest bet for CFF strings. The main concern is not whether those strings display as garbage: they are dead weight that no human will ever see, unless someone opens the font in a hex editor, decompiles it with ttx, or perhaps embeds the CFF font in a PDF (not sure). The name table strings are the ones the user will be shown. What you're interested in is that the rest of the CFF table isn't rejected because of those useless pieces of string.

felipesanches commented 6 years ago

> What you're interested in is that the rest of the CFF table isn't rejected because of those useless pieces of string.

I guess this can probably be the most important insight we've got in this conversation! Thanks for bringing this up, @anthrotype !

If that's really the major reason for us avoiding non-ASCII strings in some of the name table entries (such as copyright), then it may be good to mention that in the check rationale metadata field. Can anyone confirm that this is indeed the case?

Also, it would be good to have a precise but succinct way of expressing that on the rationale text for check/074 - "Are there non-ASCII characters in ASCII-only NAME table entries?".

The rationale metadata field currently reads like this:

The OpenType spec requires ASCII for the POSTSCRIPT_NAME (nameID 6).
For COPYRIGHT_NOTICE (nameID 0) ASCII is required because that
string should be the same in CFF fonts which also have this
requirement in the OpenType spec.

Note:
A common place where we find non-ASCII strings is on name table
entries with NameID > 18, which are expressly for localising
the ASCII-only IDs into Hindi / Arabic / etc.
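
As a rough illustration of what the check described by this rationale amounts to, here is a minimal sketch. It is not the fontbakery implementation; `non_ascii_entries` and `ASCII_ONLY_NAME_IDS` are hypothetical names, and the records are plain `(nameID, string)` pairs so the example runs without a font file.

```python
# nameIDs the current rationale treats as ASCII-only:
# 0 = COPYRIGHT_NOTICE, 6 = POSTSCRIPT_NAME.
ASCII_ONLY_NAME_IDS = {0, 6}


def non_ascii_entries(records, ascii_only_ids=ASCII_ONLY_NAME_IDS):
    """Return the (nameID, text) pairs that break the ASCII-only rule.

    With fontTools installed, records could be built as
    [(r.nameID, r.toUnicode()) for r in TTFont(path)["name"].names].
    """
    return [
        (name_id, text)
        for name_id, text in records
        if name_id in ascii_only_ids and not text.isascii()
    ]


records = [
    (0, "© 2017 IBM Corp. All rights reserved."),
    (6, "IBMPlexSans-Regular"),
    (19, "नमूना"),  # localised sample text, not covered by the rule
]

# Current behaviour: the copyright entry is flagged.
print(non_ascii_entries(records))

# Relaxed variant proposed in this thread: only nameID 6 must be ASCII.
print(non_ascii_entries(records, ascii_only_ids={6}))
```

Passing the set of IDs as a parameter makes the proposed relaxation (dropping nameID 0 from the check) a one-argument change.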

anthrotype commented 6 years ago

Or maybe you could relax the requirement that the nameID 0 be exactly equal to CFF Copyright string..

readroberts commented 6 years ago

When CFF data is present as a CFF table in an OpenType font, a lot of the CFF data is not relevant and should be ignored in favor of the equivalent data in other tables; see the differences between the CFF2 and CFF table definitions. The CFF Copyright should be ignored, and it can certainly differ from name ID 0: Adobe uses '©' in name ID 0, but not in the CFF Copyright string. The encoding of a copyright string in CFF is indeed poorly specified. The PostScript Language Reference Manual says that the 'Notice' field is a 'string' type, and then in section 3.3.7 says the encoding is unspecified and is whatever you want. The only limitation stated is that, to enhance portability of strings that are present literally in the data, the charset can be limited to ASCII, but I haven't seen a tool that will reject a font for any encoding in the 'string' fields. I now recommend omitting the Notice or Copyright field from the CFF table.

anthrotype commented 6 years ago

Thank you Read!

tphinney commented 6 months ago

Regardless of what Read most recently recommended for new fonts, AFAIK Adobe has continuously released OpenType CFF fonts from at least 2000 to very recently, with the copyright symbol. Back in the day it was a requirement for every Adobe font.

Minion Pro from early OpenType releases

    <namerecord nameID="0" platformID="1" platEncID="0" langID="0x0" unicode="True">
      © 2000, 2002 Adobe Systems Incorporated. All Rights Reserved. U.S. Patent Des. 337,604. Other patents pending.
    </namerecord>

Minion Variable Concept released around the time of Read’s comment

    <namerecord nameID="293" platformID="1" platEncID="0" langID="0x0" unicode="True">
      MinionConceptRoman-DisplayBold
    </namerecord>
    <namerecord nameID="0" platformID="3" platEncID="1" langID="0x409">
      © 1990-2018 Adobe. All rights reserved.
    </namerecord>
