fonttools / fontbakery

🧁 A font quality assurance tool for everyone
https://fontbakery.readthedocs.io
Apache License 2.0

CJK fonts require specific OS/2 Codepage metadata #2598

Open davelab6 opened 5 years ago

davelab6 commented 5 years ago

CJK fonts will not be loaded in MS Office, and perhaps elsewhere, if the OS/2 codepage metadata is not set as expected.

@chrissimpkins please provide the details :)

chrissimpkins commented 5 years ago

This information came from a conversation with Sergey Malkin at Microsoft during our testing of CJK localized regional variant / GSUB 'locl' feature support in MS Office products. Microsoft Word requires the relevant bits in OS/2.ulUnicodeRange(1-4) and OS/2.ulCodePageRange(1-2) to be set properly before it will use the font to display TC, SC, J, and K language/locale-specific glyphs at all. According to the Microsoft team, if they are not set correctly, Word can reject the entire font. You get a fallback system font that supports the glyphs in the document rather than the font selected in the Font dropdown menu. The font menus do display the type family properly; you just can't highlight text, select the family, and have the text rendered in that typeface.

This appears to affect MS Word on both macOS and Windows platforms. It may affect other MS Office applications. It may affect type intended to support non-CJK writing systems as well. We have not investigated these issues in any detail.
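A minimal sketch of checking which CJK code page bits a font declares, assuming the raw 32-bit `ulCodePageRange1` value has already been read from the OS/2 table (with fontTools this would be `TTFont(path)["OS/2"].ulCodePageRange1`). The helper name is hypothetical; the bit numbers follow the OpenType OS/2 table specification. Which bits Word actually requires for a given font has not been verified here.

```python
# CJK code page bits in OS/2.ulCodePageRange1, per the OpenType OS/2 spec.
CJK_CODEPAGE_BITS = {
    17: "JIS/Japan (cp932)",
    18: "Chinese: Simplified (cp936)",
    19: "Korean Wansung (cp949)",
    20: "Chinese: Traditional (cp950)",
    21: "Korean Johab (cp1361)",
}


def cjk_codepages_declared(ul_code_page_range1):
    """Return the names of the CJK code pages whose bit is set (hypothetical helper)."""
    return [
        name
        for bit, name in sorted(CJK_CODEPAGE_BITS.items())
        if ul_code_page_range1 & (1 << bit)
    ]


# Hypothetical value: Latin 1 (bit 0) and JIS/Japan (bit 17) are set.
example_value = (1 << 0) | (1 << 17)
print(cjk_codepages_declared(example_value))
```

An empty result for a font that ships CJK glyphs would be the situation described above.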

chrissimpkins commented 5 years ago

Associated check: com.google.fonts/check/code_pages

felipesanches commented 4 years ago

I am porting noto_lint and one of the checks deals with unicoderanges. The same could be done with codepages as well.

Do you think this makes sense? Or should we use some arbitrary threshold level on the check?

[Screenshot from 2019-12-03 01-56-38]

Note: I know that the Tomorrow family used in the example above is not a CJK family, but I think the check could be useful for all families, not only CJK.

chrissimpkins commented 4 years ago

This is a very useful check IMO, Felipe, both for bits that are unset and should be set and for those that are set and should not be. By arbitrary threshold, are you referring to the % range coverage needed to warrant setting a bit? This is my question as well.
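A rough sketch of reporting both directions, assuming the declared bits are read from the four `ulUnicodeRange` fields and the expected bits come from some separate analysis of the cmap. Both function names are hypothetical, and how the expected set is computed is exactly the open question.

```python
def bits_from_fields(range1, range2, range3, range4):
    """Unpack four 32-bit ulUnicodeRange fields into a set of bit numbers (0-127)."""
    bits = set()
    for index, field in enumerate((range1, range2, range3, range4)):
        for bit in range(32):
            if field & (1 << bit):
                bits.add(index * 32 + bit)
    return bits


def compare_bits(declared, expected):
    """Return (bits that should be set but are not, bits that are set but should not be)."""
    missing = sorted(expected - declared)
    unexpected = sorted(declared - expected)
    return missing, unexpected


# Hypothetical font: declares bits 0 and 1, but analysis expects bits 0 and 59.
declared = bits_from_fields(0b11, 0, 0, 0)
missing, unexpected = compare_bits(declared, {0, 59})
print("missing:", missing, "unexpected:", unexpected)
```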

felipesanches commented 4 years ago

yes

felipesanches commented 4 years ago

The OpenType spec says that the bits indicate functional ranges and that "the determination of 'functional' is left up to the font designer". So it seems the threshold is up to us... what would be a reasonable value?
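To make the threshold question concrete, here is a minimal sketch of a coverage-based rule. The function names and the 0.5 threshold are illustrative assumptions, and only three ranges are listed, each reduced to its main Unicode block (in the spec, bit 59 also covers further CJK blocks).

```python
# Bit number -> (first, last) code point of the main block for that bit.
RANGES = {
    0: (0x0000, 0x007F),   # Basic Latin
    49: (0x3040, 0x309F),  # Hiragana
    59: (0x4E00, 0x9FFF),  # CJK Unified Ideographs (main block only)
}


def coverage(codepoints, start, end):
    """Fraction of the block start..end (inclusive) present in the set of code points."""
    size = end - start + 1
    present = sum(1 for cp in codepoints if start <= cp <= end)
    return present / size


def functional_bits(codepoints, threshold):
    """Bits whose block coverage meets the (arbitrary) threshold."""
    return {
        bit
        for bit, (start, end) in RANGES.items()
        if coverage(codepoints, start, end) >= threshold
    }


# Hypothetical cmap: printable ASCII plus a single CJK ideograph.
cmap_codepoints = set(range(0x20, 0x7F)) | {0x4E00}
print(functional_bits(cmap_codepoints, threshold=0.5))
```

Note that the block size includes control characters and unassigned code points, so even a complete font stays below 100% for some ranges. That is one reason a single threshold is hard to pick.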

felipesanches commented 4 years ago

@marekjez86 may be interested in this issue as well

chrissimpkins commented 4 years ago

> what would be a reasonable value?

'functional' coverage may be a very different value across the character ranges. How about something like this?

Without a detailed review of the ranges and some specification to define functional coverage levels this may be as close as we get to balancing automation with designer determined functionality. However, it may lead to lots of new WARN messages that are not required after the first pass through the fonts and a review of the glyph set. It would be good to have a way to easily eliminate this test once the information is known / available to a team and indicate this in the rationale string.

chrissimpkins commented 4 years ago

Also, is there a readily available gftools script to set and clear these bits? If so, we should point to that in the rationale. If not, perhaps this would be a useful addition.

khaledhosny commented 4 years ago

My recommendation is to set the bit if the font has any character in the corresponding Unicode range/code page, otherwise many Microsoft/Windows apps will ignore any such characters.
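A minimal sketch of that "any character" rule, assuming a small hand-written subset of the spec's range table (the real table has many more entries, and several bits cover additional blocks not listed here). The function names are hypothetical. fontTools has a similar helper, `recalcUnicodeRanges`, on its OS/2 table class, which may be the better starting point for a real check.

```python
# Bit number -> list of (first, last) blocks; a partial, illustrative subset.
RANGES = {
    0: [(0x0000, 0x007F)],                     # Basic Latin
    49: [(0x3040, 0x309F)],                    # Hiragana
    50: [(0x30A0, 0x30FF), (0x31F0, 0x31FF)],  # Katakana + Phonetic Extensions
    56: [(0xAC00, 0xD7AF)],                    # Hangul Syllables
    59: [(0x4E00, 0x9FFF)],                    # CJK Unified Ideographs (main block only)
}


def bits_for(codepoints):
    """Bits to set when at least one code point falls in any block for that bit."""
    return {
        bit
        for bit, blocks in RANGES.items()
        if any(start <= cp <= end for cp in codepoints for start, end in blocks)
    }


def pack_fields(bits):
    """Pack bit numbers 0-127 into the four 32-bit ulUnicodeRange1-4 values."""
    fields = [0, 0, 0, 0]
    for bit in bits:
        fields[bit // 32] |= 1 << (bit % 32)
    return tuple(fields)


# Hypothetical cmap: 'A', HIRAGANA LETTER A, and one CJK ideograph.
bits = bits_for({0x41, 0x3042, 0x4E00})
print(sorted(bits), [hex(field) for field in pack_fields(bits)])
```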

felipesanches commented 4 years ago

@khaledhosny are we able to name at least a few examples of such programs? I think that your observation could lead us to consider making this a FAIL-level check. But I am afraid we may be creating a new instance of "dogma" if we do not explicitly state at least some examples of programs in which the issue is known (and verified) to be real.

For instance, is this issue restricted to CJK fonts or maybe only a few other specific ranges? Or would this bug on windows apps also affect absolutely any other glyph range when bits are not properly set? I think it is important for us to be very precise at this stage, in order to document what are the real issues at hand.

felipesanches commented 4 years ago

To illustrate the state of things regarding this, the image below is a screenshot of a partial run of this new unicodeRange bits check on the Google Fonts collection:

[Screenshot from 2019-12-03 10-32-18]

Unless there's some bug in my current implementation, this likely means most fonts have poorly set values for os/2.unicodeRange. I really hope that there's still some bug in my implementation. Or that we may have a more reasonable set of criteria, because if that's not a check-code bug, then it seems to be a very strict requirement.

chrissimpkins commented 4 years ago

> likely means most fonts have poorly set values for os/2.unicodeRange

Likely to be expected given how vague the definition is... :)

> ...bug on windows apps...

I wouldn't consider this an MS application or Windows platform-specific bug that warrants a workaround. As I understand the issue, this is intentionally defined behavior in some MS applications, and fonts will need to be fixed to support use in those applications for (at least some) code point ranges.

MS Word on Windows 10 is an example application. You can attempt to replicate this issue in MS Word by:

My experience with this using a set of CJK fonts in development with a subset of a "full" CJK glyph set and compiled with ufo2ft without these bits properly set was that the typeface family name shows up in the MS Word menu; however, when you select it in an attempt to redefine the font used for text in the document, you get a fallback system font instead. It simply snaps to that typeface definition without warnings/error messages. It is an issue that you cannot override through the application menus. See https://github.com/googlefonts/fontbakery/issues/2598#issuecomment-515553784 for the full description.

As Khaled suggested, perhaps bits should simply always be set when any glyph is defined in a range. This would mean that there could be situations where users select a family that does not fully cover a range and get either .notdef glyphs or a mix of the desired typeface and a fallback typeface. I haven't looked into this behavior. This pushes the definition of "functional" to the end user and makes it possible to apply the font at all to strings that use only the subset of a range that the typeface defines. I suppose there is not much downside to that, and maybe functional should be defined by the end user based on their needs anyway?

felipesanches commented 4 years ago

would you see any potential damage caused by a bulk update of the entire GFonts collection following the strict rule that a bit must be set if even a single codepoint in a given range is provided?

chrissimpkins commented 4 years ago

> any potential damage caused by a bulk update of the entire GFonts collection

Can we implement this check in the universal profile before you have a definitive answer to this question?

felipesanches commented 4 years ago

It is currently in the notofonts profile (not merged yet). It will probably later be migrated into the universal profile, I think.

chrissimpkins commented 4 years ago

OK we will implement it locally in a custom profile then. Thanks.

drj11 commented 6 months ago

Having just tried it, it would be good to put com.google.fonts/check/unicode_range_bits in the universal profile. Please :)