STAT derived instances in legacy RBIZ applications

davelab6 commented 4 years ago

@reli-msft posted on the mpeg-otspec list:

I've seen many reports that STAT-derived instance names cause problems in legacy applications, especially applications using the RBIZ (Regular-Bold-Italic-BoldItalic) family model. The most common problem is name truncation: these applications use legacy APIs that impose a rather strict length limit on derived family names, and that limit is hard to change because of the APIs' strict binary compatibility requirements. Such APIs include GDI, and perhaps protocols used in printing.

To cope with such issues, I've come up with ideas including these options:

The Spec specifies that font API implementations should automatically hide the derived instance, or generate a compatible but artificial name for it, if its name does not meet the API's limitations.
Add an extra flag HIDE_IN_LEGACY_RBIZ_APPLICATOINS = 4 to STAT's axis value tables, denoting that this axis value should be hidden in legacy applications; instance derivation otherwise works as it does today.

So are there any thoughts about this problem and my proposed solutions?

davelab6 commented 4 years ago

I agree this is needed.

A related problem is that when generating a set of static fonts from all STAT table named instances, the set can easily reach into the 1,000s.

I think it may be best to have a "star instances" list of the top X instances, and only instantiate those, not the full matrix of all n2n axes locations.

But even with this, a specified algorithm defining how such instances should be named would be very useful to prevent the user confusion I foresee if different vendors do instantiation differently.
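To give a rough sense of the scale involved, here is a minimal Python sketch that counts the full matrix of STAT label combinations and contrasts it with a short curated list. The axis labels and the `STAR_INSTANCES` list are invented for illustration; they are not taken from any real font or from the spec.

```python
from itertools import product

# Hypothetical STAT axis value labels for a four-axis family (illustrative only).
AXIS_LABELS = {
    "opsz": ["Caption", "Text", "Subhead", "Display"],
    "wdth": ["Condensed", "SemiCondensed", "Normal", "SemiExtended", "Extended"],
    "wght": ["Thin", "ExtraLight", "Light", "Regular", "Medium",
             "SemiBold", "Bold", "ExtraBold", "Black"],
    "ital": ["Roman", "Italic"],
}


def full_matrix(axis_labels):
    """Return every combination of one label per axis, as tuples."""
    return list(product(*axis_labels.values()))


# A hypothetical curated "star instances" list: only these would be instantiated.
STAR_INSTANCES = [
    ("Text", "Normal", "Regular", "Roman"),
    ("Text", "Normal", "Regular", "Italic"),
    ("Text", "Normal", "Bold", "Roman"),
    ("Text", "Normal", "Bold", "Italic"),
]

all_instances = full_matrix(AXIS_LABELS)
print(len(all_instances), "combinations in the full matrix")
print(len(STAR_INSTANCES), "star instances")
```

With these invented labels the full matrix already has 360 entries (4 × 5 × 9 × 2), and one more axis with three values would take it past 1,000.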

reli-msft commented 4 years ago

@davelab6 I think the problem of a large derived matrix (or should I say tensor, since it may have more dimensions?) is more of a UI design issue: the UI should reflect some of the deriving logic by organizing the subfamilies into grids or trees. Different platforms have different limitations, so the algorithms may not be clearly definable in the Spec. However, the Spec could provide guidance about the deriving behavior.

Lorp commented 4 years ago

When you say "automatically hide the derived instance" do you mean "automatically hide the derived instance name"?

I agree font APIs should be able to generate compatible names when a derived name is not compatible, but can you explain why the font should have to explicitly request the API does this?

reli-msft commented 4 years ago

@Lorp They may not explicitly request it, but consider a subfamily ExtraExtended ExtraLight Oblique. When it is derived into GDI, this subfamily name acts as a suffix added to the family name to produce the RBIZ-compatible family name, and the length of the subfamily suffix by itself already breaks LOGFONTW's lfFaceName length limit.

Therefore, the font API (GDI here) should hide this entire instance to avoid name trimming or conflict.
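The arithmetic behind this can be checked with a small Python sketch. `LOGFONTW.lfFaceName` is an array of `LF_FACESIZE` (32) WCHARs including the terminating null, so at most 31 UTF-16 code units are usable. The family name "Example Sans" is made up, and the plain concatenation in `rbiz_family_name` is only the naive derivation described above, not necessarily what GDI actually does.

```python
LF_FACESIZE = 32  # WCHARs in LOGFONTW.lfFaceName, including the terminating null


def utf16_units(text):
    """Number of UTF-16 code units (WCHARs) needed to store text."""
    return len(text.encode("utf-16-le")) // 2


def fits_face_name(name):
    """True if name plus its terminating null fits in lfFaceName."""
    return utf16_units(name) + 1 <= LF_FACESIZE


def rbiz_family_name(family, subfamily):
    """Naive derivation: the whole subfamily becomes a suffix of the family name."""
    return f"{family} {subfamily}"


subfamily = "ExtraExtended ExtraLight Oblique"
derived = rbiz_family_name("Example Sans", subfamily)  # hypothetical family name

print(utf16_units(subfamily), fits_face_name(subfamily))
print(utf16_units(derived), fits_face_name(derived))
```

The suffix alone is 32 code units, one more than the 31 that fit alongside the null terminator, so any family name prepended to it can only make things worse.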

behdad commented 4 years ago

I disagree and object to this. Legacy APIs should figure out what they do by themselves, not put junk forward into a format that will be carried for decades.

davelab6 commented 4 years ago

Users first, behdad.

behdad commented 4 years ago

The Spec specifies that font API implementations should automatically hide the derived instance, or generate a compatible but artificial name for it, if its name does not meet the API's limitations.

So, what would happen if that flag is NOT set?

Add an extra flag HIDE_IN_LEGACY_RBIZ_APPLICATOINS = 4 to STAT's axis value tables, denoting that this axis value should be hidden in legacy applications; instance derivation otherwise works as it does today.

How would you define legacy applications?

Seriously. This CANNOT be addressed in the font format. It is an application limitation, and only such an application can know and decide what to do about it.

reli-msft commented 4 years ago

My two options are alternatives (a disjunction, not a conjunction), so the solution is either

The Spec explicitly adds ~~an implementation guide~~ a compliance requirement stating what an API implementation should do when it derives an instance whose name(s) break its interface's limits (for example, abandon the instance or generate an artificial hash or UUID name instead), and what it shouldn't do (for example, truncate the name).

OR

The Spec introduces a new flag so that legacy API implementations can respect it and not derive instances containing the flagged items. "Legacy API" here could be defined as any API that follows the RBIZ model.

The spec could add both of them if necessary.
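As a rough illustration of the first option, the following Python sketch either hides an over-long derived name or replaces the subfamily suffix with a short hash, and never truncates. The function name, the `policy` parameter, the 31-character default limit, and the choice of SHA-1 are all assumptions made for this example; none of them comes from the spec or from any real API.

```python
import hashlib


def derive_face_name(family, subfamily, limit=31, policy="hide"):
    """Return a name of at most `limit` characters, or None if the instance is hidden.

    policy="hide": drop instances whose derived name is too long.
    policy="hash": replace the subfamily suffix with a short, stable hash.
    In no case is the name truncated.
    """
    name = f"{family} {subfamily}"
    if len(name) <= limit:
        return name
    if policy == "hide":
        return None
    if policy == "hash":
        digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
        artificial = f"{family} {digest}"
        # If even the artificial name is too long, fall back to hiding.
        return artificial if len(artificial) <= limit else None
    raise ValueError(f"unknown policy: {policy!r}")


print(derive_face_name("Example Sans", "Bold"))
print(derive_face_name("Example Sans", "ExtraExtended ExtraLight Oblique"))
print(derive_face_name("Example Sans", "ExtraExtended ExtraLight Oblique", policy="hash"))
```

The hash is computed from the full derived name, so the same instance always maps to the same artificial name, at the cost of a name that means nothing to the user.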

twardoch commented 4 years ago

I agree with Behdad:

If we add a flag that hides a name, GDI would need to be changed so that it checks that flag and then hides the name. The reason for hiding the name would be that "otherwise it might break the limit"
If we don't add the flag, GDI would also need to be changed if the name breaks the limit, and then hide that name.

So GDI would need to be changed either way. The flag accomplishes nothing.

Also: the flag puts the responsibility on the font developer to set it, which means the developer would need to know WHEN to set it. But this would require the developer to know how the legacy app works. "Set this flag if you know that the derived name might break the length limit in a legacy app." Well, wait: it's the legacy app that best knows when the name breaks its limit, NOT the font maker!

If we add this flag, both font creation tools AND the "legacy apps" need to be changed. If we don't add this flag, only the legacy apps may need to change, and differently: they need to check the name length limit, but perhaps they already do that anyway, no? Name length limits are not variation-specific anyway.

The legacy app CAN be changed so that it builds the virtual RBIBI family names (nameID 1 equivalents) from STAT records BETTER. There is no obligation that the legacy app has to concatenate all STAT records in the combination. If there is a length limit, the legacy app might construct the virtual nameIDs by some abbreviation. Or it might use numbers, or whatever.
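One way such abbreviation could look is sketched below in Python. The abbreviation table, the function name, and the 31-character default limit are hypothetical; a real application would pick its own table and its own limit.

```python
# Hypothetical abbreviation table; a real application would choose its own.
ABBREVIATIONS = {
    "ExtraExtended": "XExt",
    "Extended": "Ext",
    "ExtraCondensed": "XCnd",
    "Condensed": "Cnd",
    "ExtraLight": "XLt",
    "Light": "Lt",
    "SemiBold": "SmBd",
    "ExtraBold": "XBd",
    "Oblique": "Obl",
    "Italic": "It",
}


def abbreviated_name(family, labels, limit=31):
    """Abbreviate STAT labels, last one first, until the name fits.

    Returns the unabbreviated name if it already fits, and None if even
    the fully abbreviated name exceeds the limit.
    """
    parts = list(labels)
    name = " ".join([family, *parts])
    for i in range(len(parts) - 1, -1, -1):
        if len(name) <= limit:
            break
        parts[i] = ABBREVIATIONS.get(parts[i], parts[i])
        name = " ".join([family, *parts])
    return name if len(name) <= limit else None


print(abbreviated_name("Example Sans", ["ExtraExtended", "ExtraLight", "Oblique"]))
print(abbreviated_name("Example Sans", ["Condensed", "Light", "Italic"]))
```

Abbreviating only as many labels as needed keeps short names untouched, so only the instances that would otherwise overflow get altered names.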

Finally, the legacy app needs to handle situations where the name breaks the limit but the flag is not set.

Sooo... I don't really see why this flag is needed.

A.

twardoch commented 4 years ago

The Spec explicitly adds a compliance requirement stating what an API implementation should do when it derives an instance whose name(s) break its interface's limits (for example, abandon the instance or generate an artificial hash or UUID name instead), and what it shouldn't do (for example, truncate the name).

A compliance requirement for legacy APIs that cannot really properly deal with variable fonts anyway? That'd be strange. Generally, the OT spec currently lists various fields in the name table, and provides lightweight guidance as to how some of these fields may be interpreted.

As an extra thing, OT links to the Adobe document which describes how apps should build PostScript names. PostScript names can be generally viewed as “legacy names” and as such are Adobe-specific.

Microsoft could author a guide that describes how its own “legacy API” generates “virtual” nameID 1 entries from STAT.

But the font format spec should not have a compliance requirement to support legacy apps, including Windows GDI. The fonts should provide reasonable info, and the spec may link to those non-normative guides so that implementers can take them into account.

There is a need for documenting how various APIs perform name-based font selection based on the data provided in the fonts. This aspect could be covered there. But I don't see how this should be a compliance-enforced portion of the font format spec! Yes, it can be explanatory data for how current apps do it, and if at all, the format spec or the said "font selection documentation" should provide guidance for the way forward, i.e. how to best use the data that is in the font, so that app developers have a chance to use it.

But spec compliance? There is no requirement that if the name table has Japanese name strings, the apps must show them if the locale is Japanese. Apps do all sorts of things in their font selection UIs and we cannot restrict them (but we should explain the intentions, and what popular font APIs do).

But the whole app ecosystem is so diverse, and there are so many approaches and developments, that I don't see how feasible it might be to have compliance tests for apps. Would an app that disregards the GDEF table or ignores the name table altogether be non-compliant? What about apps that ignore some TT instructions? If we made such a compliance test suite now, all apps would fail.

We can make a compliance test suite for fonts, but not really for apps that use them!

Specifically, font selection is an area where there are many approaches, and rightly so. App developers should be encouraged to make better UIs — and UIs that are fit for purpose. I don't see how compliance fits into the setup.

reli-msft commented 4 years ago

@twardoch I think the Spec should include "requirements" and also "recommendations" and "permitted actions". Verbs must/should/may/can could be used here.

For the STAT issue, "Implementations that derive (RBIZ) virtual families from the STAT table must not truncate names" is a requirement, and "Implementations that derive (RBIZ) virtual families from the STAT table may hide the derived instances if the name is too long" is a permitted action.

In my view, requirements in the Spec are mostly for preventing things that could go wrong, and recommendations / permitted actions are for suggesting the right move.

punchcutter commented 4 years ago

Currently I don't see anything on Windows using "derived" names, but a related (or maybe the same?) problem is that Windows uses the STAT to build the named instance names which already exist in the fvar. Unless this is what is meant by "derived names" those names already exist in the fvar.

Every other system and app uses the fvar, from what I can see. I've only checked macOS, Ubuntu, Adobe Ps/Ai/Id, fontview, OT Master, and Samsa. They all use the fvar names. Windows using the STAT names means that everything might look fine in Font Settings, but if a combination of STAT names for multiple axes becomes too long then we run into the limit in apps like Word and we start losing styles.

If the fvar names were used then I might not have noticed this. The reason I ran into it originally was that I built a huge family with opsz, wdth, wght, ital axes and made sure that the fvar style names would still work for the 32 char limit in Word. Then I built the variable version and was surprised to see the menu names become something other than what I had specified in the fvar. I don't believe this is how the STAT table should be used for named instances and the latest update to the spec seems more clear about it:

The information provided in the 'STAT' table includes string labels for specific style attributes. For example, “Bold” and “Condensed” as individual style attribute labels within a “Condensed Bold” font. These may be used in user interfaces but are not intended to supersede subfamily names provided in the 'name' table or in the 'fvar' table of variable fonts.

I think in general it's good if the STAT can construct the same names that are defined in the fvar, but I don't see that as a requirement and I always understood the STAT to be intended specifically for derived names, not named instances.
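A font developer who wants the two sources to agree could check it with something like the Python sketch below. It uses hand-written dictionaries as stand-ins for the STAT and fvar tables rather than reading a real font; the data, the function names, and the axis ordering are all assumptions. The `elidable` boolean stands in for the STAT `ELIDABLE_AXIS_VALUE_NAME` flag, and `ELIDED_FALLBACK` for the name referenced by `elidedFallbackNameID`.

```python
# Hypothetical data standing in for a font's STAT table:
# axis tag -> {axis value: (label, elidable)}.
STAT_VALUES = {
    "wdth": {75.0: ("Condensed", False), 100.0: ("Normal", True)},
    "wght": {300.0: ("Light", False), 400.0: ("Regular", True), 700.0: ("Bold", False)},
    "ital": {0.0: ("Roman", True), 1.0: ("Italic", False)},
}
ELIDED_FALLBACK = "Regular"  # used when every label is elidable

# Hypothetical fvar named instances: (coordinates, subfamily name stored in the font).
FVAR_INSTANCES = [
    ({"wdth": 100.0, "wght": 400.0, "ital": 0.0}, "Regular"),
    ({"wdth": 75.0, "wght": 700.0, "ital": 1.0}, "Condensed Bold Italic"),
    ({"wdth": 75.0, "wght": 300.0, "ital": 0.0}, "Cond Light"),
]


def stat_name(coords):
    """Compose a subfamily name from the non-elidable STAT labels, in axis order."""
    labels = []
    for axis, values in STAT_VALUES.items():
        label, elidable = values[coords[axis]]
        if not elidable:
            labels.append(label)
    return " ".join(labels) or ELIDED_FALLBACK


def mismatches():
    """List (fvar name, STAT-composed name) pairs that differ."""
    return [
        (fvar_name, stat_name(coords))
        for coords, fvar_name in FVAR_INSTANCES
        if stat_name(coords) != fvar_name
    ]


print(mismatches())
```

Here the third instance was deliberately given a shortened fvar name, so it is the one reported: a platform that composes names from STAT would show "Condensed Light" where the font maker specified "Cond Light".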

@reli-msft When you say "derived instances" are you including the fvar-defined named instances in that?

reli-msft commented 4 years ago

@punchcutter In my original post the term used is "STAT-derived instance names", so this thread is about the names of instances that are built from STAT, no matter where the instances came from. There are a lot of paragraphs describing how name derivation works for the RBIZ / WWS / Typographic family models.

However, when talking about hiding something, I used the term "to hide an instance", which usually means that in a font selection API or a font selection UI this instance cannot be selected, since selecting it may cause an overflow or truncation.

PeterConstable commented 4 years ago

Quick comment: one intended use for STAT was to be able, given a font in a typographically-rich family with many axes (more than wght and ital/slnt, or wght, wdth, ital/slnt) or with many values on any axis, to derive instance names that would be compatible with more limited "RBIZ" or WWS family models.

It was not, however, to provide compatibility with any particular APIs or any legacy name-length limitations.

Those may, of course, be a constraining condition that some important applications today need to deal with. And so some font developers might conclude they need to provide shortened labels in STAT axis value tables to accommodate those applications. That would, I guess, be a valid use of the STAT design. At the same time, I think it would be unfortunate if applications that are being actively maintained couldn't shed some of their legacy limitations.

MPEGGroup / OpenFontFormat

STAT derived instances in legacy RBIZ applications #11