MPEGGroup / OpenFontFormat

Official MPEG repository to discuss issues on Open Font Format (ISO/IEC 14496-22)

Variable substitution: Too many regions with mixed features #53

Open skef opened 1 year ago

skef commented 1 year ago

More complete document with background: conditions.pdf

Suppose that you have a feature with three substitutions on one axis, as well as a different feature with three entirely unrelated substitutions on a different axis. For example, dollar changes at wght -.5, cent at wght 0, and euro at wght .5, while one changes at foo -.5, two at foo 0, and three at foo .5.

Although these substitutions do not seem to be related in the abstract, and will probably not appear to be related when encoded in a feature file, the feature compiler must treat them as related when building the GSUB feature variation subtable in its present form. This is because there is only one unified list of feature variation records per table (GSUB or GPOS). So the above pattern of substitution will not result in 6 regions plus the default (3 for wght, 3 for foo), but 15.

[Figure: cond_fig4ink]

With a "logical" encoding those would be (with redundant conditions omitted):

    D => -1 <= wght <= -0.5 
    C => -1 <= wght <=  0
    E => -1 <= wght <=  0.5
    1 => -1 <= foo  <= -0.5
    2 => -1 <= foo  <=  0
    3 => -1 <= foo  <=  0.5

     1)  D & 1 : dollar.sub , cent.sub , euro.sub , one.sub , two.sub , three.sub
     2)  C & 1 : dollar     , cent.sub , euro.sub , one.sub , two.sub , three.sub
     3)  E & 1 : dollar     , cent     , euro.sub , one.sub , two.sub , three.sub
     4)      1 : dollar     , cent     , euro     , one.sub , two.sub , three.sub
     5)  D & 2 : dollar.sub , cent.sub , euro.sub , one     , two.sub , three.sub
     6)  C & 2 : dollar     , cent.sub , euro.sub , one     , two.sub , three.sub
     7)  E & 2 : dollar     , cent     , euro.sub , one     , two.sub , three.sub
     8)      2 : dollar     , cent     , euro     , one     , two.sub , three.sub
     9)  D & 3 : dollar.sub , cent.sub , euro.sub , one     , two     , three.sub
    10)  C & 3 : dollar     , cent.sub , euro.sub , one     , two     , three.sub
    11)  E & 3 : dollar     , cent     , euro.sub , one     , two     , three.sub
    12)      3 : dollar     , cent     , euro     , one     , two     , three.sub
    13)  D     : dollar.sub , cent.sub , euro.sub , one     , two     , three    
    14)  C     : dollar     , cent.sub , euro.sub , one     , two     , three    
    15)  E     : dollar     , cent     , euro.sub , one     , two     , three    

    def   : dollar     , cent     , euro     , one     , two     , three    

More generally, this means that whatever features use this table, the compiler must carve up the geometry across all of them. The scaling problem is therefore not just within a feature but across all features.
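The count of 15 follows from simple arithmetic: each axis has three thresholds plus a default band, so a single first-match list has to cover (3 + 1) × (3 + 1) combinations minus the all-default one. A minimal sketch of that count, where the function names are illustrative and not part of any specification:

```python
from itertools import product

# Thresholds from the example above; each substitution applies at or below its value.
wght_thresholds = [-0.5, 0.0, 0.5]  # dollar, cent, euro
foo_thresholds = [-0.5, 0.0, 0.5]   # one, two, three


def bands(thresholds):
    """Distinct behaviours along one axis: one per threshold plus the default."""
    return len(thresholds) + 1


def unified_regions(*axes):
    """Regions needed when one first-match list must cover every combination."""
    combos = list(product(*(range(bands(t)) for t in axes)))
    # The combination where every axis is in its default band needs no record.
    return len(combos) - 1


def independent_regions(*axes):
    """Regions needed if each feature could be searched on its own."""
    return sum(len(t) for t in axes)


print(unified_regions(wght_thresholds, foo_thresholds))      # 15
print(independent_regions(wght_thresholds, foo_thresholds))  # 6
```

Adding a third independent axis with three thresholds would take the unified count to 63 while the independent count would only reach 9.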

This seems like a flaw in the current specification, especially because it is not an inherent part of the mechanism.

Sketch of solution

Conceptually, all that is needed to solve the problem is advance knowledge of which features are encoded among the feature variation records. This list could be encoded by sorted tag in some new subtable.

Then, as the Feature Variation records are examined in order, instead of stopping at the first match the search stops when a record corresponding to each feature in the initial list is found. That way the entries for different layout features can be interspersed without interfering with one another. If the feature list is present you use the new search convention, if not you use the old one.
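A rough sketch of that search convention, under simplifying assumptions: each record here carries a single feature tag, a condition callable, and a name for the alternate feature, whereas a real FeatureVariationRecord points to a condition set and a FeatureTableSubstitution. The tags `curr` and `nums` and the function name are hypothetical.

```python
def find_substitutions(records, listed_features, location):
    """Walk the records in order, taking the first match for each listed feature.

    records: ordered list of (feature_tag, condition, alternate_feature).
    Returns {feature_tag: alternate_feature}.
    """
    remaining = set(listed_features)
    chosen = {}
    for tag, condition, alternate in records:
        if not remaining:
            break  # every listed feature has been resolved
        if tag in remaining and condition(location):
            chosen[tag] = alternate
            remaining.discard(tag)
    return chosen


# Six records instead of fifteen: three per feature, interleaving allowed.
records = [
    ("curr", lambda loc: loc["wght"] <= -0.5, "curr@D"),
    ("nums", lambda loc: loc["foo"] <= -0.5, "nums@1"),
    ("curr", lambda loc: loc["wght"] <= 0.0, "curr@C"),
    ("nums", lambda loc: loc["foo"] <= 0.0, "nums@2"),
    ("curr", lambda loc: loc["wght"] <= 0.5, "curr@E"),
    ("nums", lambda loc: loc["foo"] <= 0.5, "nums@3"),
]

result = find_substitutions(records, ["curr", "nums"], {"wght": -0.2, "foo": 0.3})
print(result)  # {'curr': 'curr@C', 'nums': 'nums@3'}
```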

behdad commented 1 year ago

> Then, as the Feature Variation records are examined in order, instead of stopping at the first match the search stops when a record corresponding to each feature in the initial list is found. That way the entries for different layout features can be interspersed without interfering with one another. If the feature list is present you use the new search convention, if not you use the old one.

The feature-level might be too coarse granularity. There's only so many features one can use... For example, if one wants to put all in rvrn, they can't use the new mechanism. Let's think a bit more and see what we can come up with.

skef commented 1 year ago

Fair enough - this idea is at the level of damage control. If we don't do something better soon, I think it's preferable to add this; if there is something better, we may not need it.

behdad commented 1 year ago

Here's one idea: A new FeatureTableSubstitution version that simply enables an extra lookup for the feature, and search is not stopped.

The problem with this is that it doesn't allow disabling lookups already in the feature.

skef commented 1 year ago

Interesting.

You need the "original" feature in order to have a featureIndex to substitute for (and to match the script and language system), but that could just contain any lookups that apply at every position, or be empty if there are no such lookups. Then you add to that list by going through the new FeatureTableSubstitution table and adding any lookups with matching conditions.

Ordering doesn't matter (if I'm remembering right) because the lookups are always put back into total lookup order downstream.

You could also add some sort of termination condition to this. Say that when you encounter a zero as the alternateFeatureOffset, you stop. Then you could sprinkle in empty condition set entries to turn off searches for particular features once you know there are no more entries for them (if that mattered).

Seems a lot better for normal uses and at least right off the bat I'm not seeing any obvious holes.
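A minimal sketch of that additive search for one feature, assuming a simplified in-memory form: each record is a pair of a condition list and a dict from featureIndex to the lookups to add, and `None` stands in for a zero alternateFeatureOffset. None of these names come from the specification.

```python
def collect_lookups(base_lookups, records, feature_index, location):
    """Start from the original feature's lookups and add every matching entry."""
    lookups = set(base_lookups)
    for condition_set, additions in records:
        # An empty condition set always matches, since all([]) is True.
        if not all(cond(location) for cond in condition_set):
            continue
        if feature_index not in additions:
            continue
        extra = additions[feature_index]
        if extra is None:
            break  # stand-in for a zero offset: stop searching for this feature
        lookups.update(extra)
    # The result is a set; lookups are applied in lookup-index order anyway.
    return sorted(lookups)


records = [
    ([lambda loc: loc["wght"] <= 0.0], {0: [5]}),
    ([lambda loc: loc["foo"] <= 0.0], {0: [7]}),
    ([], {0: None}),   # terminator for feature 0
    ([], {0: [9]}),    # never reached for feature 0
]

print(collect_lookups([1], records, 0, {"wght": -0.5, "foo": -0.5}))  # [1, 5, 7]
print(collect_lookups([1], records, 0, {"wght": 0.5, "foo": -0.5}))   # [1, 7]
```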

behdad commented 1 year ago

> Interesting.
>
> You need the "original" feature in order to have a featureIndex to substitute for (and to match the script and language system), but that could just contain any lookups that apply at every position, or be empty if there are no such lookups. Then you add to that list by going through the new FeatureTableSubstitution table and adding any lookups with matching conditions.

The main issue would be that most implementations prefer not to have to process feature-variations at the default location. If we lift that, then yes, what you describe should work IMO.

> Ordering doesn't matter (if I'm remembering right) because the lookups are always put back into total lookup order downstream.

Correct.

> You could also add some sort of termination condition to this. Say that when you encounter a zero as the alternateFeatureOffset, you stop. Then you could sprinkle in empty condition set entries to turn off searches for particular features once you know there are no more entries for them (if that mattered).

Sgtm.

> Seems a lot better for normal uses and at least right off the bat I'm not seeing any obvious holes.

skef commented 1 year ago

OK, one hole: This design doesn't follow the Microsoft/Apple convention of preserving the behavior of the default instance in a context that doesn't understand variable fonts. But maybe the group would be more flexible on that now.

[Oops, scooped.]

skef commented 1 year ago

OK, suppose we do want to preserve that behavior for the sake of consistency. Then:

  1. Rather than have the original feature contain the "common" entries, just have it represent the locations used for the default and ignore it entirely (other than making use of its index) when you do the search.
  2. Add the common entries in right at the start of the search (by convention) using an empty condition set with offsets to feature tables with the shared elements.

Done.

skef commented 1 year ago

Thinking about this a bit more, the change from terminating at the first matching element (either "globally" or per-feature-index) to continuing means that what I described as the "logical" analysis of conditions doesn't apply anymore. That's fine for many scenarios but we should think through the cases a bit more.

I'm wondering things like whether it would be valuable to be able to negate a condition included in a set, or is that just an annoyance. (This seems more important with condition values, but we could also just modify that spec (assuming it goes through) to allow the negation of the calculated value.)

behdad commented 1 year ago

If I understand you correctly, negation should be easy if we just say if filterRangeMinValue is greater than filterRangeMaxValue then the condition is negated.

skef commented 1 year ago

> If I understand you correctly, negation should be easy if we just say if filterRangeMinValue is greater than filterRangeMaxValue then the condition is negated.

Ah, that's clever. So if the Min is -1 you replace it with an adjusted (by the min F2DOT14 difference) Max value, do the equivalent if Max is 1, and if both are used you swap them. Maybe for consistency and clarity of dumped values we can make the reversed case inclusive as well, adjusting the values as we swap them. Then we can argue about whether the format needs a bump or if we can just specify that these new values can only be used in tables with sufficiently high versions (or both).
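One possible reading of that encoding, sketched with values held as integers in F2DOT14 units so that the "min F2DOT14 difference" is exactly 1. The inclusive treatment of the reversed range is the variant floated above, not settled behavior, and the function names are illustrative.

```python
ONE = 1 << 14  # 1.0 in F2DOT14 units; the smallest step is 1


def matches(value, range_min, range_max):
    """Evaluate a range condition, treating min > max as a negated range."""
    if range_min <= range_max:
        return range_min <= value <= range_max
    # Reversed (negated) range, inclusive at both stored ends:
    # true everywhere except strictly between range_max and range_min.
    return value <= range_max or value >= range_min


def negate(range_min, range_max):
    """Return the (min, max) pair that encodes NOT (range_min <= v <= range_max)."""
    if range_min == -ONE and range_max == ONE:
        raise ValueError("the negation of the full range never matches")
    if range_min == -ONE:
        return (range_max + 1, ONE)        # ordinary range above the old max
    if range_max == ONE:
        return (-ONE, range_min - 1)       # ordinary range below the old min
    return (range_max + 1, range_min - 1)  # swapped and adjusted: min > max


print(negate(-ONE, 0))         # (1, 16384)
print(negate(-8192, 8192))     # (8193, -8193)
```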

So let's assume we have a convenient negation for every condition (I'll add a note to the other issue). Let's also explicitly note that the output of the search is a set of lookups, so if you add the same one multiple times it's the same as adding it once.

Thinking about this more, I think this system would be formally complete, in that it would allow one to include a lookup according to any (standard) logical formula of conditions. That just follows from disjunctive normal form. And while I don't think it would be necessary (or desirable) to support arbitrary boolean formulas in, say, feature files, it does mean that any tricky situation can just be internally "phrased" as an arbitrary expression and then mechanically reduced to DNF, which is a good fallback. (BoolStuff is GNU-licensed but reduction to DNF is hardly rocket science.)
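To illustrate how mechanical the reduction is, here is a minimal sketch of DNF conversion, assuming formulas are written as nested tuples `('and', a, b)`, `('or', a, b)`, `('not', a)` with strings as atomic conditions. This representation is made up for the example and has nothing to do with BoolStuff or with feature file syntax. Each resulting conjunction would map to one condition set, with negated literals using a negated condition.

```python
def dnf(formula, positive=True):
    """Return a list of conjunctions, each a frozenset of (atom, polarity) literals."""
    if isinstance(formula, str):
        return [frozenset([(formula, positive)])]
    op = formula[0]
    if op == "not":
        return dnf(formula[1], not positive)
    left = dnf(formula[1], positive)
    right = dnf(formula[2], positive)
    # De Morgan: under negation, 'and' behaves as 'or' and vice versa.
    if (op == "or") == positive:
        return left + right
    # Conjunction: distribute over the alternatives on each side.
    terms = [a | b for a in left for b in right]
    # Drop contradictory terms that contain an atom with both polarities.
    return [t for t in terms if len({atom for atom, _ in t}) == len(t)]


# not (A and (B or not C))  ==  (not A) or (not B and C)
formula = ("not", ("and", "A", ("or", "B", ("not", "C"))))
for term in dnf(formula):
    print(sorted(term))
```

The sketch makes no attempt to minimize the result, so redundant conjunctions can survive.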

I think the remaining question would be whether we would want any extensions to the system to make it more practical to use and understand. Having thought about this a bit I think I know what the candidate would be. The practical problem with this new system is that it's harder (or, given what I've just said, maybe "wordier") to handle fallbacks or alternatives.

Consider the other stereotypical VF case: you want a sudden change in the kerning between "T" and "o". Suppose you want to choose that point independently on three axes, and rather than being clever with a single value you want to use substitution (for whatever reasons). So you'll have one variable kerning value for one case and another for the other case. If you have conditions to express where you want one value you can just use them. The thing we've made difficult (or "wordy") is how to positively express where to put the other one. There are three equivalent ways of looking at the expression: it's a disjunction of negations, it's the negation of a condition set, or it's an "else" on a condition set.

Of these the negation of a condition set seems most salient to this system we're imagining. Given the lack of intermediate versioning, that would probably mean bumping the major version on the FeatureVariations table and then adding flags to either the FeatureVariationRecord or the ConditionSet, one of which can be "negate the condition set".

So according to my current, hours-old view, this would leave us here:

  1. It seems, tentatively, like this system we're imagining can do whatever we need it to as long as we have a practical form of condition negation.
  2. We might want to add an "else" or a "condition set negation" feature to make things more understandable and compact.
  3. If we don't add a 2-type feature it's more likely that implementations supporting anything complex will need to be capable of reduction to DNF to handle such cases.

behdad commented 1 year ago

> We might want to add an "else" or a "condition set negation" feature to make things more understandable and compact.

"else" sounds good to me.

skef commented 1 year ago

> "else" sounds good to me.

In that case I suppose we would rev the major version on the FeatureVariations table and specify that that version has a FeatureVariationRecord something like (ignoring my awkward choice of terminology):

    Offset32    conditionSetOffset                 Offset to a condition set table ...
    Offset32    trueLookupAdditionsTableOffset     Table of lookups to add when the condition set is true
    Offset32    falseLookupAdditionsTableOffset    Table of lookups to add when the condition set is false

And specify that either of the LookupAdditionsTables can have a 0 offset, which means there isn't one. A LookupAdditionsTable would just be a reworked FeatureTableSubstitution record that mostly changes the interpretation of the fields rather than their format.
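A minimal sketch of how such a record might be evaluated, assuming a simplified in-memory form in which an empty list plays the role of a 0 offset. The `Record` class, its field names, and the lookup numbers are hypothetical; the example mirrors the "T"/"o" kerning case with one lookup for the region and another for everywhere else.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    conditions: list                                      # callables on a location
    true_additions: list = field(default_factory=list)   # empty list ~ 0 offset
    false_additions: list = field(default_factory=list)  # empty list ~ 0 offset


def resolve(records, location):
    """Visit every record (no early termination) and collect a set of lookups."""
    lookups = set()
    for record in records:
        if all(cond(location) for cond in record.conditions):
            lookups.update(record.true_additions)
        else:
            lookups.update(record.false_additions)
    return sorted(lookups)


kern_switch = Record(
    conditions=[
        lambda loc: loc["wght"] >= 0.5,
        lambda loc: loc["wdth"] >= 0.25,
        lambda loc: loc["opsz"] <= 0.0,
    ],
    true_additions=[10],   # kerning lookup used inside the region
    false_additions=[11],  # the "else" lookup used everywhere outside it
)

print(resolve([kern_switch], {"wght": 0.8, "wdth": 0.5, "opsz": -0.2}))  # [10]
print(resolve([kern_switch], {"wght": 0.8, "wdth": 0.0, "opsz": -0.2}))  # [11]
```

Without the false branch, the second lookup would need one record per negated condition, which is the "wordy" disjunction of negations described earlier.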

skef commented 1 year ago

@behdad Do you think people will care if you can change the feature parameters by position with this mechanism or would that be so esoteric a need that we could just require that the parameters of the default feature table always apply?

(Or, I suppose we could consider allowing mixing of the two mechanisms so that if there are entries of the existing kind you use the parameters from them.)

behdad commented 1 year ago

> @behdad Do you think people will care if you can change the feature parameters by position with this mechanism or would that be so esoteric a need that we could just require that the parameters of the default feature table always apply?

That's a good question. In HarfBuzz we currently only look at the default feature (in fact we have it as a face function, not a font with variation settings). We can spec it either way I think.

> (Or, I suppose we could consider allowing mixing of the two mechanisms so that if there are entries of the existing kind you use the parameters from them.)

Yeah I think that's fine.

skef commented 1 year ago

I've been playing around with breaking this sketch down into subtables and records. One (tentative) decision I made is that with the move away from terminating at first match, I think it makes more sense to have the condition sets below the feature index rather than above it. It may involve a bit of duplication in the font but this way you only have to go through the lists for those features that are active.

Anyway, here's a very rough doc: new_substitution.pdf

I suppose we could give the new FeatureVariations table version 1.1 by putting the lookupVariationsOffset after the array, although that feels a bit icky.

behdad commented 1 year ago

Thanks Skef. Looks good to me.

Lorp commented 1 year ago

@behdad: "most implementations prefer not to have to process feature-variations at the default location"

What is your basis for this statement? Harfbuzz and Apple, at least, do process feature-variations at the default location.

In the attached test font feavartest.ttf.zip, /A (square) substitutes for /A.alt (circle) between wght = 1 (-1.0) and wght = 700 (0.5). It renders as a circle at default (wght = 400) in macOS and Harfbuzz (FontGoggles).

Note also this ConditionTable in the TTX of Bahnschrift (a Microsoft system font), which, because it straddles 0, implies that Microsoft processes feature-variations at default.

<ConditionTable index="0" Format="1">
  <AxisIndex value="0"/>
  <FilterRangeMinValue value="-0.8"/>
  <FilterRangeMaxValue value="1.0"/>
</ConditionTable>

behdad commented 1 year ago

> What is your basis for this statement? Harfbuzz and Apple, at least, do process feature-variations at the default location.

You are indeed correct. I was surprised but checked the code and indeed it works.

Anyway; that's even better for our new design. Thanks for pointing it out.

skef commented 1 year ago

@Lorp Even if most implementations that have variable font support are processing the variable-font-specific tables at the default location, there's still the question of allowing the default location to render correctly on systems that don't have any variable font support. Quite a bit of the existing design of variable fonts revolves around that issue, and it's hard to say if and when its proponents will ease up on that.

skef commented 1 year ago

In any case, moving the feature indices above the condition sets also created a place for feature-specific flags. In the PDF write-up I added one to indicate whether the lookup indices from the "current" feature table (either the one in GSUB or the one selected by the existing mechanism -- usually the former) should be copied into the initial set. Having that control should make the issue moot -- copy them when it makes sense, don't copy them when it doesn't.

behdad commented 1 year ago

Sgtm. Make them sorted by feature tag so it's easier to look up and we're good I think.
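A rough sketch of how a feature-keyed layout with a copy flag and tag-sorted records could be searched. The dict-based records, the `COPY_DEFAULT_LOOKUPS` flag name and value, and the function name are all assumptions made for illustration; they are not taken from the PDF write-up.

```python
from bisect import bisect_left

COPY_DEFAULT_LOOKUPS = 0x0001  # hypothetical per-feature flag


def lookups_for(feature_records, tag, default_lookups, location):
    """feature_records must be sorted by tag so a binary search can find the entry."""
    # A real implementation would search the stored array directly;
    # the tag list is rebuilt here only to keep the sketch short.
    tags = [record["tag"] for record in feature_records]
    i = bisect_left(tags, tag)
    if i == len(tags) or tags[i] != tag:
        return sorted(default_lookups)  # feature has no variation data
    record = feature_records[i]
    copy = record["flags"] & COPY_DEFAULT_LOOKUPS
    lookups = set(default_lookups) if copy else set()
    for condition_set, additions in record["entries"]:
        if all(cond(location) for cond in condition_set):
            lookups.update(additions)
    return sorted(lookups)


feature_records = [
    {"tag": "kern", "flags": COPY_DEFAULT_LOOKUPS,
     "entries": [([lambda loc: loc["wght"] >= 0.5], [4])]},
    {"tag": "rvrn", "flags": 0,
     "entries": [([], [8])]},
]

print(lookups_for(feature_records, "kern", [1, 2], {"wght": 0.8}))  # [1, 2, 4]
print(lookups_for(feature_records, "rvrn", [3], {"wght": 0.8}))     # [8]
```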

skef commented 1 year ago

With luck this issue will be superseded by #57