Open mta452 opened 6 years ago
@mta452 It's unfortunate it's taken almost two years before someone acted on this; I hope you can recall the details that were in your mind to provide clarification.
Wrt Mark-to-Base:
It seems that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a base glyph. But the implementations do not respect the lookup flag, ignore only mark glyphs, and assume the previous glyph to be a base glyph.
Is the ambiguity specifically in regard to the ignoreBaseGlyphs flag only? Or for this case is there a question about other flags as well?
Wrt Mark-to-ligature:
Again, the above statement implies that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a ligature glyph. But the implementations ignore only mark glyphs and assume the previous glyph to be a ligature glyph.
Similar question: Is the ambiguity specifically in regard to the ignoreLigatures flag?
Wrt Mark-to-Mark:
But the implementations totally disregard the lookup flag and use the immediate previous glyph for attachment.
Is the ambiguity specifically in regard to the ignoreMarks flag?
They also make sure that both glyphs belong to the same component of a ligature even though it is not specified anywhere.
It's not clear to me what you're saying here. Could you please clarify?
@mta452 : I'm wondering if you're monitoring and could provide clarification on this. Thanks.
Yeah, the question is regarding the lookupFlag, whether it should be respected to find the previous glyph. While writing SheenFigure, I had to implement the following rules for the correct font behaviour.
For Mark-to-Base, only mark and sequence glyphs should be ignored, contrary to what lookupFlag specifies. This means that if some glyphs were added by Multiple Substitution, these subsequent glyphs should also be ignored. The previous glyph found with this scheme should be treated as a base glyph irrespective of its actual type in GDEF. Consider the substitution g₁ m₁ m₂ -> s₁ s₂ s₃ m₁ m₂ as an example. Here m₁ and m₂ can form Mark-to-Base positioning with only s₁.
For Mark-to-Ligature, only marks should be ignored, contrary to what lookupFlag specifies. The previous glyph found with this scheme should be treated as a ligature irrespective of its actual type in GDEF.
For Mark-to-Mark, the immediate previous glyph should be treated as the second mark irrespective of its actual type in GDEF. In addition, the previous mark should belong to the same component if Ligature Substitution occurred. Consider the substitution g₁ m₁ m₂ g₂ m₃ -> g₁g₂ m₁ m₂ m₃ as an example. Here m₂ and m₁ can form Mark-to-Mark positioning, but not m₃ and m₂, because they belong to different components of the ligature.
@mta452 Thanks for clarifying. Can you please look at #407, which requested clarification of lookup flag behaviour, and at the proposed revision, to see whether the revision provides sufficient clarification for this issue? In particular, the revision states that the lookup flag does not filter the current glyph when matching an input sequence.
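For readers following along, the Mark-to-Base rule described above (skip only marks and the trailing glyphs of a Multiple Substitution, then treat whatever is found as the base) could be sketched roughly as below. This is an illustration only, not code from SheenFigure or any other engine; the `Glyph` type, the `is_sequence_tail` field, and the function name are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Glyph:
    name: str
    is_mark: bool = False           # GDEF glyph class is "mark"
    is_sequence_tail: bool = False  # hypothetical flag: 2nd or later output
                                    # glyph of a Multiple Substitution


def find_mark_to_base_anchor(glyphs: List[Glyph], mark_index: int) -> Optional[int]:
    """Walk backward from the mark, ignoring marks and sequence tails.

    The lookupFlag is deliberately not consulted, mirroring the behaviour
    reported in this thread. The first glyph that survives the skip is
    returned as the base, whatever its GDEF class is.
    """
    for i in range(mark_index - 1, -1, -1):
        g = glyphs[i]
        if g.is_mark or g.is_sequence_tail:
            continue
        return i
    return None


# The example from the thread: g1 m1 m2 -> s1 s2 s3 m1 m2
run = [
    Glyph("s1"),
    Glyph("s2", is_sequence_tail=True),
    Glyph("s3", is_sequence_tail=True),
    Glyph("m1", is_mark=True),
    Glyph("m2", is_mark=True),
]

base_for_m1 = find_mark_to_base_anchor(run, 3)  # 0, i.e. s1
base_for_m2 = find_mark_to_base_anchor(run, 4)  # 0, i.e. s1
```

Under these assumptions both m₁ and m₂ resolve to s₁, which matches the example. The Mark-to-Ligature rule would look the same, minus the `is_sequence_tail` check.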
@mta452 : But look at the proposed revisions shown in discussion of #407, which clarified details related to this issue:
As noted above, lookups are processed for each glyph in the glyph sequence for a string. Each lookup type specifies a glyph pattern to be matched: single glyphs, or sequences of glyphs, depending upon the lookup type. The current glyph in the lookup processing loop is always matched against the first glyph in a lookup’s input glyph sequence pattern. Lookup flags affect pattern matching for other glyphs in the sequence but not the current glyph.
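Read literally, the quoted wording could be sketched as follows: the current glyph is compared with the first pattern element without any filtering, and the lookup flag only decides which later glyphs are skipped. This is a reading of the quoted text only, not code from any shaping engine; `match_input_sequence`, `is_ignored`, and the glyph names are hypothetical.

```python
from typing import Callable, List, Optional

MARKS = {"acute"}  # assumed mark glyphs for this example


def is_mark(glyph: str) -> bool:
    """Stand-in for a lookup flag that ignores mark glyphs."""
    return glyph in MARKS


def match_input_sequence(
    glyphs: List[str],
    start: int,
    pattern: List[str],
    is_ignored: Callable[[str], bool],
) -> Optional[List[int]]:
    """Return the indices matched by `pattern`, or None if there is no match."""
    if not pattern or start >= len(glyphs):
        return None
    # The current glyph is matched directly; the flag never filters it.
    if glyphs[start] != pattern[0]:
        return None
    matched = [start]
    i = start + 1
    for expected in pattern[1:]:
        # The flag applies only to the glyphs after the current one.
        while i < len(glyphs) and is_ignored(glyphs[i]):
            i += 1
        if i >= len(glyphs) or glyphs[i] != expected:
            return None
        matched.append(i)
        i += 1
    return matched


text = ["f", "acute", "i"]
skipping_mark = match_input_sequence(text, 0, ["f", "i"], is_mark)       # [0, 2]
starting_on_mark = match_input_sequence(text, 1, ["acute", "i"], is_mark)  # [1, 2]
```

The second call shows the point of the revision: a current glyph that the flag would otherwise ignore is still matched against the first pattern element.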
For mark-to-X lookups, there is no backward matching.
You start with the current glyph and search forward only.
We search forward for the mark glyph. Once it is found, we look backward for the base glyph that would be attached with it.
I'm pretty sure that's not the design intent for these lookup types, in which case you are creating an implementation that could lead to non-interoperability.
Similar rules are applied in HarfBuzz. Attaching links to relevant code lines.
Similar rules are applied in the .NET Framework: https://referencesource.microsoft.com/#PresentationCore/Core/CSharp/MS/Internal/Shaping/Positioning.cs
Mark-to-Base:
The spec says that to identify the base glyph that combines with a mark, the text-processing client must look backward in the glyph string from the mark to the preceding base glyph.
From the above statement, it seems that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a base glyph. But the implementations do not respect the lookup flag, ignore only mark glyphs, and assume the previous glyph to be a base glyph. There are even cases where only the first glyph of a multiple substitution sequence is used as a base glyph.
Mark-to-Ligature:
The spec says that to position a combining mark using a MarkToLigature attachment subtable, the text-processing client must work backward from the mark to the preceding ligature glyph.
Again, the above statement implies that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a ligature glyph. But the implementations ignore only mark glyphs and assume the previous glyph to be a ligature glyph.
Mark-to-Mark:
The spec says that the mark2 glyph that combines with a mark1 glyph is the glyph preceding the mark1 glyph in glyph string order (skipping glyphs according to LookupFlags).
But the implementations totally disregard the lookup flag and use the immediate previous glyph for attachment. They also make sure that both glyphs belong to the same component of a ligature even though it is not specified anywhere.
So there are a lot of ambiguities, and the implementations tend to favor the expectations of fonts. I think the spec should be revised to close this gap in understanding once and for all.
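To make the Mark-to-Mark behaviour reported here concrete (take the immediately preceding glyph, but only if both glyphs sit on the same ligature component), here is a minimal sketch. It illustrates the reported behaviour and is not taken from any implementation; the `RunGlyph` type and its `lig_component` field are hypothetical bookkeeping invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunGlyph:
    name: str
    # Hypothetical bookkeeping: index of the ligature component a mark was
    # attached to before Ligature Substitution, or None if not applicable.
    lig_component: Optional[int] = None


def find_mark_to_mark_anchor(glyphs: List[RunGlyph], mark_index: int) -> Optional[int]:
    """Return the index of the glyph to use as mark2, or None.

    The lookupFlag is not consulted: the immediately preceding glyph is
    taken as mark2, provided both glyphs share a ligature component.
    """
    if mark_index <= 0:
        return None
    prev, cur = glyphs[mark_index - 1], glyphs[mark_index]
    if prev.lig_component != cur.lig_component:
        return None
    return mark_index - 1


# g1 m1 m2 g2 m3 -> g1g2 m1 m2 m3 (m1, m2 on component 0; m3 on component 1)
lig_run = [
    RunGlyph("g1g2"),
    RunGlyph("m1", lig_component=0),
    RunGlyph("m2", lig_component=0),
    RunGlyph("m3", lig_component=1),
]

mark2_for_m2 = find_mark_to_mark_anchor(lig_run, 2)  # 1, i.e. m1
mark2_for_m3 = find_mark_to_mark_anchor(lig_run, 3)  # None, different component
```

A spec-literal reading would instead skip glyphs according to the lookup flag and would have no component check, which is the gap this issue is about.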