mta452 commented 6 years ago

Mark-to-Base: The spec says that to identify the base glyph that combines with a mark, the text-processing client must look backward in the glyph string from the mark to the preceding base glyph.

From the above statement, it seems that the text-processing client should respect the lookup flag and use the previous glyph for attachment if it is a base glyph. But implementations do not respect the lookup flag: they ignore only mark glyphs and assume the previous glyph to be a base glyph. There are even cases where only the first glyph of a multiple substitution sequence is used as the base glyph.

Mark-to-Ligature: The spec says that to position a combining mark using a MarkToLigature attachment subtable, the text-processing client must work backward from the mark to the preceding ligature glyph.

Again, the above statement implies that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a ligature glyph. But the implementations ignore only mark glyphs and assume the previous glyph to be a ligature glyph.

Mark-to-Mark: The spec says that the mark2 glyph that combines with a mark1 glyph is the glyph preceding the mark1 glyph in glyph string order (skipping glyphs according to LookupFlags).

But the implementations totally disregard the lookup flag and use the immediate previous glyph for attachment. They also make sure that both glyphs belong to the same component of a ligature even though it is not specified anywhere.

So there are a lot of ambiguities, and the implementations tend to favor the expectations of fonts. I think the spec should be revised to close this gap in understanding once and for all.

Document Details

⚠ Do not edit this section. It is required for docs.microsoft.com ➟ GitHub issue linking.

ID: 3f748ff4-44a6-6395-e0ce-d249b8c9541b
Version Independent ID: b80d0359-5657-632b-29e8-142c06902001
Content: GPOS — Glyph Positioning Table - Typography
Content Source: typographydocs/opentype/spec/gpos.md
Product: windows
GitHub Login: @PeterCon
Microsoft Alias: PeterCon

PeterCon commented 4 years ago

@mta452 It's unfortunate it's taken almost two years before someone acted on this; I hope you can recall the details that were in your mind to provide clarification.

Wrt Mark-to-Base:

It seems that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a base glyph. But the implementations do not respect the lookup flag, ignore only mark glyphs, and assume the previous glyph to be a base glyph.

Is the ambiguity specifically in regard to the ignoreBaseGlyphs flag only? Or for this case is there a question about other flags as well?

Wrt Mark-to-ligature:

Again, the above statement implies that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a ligature glyph. But the implementations ignore only mark glyphs and assume the previous glyph to be a ligature glyph.

Similar question: Is the ambiguity specifically in regard to the ignoreLigatures flag?

Wrt Mark-to-Mark:

But the implementations totally disregard the lookup flag and use the immediate previous glyph for attachment.

Is the ambiguity specifically in regard to the ignoreMarks flag?

They also make sure that both glyphs belong to the same component of a ligature even though it is not specified anywhere.

It's not clear to me what you're saying here. Could you please clarify?

PeterCon commented 4 years ago

@mta452 : I'm wondering if you're monitoring and could provide clarification on this. Thanks.

mta452 commented 4 years ago

Yeah, the question is regarding the lookupFlag, whether it should be respected to find the previous glyph. While writing SheenFigure, I had to implement the following rules for the correct font behaviour.

In Mark-to-Base, only mark and sequence glyphs should be ignored contrary to what lookupFlag specifies. This means that if some glyphs were added by Multiple Substitution, these subsequent glyphs should also be ignored. The previous glyph found with this scheme should be treated as base glyph irrespective of its actual type in GDEF. Consider g₁ m₁ m₂ -> s₁ s₂ s₃ m₁ m₂ substitution as an example. Here m₁ and m₂ can form Mark-to-Base positioning with only s₁.
In Mark-to-Ligature, only marks should be ignored contrary to what lookupFlag specifies. The previous glyph found with this scheme should be treated as ligature irrespective of its actual type in GDEF.
In Mark-to-Mark, the immediate previous glyph should be treated as the second mark irrespective of its actual type in GDEF. In addition, the previous mark should belong to the same component if Ligature Substitution occurred. Consider g₁ m₁ m₂ g₂ m₃ -> g₁g₂ m₁ m₂ m₃ substitution as an example. Here m₂ and m₁ can form Mark-to-Mark positioning but not m₃ and m₂ because they belong to different components of the ligature.
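
The three rules above can be sketched in Python as follows. This is only an illustration of the behaviour described in this comment, not code from SheenFigure or any other shaper. The `Glyph` fields `lig_component` and `mult_subsequent` and all function names are hypothetical, and the sketch assumes the shaper has already recorded that bookkeeping during substitution.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Glyph:
    name: str
    gdef_class: str                # "base", "ligature", or "mark" (GDEF class)
    lig_component: int = 0         # hypothetical: ligature component a mark belongs to (0 = none)
    mult_subsequent: bool = False  # hypothetical: 2nd or later output of a Multiple Substitution


def prev_for_mark_to_base(glyphs: List[Glyph], i: int) -> Optional[int]:
    """Walk backward from mark i, skipping marks and subsequent
    Multiple Substitution outputs; the glyph found is treated as the base."""
    for j in range(i - 1, -1, -1):
        g = glyphs[j]
        if g.gdef_class == "mark" or g.mult_subsequent:
            continue
        return j
    return None


def prev_for_mark_to_ligature(glyphs: List[Glyph], i: int) -> Optional[int]:
    """Walk backward from mark i, skipping marks only;
    the glyph found is treated as the ligature."""
    for j in range(i - 1, -1, -1):
        if glyphs[j].gdef_class != "mark":
            return j
    return None


def prev_for_mark_to_mark(glyphs: List[Glyph], i: int) -> Optional[int]:
    """Take the immediately preceding glyph as mark2, but only if it
    belongs to the same ligature component as mark i."""
    j = i - 1
    if j < 0 or glyphs[j].lig_component != glyphs[i].lig_component:
        return None
    return j


# g1 m1 m2 -> s1 s2 s3 m1 m2 (Multiple Substitution)
seq = [
    Glyph("s1", "base"),
    Glyph("s2", "base", mult_subsequent=True),
    Glyph("s3", "base", mult_subsequent=True),
    Glyph("m1", "mark"),
    Glyph("m2", "mark"),
]
print(prev_for_mark_to_base(seq, 3), prev_for_mark_to_base(seq, 4))  # both resolve to s1

# g1 m1 m2 g2 m3 -> g1g2 m1 m2 m3 (Ligature Substitution)
lig = [
    Glyph("g1g2", "ligature"),
    Glyph("m1", "mark", lig_component=1),
    Glyph("m2", "mark", lig_component=1),
    Glyph("m3", "mark", lig_component=2),
]
print(prev_for_mark_to_mark(lig, 2))  # m2 pairs with m1
print(prev_for_mark_to_mark(lig, 3))  # m3 and m2 are in different components
```

In each case the index returned is only a candidate; whether the subtable actually applies would still depend on the candidate being in the subtable's coverage.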

PeterCon commented 4 years ago

@mta452

Thanks for clarifying. Can you please look at #407, which requested clarification of lookup flag behaviour, and at the proposed revision, to see whether the revision provides sufficient clarification for this issue? In particular, it states that the lookup flag does not filter the current glyph when matching an input sequence.

mta452 commented 4 years ago

#407 addresses the issues of context lookups. This one is related to finding the previous glyph in GPOS lookup types 4, 5 and 6.

PeterCon commented 4 years ago

@mta452 : But look at the proposed revisions shown in discussion of #407, which clarified details related to this issue:

As noted above, lookups are processed for each glyph in the glyph sequence for a string. Each lookup type specifies a glyph pattern to be matched: single glyphs, or sequences of glyphs, depending upon the lookup type. The current glyph in the lookup processing loop is always matched against the first glyph in a lookup’s input glyph sequence pattern. Lookup flags affect pattern matching for other glyphs in the sequence but not the current glyph.
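
A minimal sketch of what that paragraph describes, assuming a simple name-based pattern in place of real coverage tables. `match_input` and `is_skipped` are hypothetical names, and only the three class-based lookup flag bits are modelled (no mark filtering sets or mark attachment types).

```python
from typing import List, Optional, Tuple

# Class-based LookupFlag bits.
IGNORE_BASE_GLYPHS = 0x0002
IGNORE_LIGATURES = 0x0004
IGNORE_MARKS = 0x0008

_FLAG_FOR_CLASS = {
    "base": IGNORE_BASE_GLYPHS,
    "ligature": IGNORE_LIGATURES,
    "mark": IGNORE_MARKS,
}


def is_skipped(gdef_class: str, lookup_flag: int) -> bool:
    """True if a glyph of this GDEF class is filtered out by the lookup flag."""
    return bool(lookup_flag & _FLAG_FOR_CLASS.get(gdef_class, 0))


def match_input(
    glyphs: List[Tuple[str, str]],  # (glyph name, GDEF class)
    start: int,
    pattern: List[str],             # glyph names to match, in order
    lookup_flag: int,
) -> Optional[List[int]]:
    """Match `pattern` forward from `start`; return matched indices or None."""
    # The current glyph is always tested against the first pattern glyph,
    # even if its class would otherwise be ignored by the lookup flag.
    if glyphs[start][0] != pattern[0]:
        return None
    matched = [start]
    pos = start + 1
    for want in pattern[1:]:
        # Lookup flags filter the remaining glyphs of the sequence.
        while pos < len(glyphs) and is_skipped(glyphs[pos][1], lookup_flag):
            pos += 1
        if pos >= len(glyphs) or glyphs[pos][0] != want:
            return None
        matched.append(pos)
        pos += 1
    return matched


glyphs = [("m1", "mark"), ("a", "base"), ("m2", "mark"), ("b", "base")]
# The current glyph m1 is a mark, yet it is still matched with IGNORE_MARKS set;
# m2 is skipped while matching the rest of the pattern.
print(match_input(glyphs, 0, ["m1", "a", "b"], IGNORE_MARKS))
```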

mta452 commented 4 years ago

#407 is related to finding the next glyph based on the lookup flag. This one is related to backward matching. Perhaps some font developers can better share their findings.

PeterCon commented 4 years ago

For mark-to-X lookups, there is no backward matching.

PeterCon commented 4 years ago

You start with the current glyph and search forward only.

mta452 commented 4 years ago

We search forward for the mark glyph. Once it is found, we look backward for the base glyph that it would be attached to.
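
Roughly, the two phases look like the sketch below. It is a simplified illustration rather than actual shaper code: `apply_mark_attachment` is a hypothetical name, coverage is modelled as plain sets of glyph names, and the backward search skips marks only.

```python
from typing import List, Set, Tuple


def apply_mark_attachment(
    glyphs: List[Tuple[str, str]],  # (glyph name, GDEF class)
    mark_coverage: Set[str],
    base_coverage: Set[str],
) -> List[Tuple[int, int]]:
    """Return (mark_index, base_index) pairs for every mark that finds a base."""
    attachments = []
    # Phase 1: walk forward through the glyph string looking for covered marks.
    for i, (name, _cls) in enumerate(glyphs):
        if name not in mark_coverage:
            continue
        # Phase 2: from the mark, look backward past any other marks.
        j = i - 1
        while j >= 0 and glyphs[j][1] == "mark":
            j -= 1
        # Attach only if the glyph found is covered as a base.
        if j >= 0 and glyphs[j][0] in base_coverage:
            attachments.append((i, j))
    return attachments


glyphs = [("a", "base"), ("m1", "mark"), ("m2", "mark"), ("b", "base"), ("m3", "mark")]
print(apply_mark_attachment(glyphs, {"m1", "m2", "m3"}, {"a"}))
```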

PeterCon commented 4 years ago

I'm pretty sure that's not the design intent for these lookup types, in which case you are creating an implementation that could lead to non-interoperability.

mta452 commented 4 years ago

Similar rules are applied in HarfBuzz. Attaching links to relevant code lines.

Mark-to-Base: https://github.com/harfbuzz/harfbuzz/blob/2.7.2/src/hb-ot-layout-gpos-table.hh#L1874
Mark-to-Ligature: https://github.com/harfbuzz/harfbuzz/blob/2.7.2/src/hb-ot-layout-gpos-table.hh#L2097
Mark-to-Mark: https://github.com/harfbuzz/harfbuzz/blob/2.7.2/src/hb-ot-layout-gpos-table.hh#L2251

mta452 commented 4 years ago

Similar rules are applied in the .NET Framework: https://referencesource.microsoft.com/#PresentationCore/Core/CSharp/MS/Internal/Shaping/Positioning.cs

MicrosoftDocs / typography-issues

Preceeding glyph identification in Mark-to-(Any) positioning #102
