MicrosoftDocs / typography-issues

Creative Commons Attribution 4.0 International

Preceding glyph identification in Mark-to-(Any) positioning #102

Open mta452 opened 6 years ago

mta452 commented 6 years ago

Mark-to-Base: The spec says that to identify the base glyph that combines with a mark, the text-processing client must look backward in the glyph string from the mark to the preceding base glyph.

From the above statement, it seems that the text-processing client should respect the lookup flag and use the previous glyph for attachment only if it is a base glyph. But the implementations do not respect the lookup flag: they ignore only mark glyphs and assume the previous glyph to be a base glyph. There are even cases where only the first glyph of a multiple substitution sequence is used as a base glyph.
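To make the two readings concrete, here is a minimal sketch in Python. It is not taken from any shaping engine; the function names and the list-of-classes data model are assumptions made for illustration. Only the GDEF glyph class values and the LookupFlag bit values come from the OpenType spec.

```python
# GDEF glyph classes (values defined by the OpenType GDEF table).
BASE, LIGATURE, MARK, COMPONENT = 1, 2, 3, 4

# LookupFlag bits (values defined by the OpenType spec).
IGNORE_BASE_GLYPHS = 0x0002
IGNORE_LIGATURES = 0x0004
IGNORE_MARKS = 0x0008


def is_skipped(glyph_class, lookup_flag):
    """Return True if the lookup flag tells the client to skip this class."""
    if glyph_class == BASE:
        return bool(lookup_flag & IGNORE_BASE_GLYPHS)
    if glyph_class == LIGATURE:
        return bool(lookup_flag & IGNORE_LIGATURES)
    if glyph_class == MARK:
        return bool(lookup_flag & IGNORE_MARKS)
    return False


def find_base_flag_reading(classes, mark_index, lookup_flag):
    """Reading 1 (hypothetical): skip glyphs per the lookup flag; the first
    glyph that is not skipped must itself be a base, else nothing attaches."""
    i = mark_index - 1
    while i >= 0 and is_skipped(classes[i], lookup_flag):
        i -= 1
    if i >= 0 and classes[i] == BASE:
        return i
    return None


def find_base_observed(classes, mark_index):
    """Reading 2 (behaviour described in this issue): skip marks only and
    treat whatever glyph precedes them as the base, whatever its class."""
    i = mark_index - 1
    while i >= 0 and classes[i] == MARK:
        i -= 1
    return i if i >= 0 else None
```

For the class sequence `[BASE, MARK, MARK]` and the mark at index 2, the first reading attaches to index 0 only when `IGNORE_MARKS` is set in the lookup flag, while the second reading attaches to index 0 regardless of the flag.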

Mark-to-Ligature: The spec says that to position a combining mark using a MarkToLigature attachment subtable, the text-processing client must work backward from the mark to the preceding ligature glyph.

Again, the above statement implies that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a ligature glyph. But the implementations ignore only mark glyphs and assume the previous glyph to be a ligature glyph.

Mark-to-Mark: The spec says that the mark2 glyph that combines with a mark1 glyph is the glyph preceding the mark1 glyph in glyph string order (skipping glyphs according to LookupFlags).

But the implementations totally disregard the lookup flag and use the immediate previous glyph for attachment. They also make sure that both glyphs belong to the same component of a ligature even though it is not specified anywhere.
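The mark-to-mark behaviour described above could be sketched roughly as follows. This is an illustration of the description in this comment only, not a transcription of any implementation. The `Glyph` type and its `lig_id` and `lig_comp` fields are hypothetical names for whatever bookkeeping a shaper keeps about ligature components.

```python
from dataclasses import dataclass

MARK = 3  # GDEF glyph class value for marks


@dataclass
class Glyph:
    glyph_class: int
    lig_id: int = 0    # hypothetical: id of the ligature this mark belongs to (0 = none)
    lig_comp: int = 0  # hypothetical: component index within that ligature


def find_mark2_observed(glyphs, mark1_index):
    """Return the index of the mark2 glyph for the mark at mark1_index,
    following the behaviour described in this issue, or None."""
    j = mark1_index - 1
    if j < 0:
        return None
    prev, cur = glyphs[j], glyphs[mark1_index]
    if prev.glyph_class != MARK:
        return None  # the immediately preceding glyph is not a mark
    # Extra check that the spec text does not mention: both marks must sit
    # on the same ligature component.
    if (prev.lig_id, prev.lig_comp) != (cur.lig_id, cur.lig_comp):
        return None
    return j
```

Note that the lookup flag does not appear anywhere in this sketch, which is exactly the discrepancy with the spec wording quoted above.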

So there are a lot of ambiguities, and the implementations tend to favor the expectations of fonts. I think the spec should be revised to close this gap in understanding once and for all.



PeterCon commented 4 years ago

@mta452 It's unfortunate it's taken almost two years before someone acted on this; I hope you can recall the details that were in your mind to provide clarification.

Wrt Mark-to-Base:

It seems that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a base glyph. But the implementations do not respect the lookup flag, ignore only mark glyphs, and assume the previous glyph to be a base glyph.

Is the ambiguity specifically in regard to the ignoreBaseGlyphs flag only? Or for this case is there a question about other flags as well?

Wrt Mark-to-ligature:

Again, the above statement implies that the text-processing client should respect the lookup flag, and use the previous glyph for attachment if it is a ligature glyph. But the implementations ignore only mark glyphs and assume the previous glyph to be a ligature glyph.

Similar question: Is the ambiguity specifically in regard to the ignoreLigatures flag?

Wrt Mark-to-Mark:

But the implementations totally disregard the lookup flag and use the immediate previous glyph for attachment.

Is the ambiguity specifically in regard to the ignoreMarks flag?

They also make sure that both glyphs belong to the same component of a ligature even though it is not specified anywhere.

It's not clear to me what you're saying here. Could you please clarify?

PeterCon commented 4 years ago

@mta452 : I'm wondering if you're monitoring and could provide clarification on this. Thanks.

mta452 commented 4 years ago

Yeah, the question is about the lookupFlag: whether it should be respected when finding the previous glyph. While writing SheenFigure, I had to implement the following rules to get correct font behaviour.

PeterCon commented 4 years ago

@mta452

Thanks for clarifying. Can you please look at #407, which requested clarification of lookup flag behaviour, and at the proposed revision, to see whether it provides sufficient clarification for this issue? In particular, the revision states that the lookup flag does not filter the current glyph when matching an input sequence.

mta452 commented 4 years ago

#407 addresses the issues of context lookups. This one is related to finding the previous glyph in GPOS lookup types 4, 5 and 6.

PeterCon commented 4 years ago

@mta452 : But look at the proposed revisions shown in discussion of #407, which clarified details related to this issue:

As noted above, lookups are processed for each glyph in the glyph sequence for a string. Each lookup type specifies a glyph pattern to be matched: single glyphs, or sequences of glyphs, depending upon the lookup type. The current glyph in the lookup processing loop is always matched against the first glyph in a lookup’s input glyph sequence pattern. Lookup flags affect pattern matching for other glyphs in the sequence but not the current glyph.
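One possible reading of the quoted revision, sketched in Python: the current glyph is compared with the first pattern element without consulting the lookup flag, and the flag is applied only while advancing to later glyphs. The function name `match_input` and the parallel-list data model are assumptions for illustration; only the class and flag values come from the OpenType spec.

```python
BASE, LIGATURE, MARK = 1, 2, 3  # GDEF glyph class values
IGNORE_BASE_GLYPHS, IGNORE_LIGATURES, IGNORE_MARKS = 0x0002, 0x0004, 0x0008
_FLAG_FOR_CLASS = {
    BASE: IGNORE_BASE_GLYPHS,
    LIGATURE: IGNORE_LIGATURES,
    MARK: IGNORE_MARKS,
}


def match_input(glyph_ids, classes, start, pattern, lookup_flag):
    """Match `pattern` (a list of glyph IDs) beginning at index `start`.

    Returns the list of matched indices, or None if there is no match.
    """
    # The current glyph is never filtered by the lookup flag.
    if glyph_ids[start] != pattern[0]:
        return None
    matched = [start]
    i = start + 1
    for wanted in pattern[1:]:
        # Later glyphs: skip any glyph whose class the flag says to ignore.
        while i < len(glyph_ids) and lookup_flag & _FLAG_FOR_CLASS.get(classes[i], 0):
            i += 1
        if i >= len(glyph_ids) or glyph_ids[i] != wanted:
            return None
        matched.append(i)
        i += 1
    return matched
```

With glyph IDs `[50, 51, 60]`, classes `[MARK, MARK, BASE]` and pattern `[50, 60]`, the match succeeds under `IGNORE_MARKS` even though the current glyph (ID 50) is itself a mark, because only the intervening mark (ID 51) is skipped.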

mta452 commented 4 years ago

#407 is related to finding the next glyph based on the lookup flag. This one is related to backward matching. Perhaps some font developers can better share their findings.

PeterCon commented 4 years ago

For mark-to-X lookups, there is no backward matching.

PeterCon commented 4 years ago

You start with the current glyph and search forward only.

mta452 commented 4 years ago

We search forward for the mark glyph. Once it is found, we look backward for the base glyph that would be attached with it.
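As a rough sketch of that two-step process (forward to find each mark, then backward to find its attachment target), assuming the "skip marks only" rule described earlier in this thread. The function name is hypothetical, and a real lookup would also consult the subtable's mark and base coverage tables, which are omitted here.

```python
BASE, LIGATURE, MARK = 1, 2, 3  # GDEF glyph class values


def pair_marks_with_bases(classes):
    """Return (mark_index, base_index) pairs for a sequence of glyph classes."""
    pairs = []
    for i, cls in enumerate(classes):  # forward pass over the glyph string
        if cls != MARK:
            continue
        j = i - 1
        while j >= 0 and classes[j] == MARK:  # backward search, skipping marks
            j -= 1
        if j >= 0:
            pairs.append((i, j))
    return pairs
```

For the class sequence `[BASE, MARK, MARK, BASE, MARK]`, both marks at indices 1 and 2 pair with the base at index 0, and the mark at index 4 pairs with the base at index 3.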

PeterCon commented 4 years ago

I'm pretty sure that's not the design intent for these lookup types, in which case you are creating an implementation that could lead to non-interoperability.

mta452 commented 4 years ago

Similar rules are applied in HarfBuzz. Attaching links to relevant code lines.

mta452 commented 4 years ago

Similar rules are applied in the .NET Framework: https://referencesource.microsoft.com/#PresentationCore/Core/CSharp/MS/Internal/Shaping/Positioning.cs