WICG / canvas-formatted-text

N:M mappings and other thoughts #31

Open travisleithead opened 2 years ago

travisleithead commented 2 years ago

(This is internal Microsoft feedback from a review of the spec by resident expert @petercon. Re-posting here with permission.)


"Graphemes are character combinations… that result in a single glyph and hence should not be broken up."

It’s not true that graphemes always result in a single glyph. Very often they don’t, and for some scripts (e.g., Indic) they usually don’t.

I think the real significance of grapheme clusters, rather, is that they are minimal units for selection, and maybe also for editing. In particular, if you wanted to select a sub-string within a cluster, you would encounter problems with how to display the selection, and need to start supporting split carets.

(Instead of a single caret line, a split caret has two half carets: one to show what the current insertion point precedes, and another to show what it follows. Back in the 1990s, some apps using Apple’s QuickDraw GX APIs implemented split carets, but they have usability problems and didn’t survive.)
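To make the "minimal unit for selection" point concrete, here is a minimal sketch of snapping a selection outward to grapheme cluster boundaries. The helper name `snapSelection` and the precomputed `boundaries` array are assumptions for illustration; in a browser the boundaries could come from something like `Intl.Segmenter` with grapheme granularity, but nothing here is part of the proposed API.

```typescript
// Hypothetical helper, for illustration only.
// `boundaries` holds the sorted UTF-16 offsets of grapheme cluster
// boundaries, including 0 and the text length.
function snapSelection(
  boundaries: number[],
  start: number,
  end: number
): [number, number] {
  let snappedStart = 0;
  let snappedEnd = boundaries.length > 0 ? boundaries[boundaries.length - 1] : 0;
  // Move the start back to the nearest boundary at or before it.
  for (let i = 0; i < boundaries.length; i++) {
    if (boundaries[i] <= start) snappedStart = boundaries[i];
  }
  // Move the end forward to the nearest boundary at or after it.
  for (let i = boundaries.length - 1; i >= 0; i--) {
    if (boundaries[i] >= end) snappedEnd = boundaries[i];
  }
  return [snappedStart, snappedEnd];
}

// "e" + U+0301 (combining acute) + "x": clusters cover [0, 2) and [2, 3).
// A selection that starts between "e" and the accent widens to the whole cluster.
const snapped = snapSelection([0, 2, 3], 1, 2);
console.log(snapped);
```

Snapping outward means a selection never ends inside a cluster, which sidesteps the split-caret problem described above.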

Moreover, sometimes multiple grapheme clusters might get displayed as a single glyph. That’s not uncommon for Arabic, for example. OpenType Layout tables support font data that can inform (with certain limitations) where a single ligature glyph can be divided into separate grapheme clusters—so that the app can know where within a ligature it can display a caret.
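As a rough sketch of how an app might place carets inside a ligature: use caret offsets from the font when it supplies them, and otherwise divide the glyph's advance evenly among its components. The function name, its parameters, and the even-division fallback are assumptions for illustration; the comment above only says that the font data can provide this information.

```typescript
// Hypothetical helper, for illustration only.
// Returns the x offsets (in font units, from the glyph origin) of the
// carets that fall strictly inside a ligature glyph.
function ligatureCaretOffsets(
  advance: number,
  componentCount: number,
  fontCarets?: number[]
): number[] {
  const interior = componentCount - 1;
  // Prefer the font's own caret data when it covers every interior position.
  if (fontCarets !== undefined && fontCarets.length === interior) {
    return fontCarets.slice().sort((a, b) => a - b);
  }
  // Assumed fallback: split the advance evenly among the components.
  const offsets: number[] = [];
  for (let k = 1; k <= interior; k++) {
    offsets.push((advance * k) / componentCount);
  }
  return offsets;
}

// A three-component ligature, 900 units wide, with no font caret data.
const fallbackCarets = ligatureCaretOffsets(900, 3);
console.log(fallbackCarets);
```

The even split is only an approximation; for scripts such as Arabic, where ligature components vary widely in width, font-supplied positions will generally look better.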

There was some related discussion in your recent WICG meeting:

Julia: need to consider early in the design process that one character can produce multiple glyphs… Travis: trying to avoid M:N mapping between characters and glyphs… …

You really can’t avoid the M:N mapping. In Uniscribe (the shaping API used alongside GDI), the ScriptShape function has an out parameter, pwLogClust, that returns a logical cluster array: for each character, the index of the corresponding glyph in the resulting glyph sequence. In DWrite, IDWriteTextAnalyzer::GetGlyphs similarly returns a character-to-glyph cluster map. IDWriteTextLayout is higher level: rather than exposing the mapping directly, it provides hit-testing methods such as HitTestTextPosition.
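A small sketch of what such a cluster map encodes may help. The `Cluster` type and `clustersFromMap` function below are hypothetical and only mirror the general shape of a character-to-glyph cluster array; they are not the Windows APIs themselves. Left-to-right text is assumed, so the map values never decrease.

```typescript
// Hypothetical types and names, for illustration only.
interface Cluster {
  charStart: number;  // first character index in the cluster
  charEnd: number;    // exclusive
  glyphStart: number; // first glyph index in the cluster
  glyphEnd: number;   // exclusive
}

// clusterMap[i] is the index of the first glyph of the cluster that
// contains character i. Consecutive characters with the same value
// belong to the same cluster.
function clustersFromMap(clusterMap: number[], glyphCount: number): Cluster[] {
  const clusters: Cluster[] = [];
  let charStart = 0;
  for (let i = 1; i <= clusterMap.length; i++) {
    const atEnd = i === clusterMap.length;
    if (atEnd || clusterMap[i] !== clusterMap[charStart]) {
      clusters.push({
        charStart,
        charEnd: i,
        glyphStart: clusterMap[charStart],
        glyphEnd: atEnd ? glyphCount : clusterMap[i],
      });
      charStart = i;
    }
  }
  return clusters;
}

// Assumed shaping result for the characters "f", "i", "é":
// glyph 0 is an "fi" ligature (2 characters -> 1 glyph), and glyphs 1-2
// are a base letter plus a separate accent (1 character -> 2 glyphs).
const exampleClusters = clustersFromMap([0, 0, 1], 3);
console.log(exampleClusters);
```

The two clusters here are 2:1 and 1:2, which is the M:N relationship in miniature: neither characters nor glyphs alone are enough to index the result.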

fserb: need to know the reason for the n-to-n mapping… yjbanov_: Folks may want to render the accent marks with different fonts!

I’ve never heard that as a requirement, and it’s inherently problematic: one font doesn’t have information about how to position its glyphs relative to glyphs from another font.

fserb: there are languages in which the n-to-n mapping doesn't exist.

I’m not sure what was meant, or what it could possibly mean.

yjbanov commented 2 years ago

I think the quote about rendering accent marks with different fonts might have been misattributed to me. If I implied that, then I'm sorry, and I'm happily withdrawing this requirement!

I think it's reasonable to give the text engine the freedom to do M:N mapping for maximum accuracy. The API should simply be clear about how that relates to the original runs of text that the developer pushed. I think we could use a simple set of rules:

This is simple enough to remember. It could lead to programming errors, such as the word "offline" rendering "ff" as a ligature or as two separate glyphs depending on whether the developer pushed "offline" as a single run, or pushed "of" and "fline" separately. However, this has an easy workaround: if you want the engine to control the merging and splitting of characters, simply combine the characters into one string before calling text.textruns.push and let the engine handle it for you.
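The "offline" case can be sketched with a toy shaper that forms an "ff" ligature only inside a single run and never merges across runs. Everything here (`shapeRun`, `shapeRuns`, the `"ff_ligature"` glyph name) is made up for illustration and does not reflect how a real text engine or the proposed API works.

```typescript
// Toy shaper, for illustration only: replaces "ff" with one ligature
// glyph, and maps every other character to a glyph of its own.
function shapeRun(run: string): string[] {
  const glyphs: string[] = [];
  let i = 0;
  while (i < run.length) {
    if (run.slice(i, i + 2) === "ff") {
      glyphs.push("ff_ligature");
      i += 2;
    } else {
      glyphs.push(run.charAt(i));
      i += 1;
    }
  }
  return glyphs;
}

// Each run is shaped independently, so no ligature can span a run boundary.
function shapeRuns(runs: string[]): string[] {
  const glyphs: string[] = [];
  for (const run of runs) {
    for (const glyph of shapeRun(run)) {
      glyphs.push(glyph);
    }
  }
  return glyphs;
}

const oneRun = shapeRuns(["offline"]);      // "ff" becomes one ligature glyph
const twoRuns = shapeRuns(["of", "fline"]); // the two "f"s stay separate
console.log(oneRun.length, twoRuns.length);
```

Pushing the word as one run yields six glyphs and pushing it as two runs yields seven, which is the difference the developer would see on screen.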

A second reason for not merging text runs is that the developer may intend to render characters using different styles (say, different colors). The text engine may not have enough information to know the intent here. When using CSS, such intent may be communicated by supplying different CSS styles to different runs. However, when using WebGL, the non-shaping styling information may be completely separate.