[font-metrics-api] Revised proposal of font metrics for each character

kojiishi commented 5 years ago

One piece of feedback on the previous proposal was about use cases. This proposal tries to address use cases such as finding the caret positions of characters, or drawing backgrounds, text decoration effects, or selection highlights on the text.

This proposal adds characters, an array of metrics for each grapheme cluster in logical order. This avoids exposing the details of shaping, which are sometimes complicated and may vary by platform, as the feedback indicated.

interface TextMetrics {
  readonly attribute FrozenArray<TextMetricUnit> characters;
};

interface TextMetricUnit {
  readonly attribute long codeUnitIndex;
  readonly attribute double position;
  readonly attribute double advance;
  readonly attribute boolean isRightToLeft;
};

The codeUnitIndex attribute provides the index of the first code unit of the base character.

When a ligature spans multiple grapheme clusters, the UA may produce one TextMetricUnit for a ligature, or compute metrics for each grapheme cluster in the ligature, either from information in the font or by synthesizing them.

The UA may tailor UAX #29 if needed for caret positioning.

This interface supports RTL by adding the isRightToLeft attribute, which represents the resolved direction of the grapheme cluster, and the position attribute, which represents the start-side position of the grapheme cluster from the origin. With position and advance, this interface can represent the ambiguous caret positioning at BiDi boundaries. The position attribute may also help represent glyph rearrangement during shaping in some scripts, if it occurs across grapheme cluster boundaries.
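
To make the background and selection use case concrete, here is a minimal sketch of how an author might turn characters into highlight spans. It is only an illustration under stated assumptions: position is taken to be the start-side edge, advance is taken to be non-negative, and `selectionSpans`, `Span`, and the sample metrics are hypothetical names and made-up numbers, not part of the proposal.

```typescript
// Mirrors the proposed TextMetricUnit; this is not a shipped API.
interface TextMetricUnit {
  codeUnitIndex: number;
  position: number;       // assumed: start-side edge, measured from the origin
  advance: number;        // assumed: non-negative
  isRightToLeft: boolean;
}

interface Span { left: number; right: number; }

// Hypothetical helper: horizontal spans covering the clusters whose
// codeUnitIndex falls in [start, end). Touching spans are merged.
function selectionSpans(units: readonly TextMetricUnit[], start: number, end: number): Span[] {
  const spans = units
    .filter(u => u.codeUnitIndex >= start && u.codeUnitIndex < end)
    // An RTL cluster extends to the left of its start-side edge.
    .map(u => u.isRightToLeft
      ? { left: u.position - u.advance, right: u.position }
      : { left: u.position, right: u.position + u.advance })
    .sort((a, b) => a.left - b.left);

  const merged: Span[] = [];
  for (const s of spans) {
    const last = merged[merged.length - 1];
    if (last && s.left <= last.right) {
      last.right = Math.max(last.right, s.right);
    } else {
      merged.push({ left: s.left, right: s.right });
    }
  }
  return merged;
}

// Made-up metrics: two LTR clusters followed (in logical order) by two RTL
// clusters, each 10 units wide, in an LTR line.
const sample: TextMetricUnit[] = [
  { codeUnitIndex: 0, position: 0, advance: 10, isRightToLeft: false },
  { codeUnitIndex: 1, position: 10, advance: 10, isRightToLeft: false },
  { codeUnitIndex: 2, position: 40, advance: 10, isRightToLeft: true },
  { codeUnitIndex: 3, position: 30, advance: 10, isRightToLeft: true },
];
```

With this sample, `selectionSpans(sample, 1, 3)` yields two separate spans because the selected range crosses the BiDi boundary, while `selectionSpans(sample, 0, 4)` merges into a single span.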

I think this new proposal covers the feedback at whatwg/html#4026 and whatwg/html#3994, and applies to both the Font Metrics API and the Canvas Text Metrics API.

Appreciate feedback in advance: @litherum @FremyCompany @dbaron @jfkthame @r12a @annevk @fserb @domenic @eaenet @chrishtr @drott (Can't seem to mention @whatwg/canvas, anyone know how?)

litherum commented 5 years ago

If this is intended to provide caret positions, we should explicitly say so. This is because iOS and macOS explicitly have “caret position” segmenters as distinct from “character” segmenters. https://trac.webkit.org/browser/webkit/trunk/Source/WTF/wtf/spi/cf/CFStringSPI.h#L41

litherum commented 5 years ago

Since the clusters will be in visual order, we should determine if it’s in the direction of the base direction or if it’s always LTR. (Internally, WebKit always uses LTR, and if the base direction is RTL we do some processing to flip it around so our internal visual-order data structures are always LTR.) I’m not arguing for one or the other; just that we need to specify which way it is.

litherum commented 5 years ago

> UA may produce one TextMetricUnit for a ligature

If it’s used for caret positions, this is probably wrong.

litherum commented 5 years ago

How is this API associated with a font and string?

kojiishi commented 5 years ago

One piece of feedback I got offline: for RTL, position and advance can be either of the following:

position is the line-left side, and advance is positive. Their sum gives the line-right side.

position is the inline-start side, and advance is negative. Their sum gives the inline-end side.

Other combinations are also possible, but probably only one of these two is reasonable.
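
The two conventions above describe the same box, so converting between them is mechanical. The sketch below is only meant to show that equivalence; `PositionedAdvance`, `toInlineStart`, and `toLineLeft` are hypothetical names, and neither convention has been adopted.

```typescript
// Hypothetical shape shared by both conventions.
interface PositionedAdvance { position: number; advance: number; }

// Line-left convention (advance >= 0) to inline-start convention
// (advance < 0 for RTL clusters).
function toInlineStart(u: PositionedAdvance, isRightToLeft: boolean): PositionedAdvance {
  return isRightToLeft
    ? { position: u.position + u.advance, advance: -u.advance }
    : { position: u.position, advance: u.advance };
}

// Inline-start convention back to line-left convention.
function toLineLeft(u: PositionedAdvance): PositionedAdvance {
  return u.advance < 0
    ? { position: u.position + u.advance, advance: -u.advance }
    : { position: u.position, advance: u.advance };
}

// An RTL cluster occupying x = 20..30, in each convention.
const lineLeftForm: PositionedAdvance = { position: 20, advance: 10 };
const inlineStartForm = toInlineStart(lineLeftForm, true); // position 30, advance -10
```

In both forms, position + advance gives the opposite edge of the cluster, which is why either choice can work as long as the spec states which one it is.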

FremyCompany commented 5 years ago

At first I thought codeUnitIndexes should be a plural and an array (or a codeUnitLength property) and that UAs should not synthesize the individual code points because it's not possible for the API consumer to know that it is not safe to reuse. Though now I think about it, the same holds true if kerning is factored in, so you probably can't reuse anyway. Thoughts?

Also, I think this API needs to return first a runs array consisting of arrays of text runs of same font and directionality, otherwise how do you represent the position/advance of a glyph that is preceded by glyphs of another directionality?

FremyCompany commented 5 years ago

Okay, given we switched to visual order, my previous comment about runs doesn't matter anymore. We also agreed to represent lengths for the codeUnitIndex values, so I think that was addressed today as well.

There is one more thing, though: it's not clear when font fallback happens inside the run, and ideally it would be nice to know that.

css-meeting-bot commented 5 years ago

The Houdini Task Force just discussed FontMetrics.

The full IRC log of that discussion

<TabAtkins> Topic: FontMetrics
<TabAtkins> koji: This is about issue 828
<skk> https://github.com/w3c/css-houdini-drafts/issues/828
<emilio> GitHub: https://github.com/w3c/css-houdini-drafts/issues/828
<Rossen> github: https://github.com/w3c/css-houdini-drafts/issues/828
<TabAtkins> koji: Request from authors that want character advance information for each character of a string
<TabAtkins> koji: In canvas API we once tried it, but it had lots of feedback, so this is a revised proposal.
<TabAtkins> koji: use-case is an author with a string, they want to know caret position for drawing between each pair of characters
<TabAtkins> koji: Or decorations on specific characters
<TabAtkins> koji: This revised proposal has a .characters FrozenArray of TextMetricUnit, a new interface we're defining.
<TabAtkins> koji: Each TextMetricUnit has metrics for one grapheme cluster
<TabAtkins> koji: And has an index into the original string, by code unit
<TabAtkins> koji: Has advance, and a boolean indicating rtl vs ltr
<TabAtkins> koji: We've gotten some feedback on it already.
<TabAtkins> koji: From Myles:
<TabAtkins> myles_: When I read this I thought it was about caret positions, not grapheme clusters.
<TabAtkins> myles_: So is it one entry per grapheme cluster, or one per caret position?
<TabAtkins> koji: Ambiguous. Authors I talked to didn't understand the differences.
<TabAtkins> koji: You're right there's some subtle differences between those.
<TabAtkins> koji: As far as I understand the request, they often want the character, to draw decorations.
<TabAtkins> koji: Most webapps handle grapheme cluster as the minimum unit to apply stuff to.
<TabAtkins> myles_: It should be noted that there's a jquery plugin to get caret positions.
<TabAtkins> myles_: It inserts spans into the content and then gets client rects.
<TabAtkins> koji: I think in the long run, authors might want different distinctions for those two. But for the initial level, start with caret position.
<TabAtkins> myles_: So why not specify how ligatures work?
<TabAtkins> myles_: You say "the UA *may* produce one cluster for a ligature"
<TabAtkins> koji: Good point.
<TabAtkins> koji: Intention was, as far as we could tell, impls do slightly different things with caret positions there.
<TabAtkins> koji: There was feedback from someone else preferring us to say that it should match UA behavior.
<TabAtkins> myles_: I think that's reasonable.
<TabAtkins> myles_: If intention is to draw a background on a string, and it has an ﬃ
<TabAtkins> myles_: And if they want to draw it just behind the f...
<TabAtkins> dbaron: It seems like every UA has some behavior for carets in the middle of a ligature.
<TabAtkins> dbaron: I hope there's no UAs that totally put it off on one side.
<TabAtkins> dbaron: But even if the behavior is different, it still seems we could expose what UAs do in that case.
<TabAtkins> dbaron: So you'd have a more interoperable behavior for the number of *entries* the dev would see in the array.
<TabAtkins> myles_: Right. We shouldn't *fully* specify because in some situations we divide ligature evenly by number of grapheme clusters, but that's not great. We do have a native API to give us correct boundaries, we're just not using it. I'd like the flexibility to get that.
<TabAtkins> koji: Right, like I said earlier, the return value should match what the UA does.
<TabAtkins> dbaron: Right. If you do "fix", I don't want some UAs to return two entries, and others to return three, just because some don't provide inter-ligature information.
<TabAtkins> fremy: If you have emojis that are composed of multiple chars, this API then doesn't work.
<TabAtkins> myles_: If your string is "e<family-emoji>", the result is two entries. First is the letter "e", second is the multi-char family emoji.
<TabAtkins> dbaron: And same for e + combining-acute-accent. Those aren't treated like ligatures.
<TabAtkins> myles_: The number of entries in the result is not font-dependent, is what's important here.
<TabAtkins> TabAtkins: What about regional-indicators? (flags)
<TabAtkins> myles_: If your font doesn't support flags, you get one grapheme cluster, it just looks like a pair of characters.
<koji> The next feedback to discuss is "Since the clusters will be in visual order, we should determine if it’s in the direction of the base direction or if it’s always LTR. (Internally, WebKit always uses LTR, and if the base direction is RTL we do some processing to flip it around so our internal visual-order data structures are always LTR.) I’m not arguing for one or the other; just that we need to specify which way it is."
<TabAtkins> myles_: This should be visual order, right?
<TabAtkins> koji: Request is to determine advance of source string.
<TabAtkins> koji: So in my proposal the char is in logical order, not visual.
<TabAtkins> myles_: I thought you said a use-case was putting a background behind part of the string, how do you do that if it's in logical order?
<TabAtkins> koji: By making chars in logical order, author can determine where each character is, then the author can process it themselves.
<TabAtkins> myles_: Does that mean at a fragment boundary you could get a really big negative advance?
<TabAtkins> fremy: This was part of my feedback as well.
<TabAtkins> fremy: Is the advance negative in that case?
<TabAtkins> fremy: If you want logical order, this needs to break across bidi, or use visual order.
<TabAtkins> myles_: The JS i18n APIs would probably be interested in adding some APIs for this if you really want it in logical order. But I think it would be best in visual order.
<TabAtkins> koji: This interface has ltr vs rtl, so author can control this somewhat themselves.
<heycam> q+ once these current issues are finished
<heycam> q+ to say something once these current issues are finished
<TabAtkins> TabAtkins: You need to know how many chars you're formatting to do visual order, right?
<TabAtkins> myles_: Right, you can only reasonably call this *after* line-breaking.
<TabAtkins> koji: So the consensus is to use visual order, and add number of code units for each TextMetric unit.
<TabAtkins> myles_: You may want both "codeUnitIndex" *and* "lengthOfCluster", since it's in visual order.
<heycam> with https://drafts.css-houdini.org/font-metrics-api/#measure-api
<TabAtkins> myles_: Most important question I have is how you associate this call with a font.
<TabAtkins> fremy: There is the measureText function in the canvas api
<TabAtkins> myles_: Is this a new thing, or a replacement?
<TabAtkins> koji: We want to sync this with the canvas api.
<TabAtkins> koji: We'll port this to canvas api once we agree on it.
<TabAtkins> heycam: So the FontMetrics API spec has a new, separate measureText function.
<TabAtkins> koji: Proposal is to add .characters to both FontMetrics and Canvas API.
<TabAtkins> fremy: So this is a mixin that will be used in both interfaces?
<TabAtkins> koji: Yes.
<Rossen> q?
<TabAtkins> myles_: Next feedback - unsure if this makes sense to run on an arbitrary element, since arbitrary elems can have children?
<TabAtkins> heycam: That's my question, yeah - what does the index count into? What about display:none? etc
<Rossen> ack heycam
<Zakim> heycam, you wanted to say something once these current issues are finished
<TabAtkins> heycam: I think there are similar index issues with the string API. You have whitespace collapsing/trimming. Need precise definition of what indexes are used.
<TabAtkins> koji: I understand that part isn't defined here. If we applied this to element.measureText(), we have to define that.
<TabAtkins> koji: Currently the proposal only covers measuring a string.
<TabAtkins> koji: I'll work on a proposal to define the element case.
<krit> q+
<TabAtkins> heycam: In SVG we have a silly character-positioning API, and it's pretty annoying.
<TabAtkins> heycam: If there was a way to avoid all that and just stick to strings, that would be nice.
<TabAtkins> myles_: So I'd like to propose removing the measureElement function. Just keep it to strings for now.
<TabAtkins> myles_: There's more complications, like letter-spacing and such.
<TabAtkins> heycam: And text-transform - one character suddenly becomes two grapheme clusters, etc.
<TabAtkins> myles_: Another way to do it is not take StyleMap, but just a small set of properties you want to handle, like font-family and font-weight. That's what the canvas api does.
<TabAtkins> myles_: You can't specify font-variation, etc.
<TabAtkins> heycam: Ultimately it depends on the use-case.
<TabAtkins> heycam: If they want to measure stuff in the DOM, but they can't measure everything, maybe not useful.
<TabAtkins> eae: Majority of use-cases we've observed are for out-of-dom measurements.
<TabAtkins> Rossen: So many things we could resolve on, lot of feedback.
<TabAtkins> Rossen: I see a request to remove measureElement().
<TabAtkins> Rossen: We need to change order to visual.
<TabAtkins> Rossen: Add .lengthOfCluster
<TabAtkins> heycam: Define how to handle whitespace collapsing, text-transform, etc., which cause difficult mappings between characters and clusters.
<TabAtkins> myles_: And change how ligatures and metrics interact.
<TabAtkins> krit: SVGWG is also looking at this problem for the counting part. At the moment SVG 1.1 says we should use Unicode code points; that's not very consistent. In our investigation we found grapheme clusters aren't well-specified.
<TabAtkins> krit: There might be bigger issues.
<TabAtkins> myles_: Unicode *tries* to specify what a grapheme cluster is. If that's insufficient, we have larger problems.
<TabAtkins> krit: We want there to be alignment between fontmetrics and SVG glyph counting.
<TabAtkins> myles_: We've been talking about a number of things that need to change, but in general this is a good direction to go.
<TabAtkins> krit: Agree, very useful.
<krit> ack krit
<TabAtkins> Rossen: So please take feedback and reflect it into the proposal, we can discuss it over the issue in the future.
<TabAtkins> myles_: One more -
<TabAtkins> myles_: It seems totally reasonable for an author to want to use this api for things like caret positions as well as grapheme clusters.
<TabAtkins> myles_: I imagine this'll be extended to other segmenters in the future, so keep that in mind.
<TabAtkins> koji: Yeah, looking for opinions on that.
<TabAtkins> koji: Currently the proposal is to add an attribute, and if you want to add different segmenters, maybe make it a function?
<TabAtkins> koji: Or add other attributes that segment differently.
<TabAtkins> myles_: Will need time to think about it.
<iank_> ScribeNick: iank_
<myles_> koji: I really like how this doesn’t expose the concept of a glyph
<iank_> glazou: Thanks for greg for starting document.
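
The point made in the log that the number of entries depends on grapheme clusters rather than on the font can be sketched with `Intl.Segmenter`, the shipped form of the segmenter proposal. This is only an approximation under assumptions: a UA may tailor UAX #29, so its entries could differ, and `clusterRanges`, `ClusterRange`, and the property names (borrowed from the "codeUnitIndex" and "lengthOfCluster" suggestions in the discussion) are hypothetical.

```typescript
// Hypothetical record: where a grapheme cluster starts and how many
// UTF-16 code units it covers.
interface ClusterRange { codeUnitIndex: number; lengthOfCluster: number; }

function clusterRanges(text: string): ClusterRange[] {
  // The `any` casts keep this sketch compiling on TypeScript lib targets
  // that predate the Intl.Segmenter typings.
  const segments: any = new (Intl as any)
    .Segmenter(undefined, { granularity: "grapheme" })
    .segment(text);

  const ranges: ClusterRange[] = [];
  let index = 0;
  while (index < text.length) {
    // containing() returns the segment that covers the given code unit index.
    const seg = segments.containing(index);
    ranges.push({ codeUnitIndex: seg.index, lengthOfCluster: seg.segment.length });
    index = seg.index + seg.segment.length;
  }
  return ranges;
}

// "e" followed by a family emoji (man, ZWJ, woman, ZWJ, girl).
const familyRanges = clusterRanges("e\uD83D\uDC68\u200D\uD83D\uDC69\u200D\uD83D\uDC67");
```

Here `familyRanges` has two entries, one for the letter and one for the whole emoji sequence, regardless of whether any font can render the emoji as a single glyph.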

litherum commented 5 years ago

> Last but not least, it's not clear when font fallback happens inside the run, and ideally it would be nice to know that.

Presumably this should match the UA.

Perhaps one way to spec this is “pretend you make an iframe with this string inside it, return data as-if you did that”

eaenet commented 5 years ago

> Perhaps one way to spec this is “pretend you make an iframe with this string inside it, return data as-if you did that”

I like that suggestion and agree that it's important that it matches the UA fallback handling.

jfkthame commented 5 years ago

> I think this API needs to return first a runs array consisting of arrays of text runs of same font and directionality

This sounds like a potential fingerprinting vector, making it easier for a page to probe details of the machine's font configuration.

kojiishi commented 5 years ago

@FremyCompany:

> ...safe to reuse. Though now I think about it, the same holds true if kerning is factored in, so you probably can't reuse anyway. Thoughts?

Yeah, reuse is not possible because of kerning, joining, and other such shaping effects. In Blink we internally cached metrics for each space-delimited word, but had problems with kerning between space characters and letters. Determining correct reusability is not an easy task, so for this API we assume authors call the API on their whole string without considering reuse.

> I think this API needs to return first a runs array consisting of arrays of text runs of same font and directionality

The idea of returning runs was raised by other people too, and I think it's nice and clean. But figuring out how to segment runs isn't easy. Directionality and font are good criteria, but one may also want to split at script boundaries, and more. The current proposal tries to avoid that discussion by returning a flat array with all such properties exposed (fonts are not exposed yet, but we probably want to add them in the future), so that authors can build runs if needed.
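
As a rough sketch of "authors can build runs if needed": given a flat array, consecutive entries that share a property can be grouped by the author. `groupIntoRuns` and `directionRuns` are hypothetical helpers, and since fonts are not exposed, direction is the only key available in this example.

```typescript
// Mirrors the proposed TextMetricUnit; this is not a shipped API.
interface TextMetricUnit {
  codeUnitIndex: number;
  position: number;
  advance: number;
  isRightToLeft: boolean;
}

// Hypothetical helper: split a flat array into runs of consecutive
// entries that share the same key.
function groupIntoRuns<T, K>(units: readonly T[], keyOf: (unit: T) => K): T[][] {
  const runs: T[][] = [];
  let previousKey: K | undefined;
  for (const unit of units) {
    const key = keyOf(unit);
    const current = runs[runs.length - 1];
    if (current && key === previousKey) {
      current.push(unit);
    } else {
      runs.push([unit]);
    }
    previousKey = key;
  }
  return runs;
}

// Runs of same directionality. An author wanting script or font runs would
// pass a different key function, once such properties are available.
function directionRuns(units: readonly TextMetricUnit[]): TextMetricUnit[][] {
  return groupIntoRuns(units, u => u.isRightToLeft);
}
```

Keeping the key function separate means the API does not have to decide which segmentation authors want.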

litherum commented 5 years ago

> > I think this API needs to return first a runs array consisting of arrays of text runs of same font and directionality
>
> This sounds like a potential fingerprinting vector, making it easier for a page to probe details of the machine's font configuration.

It’s already discoverable in JavaScript by creating <span>s with different contents and styling.

We solved this in WebKit by ignoring all user-installed fonts, making everyone* appear to have the same set of fonts installed.

It’s also worth linking to https://github.com/tc39/proposal-intl-segmenter. This proposal allows web developers to do their own line breaking.

* for some definition of “everyone”

litherum commented 5 years ago

During the F2F, we stated that the intended use case for this API is drawing a background behind a particular word in a line of text. However, if a developer wants to draw a background behind a word, they can just make two calls to measureText() to get the width of the entire string before the word, and the width of the word itself. There usually (always?) isn't complex shaping across word boundaries that would make this approach incorrect. You could also use this API to draw a background around an individual character inside a word, but it's hard to imagine that developers are clamoring to do that.
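
The two-call approach described above might look like the following sketch. It assumes LTR text starting at x = 0 and no shaping interaction across the word boundary; `wordBox`, `WordBox`, `MeasureWidth`, and the fixed-width stand-in measurer are hypothetical, introduced only so the example runs outside a browser.

```typescript
// Any function that returns the advance width of a string.
type MeasureWidth = (text: string) => number;

interface WordBox { left: number; width: number; }

// Two measurements: the text before the word, then the word itself.
function wordBox(line: string, wordStart: number, wordEnd: number, measure: MeasureWidth): WordBox {
  return {
    left: measure(line.slice(0, wordStart)),
    width: measure(line.slice(wordStart, wordEnd)),
  };
}

// Stand-in measurer: pretends every code unit is 8 units wide.
const fixedWidth: MeasureWidth = text => text.length * 8;

// Box for "brave" (code units 6..11) in the line below.
const braveBox = wordBox("hello brave world", 6, 11, fixedWidth);
```

In a page, the measurer could instead be `text => ctx.measureText(text).width`, where `ctx` is a `CanvasRenderingContext2D` whose `font` has already been set.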

I can't think of any other use cases that would be satisfied by this proposal. You can't draw a blinking caret, since this API measures grapheme clusters, not caret positions. You can't paint individual glyphs at specific positions. The call doesn't return information about font fallback or about how the UBA rearranged your string.

I also don't think that it would be a good idea for the Web Platform to move in the direction of exposing tons of typographic information. The best way to perform text layout is to use HTML elements and CSS. An author trying to do it themselves with JavaScript would almost certainly end up with something slower, less correct, and less accessible than doing it with the browser's engine.

So, I'm sympathetic to solving a specific use case that developers are asking for, but I'm less sympathetic about this particular use case (because it can already be solved with existing APIs), and I'm even less sympathetic about the general direction of helping developers implement their own paragraph layout in script.

fserb commented 5 years ago

I've added some questions related to a Canvas compatibility API here: https://github.com/w3c/css-houdini-drafts/issues/832
