Open schenney-chromium opened 2 weeks ago
I think `TextCluster` could be a dictionary, instead of a non-constructible but mutable class. Do you agree, or is there something I'm missing?
The `TextCluster` object must also store internally the context's styles like `font` and I guess all of the `CanvasTextDrawingStyles`, as they were when `measureText()` has been called, so I suppose an interface makes sense?
However it's unclear why it's mutable indeed, nor why the `text` is exposed and not just stored internally too. IIUC, it's the original text that was passed to `measureText`, so I suppose authors should already know it.
Also the `x` and `y` arguments to `fillTextCluster()` are a bit unclear to me. I guess they're the equivalent of `fillText`'s `x` and `y`, but in that case how come they can be optional?
It seems this intends to cover some of the same use cases as #10650.
cc @whatwg/canvas @khushalsagar
> I think `TextCluster` could be a dictionary, instead of a non-constructible but mutable class. Do you agree, or is there something I'm missing?
You're right. We originally thought of it as an interface since the underlying object has to save references to other objects and it felt more natural, but having it be a dictionary is more useful from the user's side. Plus, a standard dictionary would allow creating modified copies via the spread syntax. I'll update it.
> The `TextCluster` object must also store internally the context's styles like `font` and I guess all of the `CanvasTextDrawingStyles`, as they were when `measureText()` has been called, so I suppose an interface makes sense?
Indeed, that is the idea. In the current prototype in Chromium (CL is currently under review), the `TextCluster` object holds a reference to the font in order to replicate the text accurately even if the font set in the context has changed (which is also relevant if the font wasn't fully loaded when `measureText()` was called). We had originally thought of only the font but I agree that it's better to include all `CanvasTextDrawingStyles`.
> However it's unclear why it's mutable indeed, nor why the `text` is exposed and not just stored internally too. IIUC, it's the original text that was passed to `measureText`, so I suppose authors should already know it.
We think it can be useful to allow the creation of `TextClusters` directly from JS if desired. In that case it would be the author's responsibility to guarantee that no ligatures or glyphs are separated. For that to work the whole text is needed as context to enable the correct shaping of the ligatures or other context-dependent modifications that the font can define. For all other attributes, I agree that we should make them immutable. I will update that too.
> Also the `x` and `y` arguments to `fillTextCluster()` are a bit unclear to me. I guess they're the equivalent of `fillText`'s `x` and `y`, but in that case how come they can be optional?
The idea for the `x` and `y` arguments passed to `fillTextCluster()` is that by rendering all the clusters returned by `getTextClusters()` at a specific position, the rendered result is exactly the same as calling `fillText()` at that same position. In other words, it works as a delta for the internal `x` and `y` attributes that are part of the `TextCluster` object. If they are not passed, the values from `TextCluster` are used directly (so the equivalent call would be `ctx.fillText(text, 0, 0)`).
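A minimal sketch of the delta semantics described above, assuming each cluster carries `x` and `y` offsets relative to the text's anchor point. `clusterDrawPosition` is a hypothetical helper for illustration, not part of the proposal, and the cluster offsets are made up; it only computes where a cluster would land.

```javascript
// Hypothetical helper: where a cluster would be drawn if fillTextCluster()
// treats its x/y arguments as a delta on top of the cluster's stored offsets.
function clusterDrawPosition(cluster, x = 0, y = 0) {
  return { x: cluster.x + x, y: cluster.y + y };
}

// Made-up offsets standing in for what getTextClusters() might return.
const clusters = [
  { x: 0, y: 0 },
  { x: 9, y: 0 },
  { x: 17, y: 0 },
];

// Drawing every cluster with the same (50, 20) delta should line up with
// a single fillText(text, 50, 20) call.
const positions = clusters.map((c) => clusterDrawPosition(c, 50, 20));
```

With no delta passed, the stored offsets are used directly, which matches the `ctx.fillText(text, 0, 0)` equivalence mentioned above.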
Thanks for your comments!
> We think it can be useful to allow the creation of `TextClusters` directly from JS if desired.
I fail to see how this could work. As per your previous point, authors would also need to define all of the `CanvasTextDrawingStyles` inside that object for the engine to make sense of it. You'll then enter the issue of how an author can point to an actual font. As a string? Then when is it parsed? It could be as a `FontFace` but that would be quite novel since the 2D context doesn't accept this kind of object yet.
I might have missed it in the explainer, but I guess it would be clearer to me if you could share an example use-case for the ability to modify a `TextCluster`, or to create one from scratch without going through `measureText().getTextClusters()`.
> (so the equivalent call would be `ctx.fillText(text, 0, 0)`).
I think that'd be the first such positioning argument that defaults to `0` in the whole API. That feels odd to me.
Sorry, I'm a bit confused. Dictionaries can't save references to other objects, so if that's indeed needed, then staying with the current interface design makes the most sense.
> It seems this intends to cover some of the same use cases as #10650.
> cc @whatwg/canvas @khushalsagar
Yes, the issues are related in that they came out of the same discussions and prior proposals for improving canvas text. The editing aspects of this proposal could be covered by just inserting HTML content and editing that, but the proposal here is simpler from both an implementation and author perspective.
The access to text cluster information is unique to this proposal and really to a canvas context where the author has direct control of placement of everything.
> I fail to see how this could work. As per your previous point, authors would also need to define all of the `CanvasTextDrawingStyles` inside that object for the engine to make sense of it. You'll then enter the issue of how an author can point to an actual font. As a string? Then when is it parsed? It could be as a `FontFace` but that would be quite novel since the 2D context doesn't accept this kind of object yet. I might have missed it in the explainer, but I guess it would be clearer to me if you could share an example use-case for the ability to modify a `TextCluster`, or to create one from scratch without going through `measureText().getTextClusters()`.
For modification, the main use case we have thought of so far would be to actually draw the cluster at the position `x, y` passed to `fillTextCluster()`. This is useful if you want to animate your text as a rotating circle. It's possible to use the `x` value from each cluster as a way to know where in the circle that cluster should start. But after that, it would be better to be able to call `fillTextCluster()` with the actual final position for the cluster, and that requires setting the `x` and `y` values stored in the cluster to `0`. Allowing the modification of these two values is (at least for now) our approach for this use case. The rest of the attributes of `TextCluster` should be immutable though, I agree.
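The circular layout described above can be sketched as plain geometry, assuming each cluster exposes an `x` offset along the line and that the total advance of the text is known. `placeOnCircle` and its parameters are hypothetical names, and the cluster offsets are made up for illustration.

```javascript
// Hypothetical helper: map each cluster's x offset along the line to a point
// and rotation on a circle, so the text wraps once around the circumference.
function placeOnCircle(clusters, totalAdvance, cx, cy, radius) {
  return clusters.map((cluster) => {
    // Fraction of the line covered so far, converted to an angle in radians.
    const angle = (cluster.x / totalAdvance) * 2 * Math.PI;
    return {
      x: cx + radius * Math.cos(angle),
      y: cy + radius * Math.sin(angle),
      rotation: angle + Math.PI / 2, // orient the cluster along the tangent
    };
  });
}

// Three made-up clusters on a line with a total advance of 100 units,
// laid out on a circle of radius 80 centered at (200, 200).
const placed = placeOnCircle([{ x: 0 }, { x: 25 }, { x: 50 }], 100, 200, 200, 80);
```

Each returned point would then be the final position passed to `fillTextCluster()`, which is why the cluster's own `x` and `y` would need to be `0` for this use case.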
For the manual creation of `TextCluster` objects, we thought of it as a potentially interesting feature but if it feels not useful or inconsistent I'm okay with not allowing authors to do that directly, and only supporting clusters originating from `getTextClusters()`. Our original idea in that case was to use the `CanvasTextDrawingStyles` from the rendering context if they aren't available from the `TextCluster`, but now I realize that can even have inconsistencies with changes in the context state. So I agree on that too!
> > (so the equivalent call would be `ctx.fillText(text, 0, 0)`).
>
> I think that'd be the first such positioning argument that defaults to `0` in the whole API. That feels odd to me.
After discussing I realized I mixed this with another default. This default is just a remnant of how my first prototype was implemented. I agree the `x` and `y` in `fillTextCluster()` shouldn't be optional. I will update the explainer accordingly.
What does `TextCluster` really represent?
Is `TextMetrics` the right place to clusterize the text? `TextMetrics` is just metrics for a string. Making `TextMetrics.getTextClusters()` return `sequence<TextCluster>` where each `TextCluster` has a `DOMString` is strange. `TextMetrics` does not return the string it measures. It does not know the `x` and `y` this string will be displayed at. But the returned `TextCluster` knows its `x` and `y`.
I think what you are trying to do is the job of a `TextAnalyzer`, `TextItemizer` or `TextClusterizer` more than the job of `TextMetrics`.
The name of this method is a little bit confusing:
unsigned long caretPositionFromPoint(double offset);
First, it does not return a `CaretPosition`. It just returns an offset in a string. Second, it does not take a point like `Document.caretPositionFromPoint()`. It just takes a distance from the origin of display.
> The name of this method is a little bit confusing:
> `unsigned long caretPositionFromPoint(double offset);`
> First it does not return a `CaretPosition`. It just returns an offset in a string. Second it does not take a point like `Document.caretPositionFromPoint()`. It just takes a distance from the origin of display.
Yes, I agree there is some potential for confusion in the arguments differing from the DOM version, but there is some discoverability benefit to having the same name. Would `caretLocationFromOffset()` be an improvement?
What problem are you trying to solve?
Selection and caret position are two building blocks for editing text in canvas content. Consider the sequence of dragging out a text selection with a mouse or touch, then copying and pasting into a new location. Determining which characters are part of the selection requires mapping a point onto a string, then to a caret position in the text. Drawing the selected region requires the selection area. Inserting again requires mapping a point into a location within a character string. It should be easy for authors to implement editing behavior in canvas.
In addition, we’ve seen increased demand for better text animation and control in canvas. Of particular concern are text strings where the mapping from character positions to rendered characters is complex or not known at the time of authoring due to font localization.
The use cases include:
What solutions exist today?
The existing TextMetrics APIs give an approximation of the bounding box for a string. This can be used in JavaScript to implement the necessary functionality for editing, to a first approximation. Bounds are approximate, however. Furthermore, determining the caret position within a text string corresponding to a hit point requires binary search or similar over the set of strings, i.e., asking whether the point is in the left or right half of the string and recursing, which requires log(n) TextMetrics construction and measurement calls. Each of these is relatively expensive.
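To make the cost concrete, here is a rough sketch of that binary search. `measureWidth` stands in for `ctx.measureText(s).width`; the fixed-width stub, the left-to-right assumption, and the assumption that prefix widths grow monotonically are simplifications for illustration only. The function name is hypothetical.

```javascript
// Find the index of the character containing hitX, i.e. the number of
// characters whose right edge lies at or before hitX, by bisecting on
// prefix widths. Also counts how many measurement calls were needed.
// Note: slices by UTF-16 code unit, which ignores grapheme clusters.
function caretOffsetByBisection(text, hitX, measureWidth) {
  let lo = 0;
  let hi = text.length;
  let calls = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    calls++;
    // Width of the prefix that ends after the character at index `mid`.
    if (measureWidth(text.slice(0, mid + 1)) <= hitX) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return { offset: lo, calls };
}

// Stub measurement: every character is 10 units wide.
const fixedWidth = (s) => s.length * 10;
const result = caretOffsetByBisection("canvas text", 34, fixedWidth);
```

Every loop iteration is one measurement call, so a hit test on an n-character string costs roughly log2(n) measurements even in this simplified form.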
There is currently no way to know which characters in a string correspond to individual glyphs rendered to screen, short of incorporating complete BIDI and font glyph analysis into your app. Trying to lay out characters along a path, or apply per-glyph styling, is impossible without knowledge of which characters combine to form which glyphs.
How would you solve it?
Please see the full explainer, including demos, at https://github.com/Igalia/explainers/blob/main/canvas-formatted-text/text-metrics-additions.md
We propose four new functions on the `TextMetrics` interface: `caretPositionFromPoint()`, `getSelectionRects()`, `getActualBoundingBox()`, and `getTextClusters()`.
In addition, a new method on `CanvasRenderingContext2D`, `fillTextCluster()`, supports filling grapheme clusters.
The `caretPositionFromPoint` method returns the character offset for the character at the given `offset` distance from the start position of the text run (accounting for `textAlign` and `textBaseline`), with offset always increasing left to right (so negative offsets are valid). Values to the left or right of the text bounds will return 0 or `num_characters` depending on the writing direction. The functionality is similar but not identical to `document.caretPositionFromPoint`. In particular, there is no need to return the element containing the caret, and offsets beyond the boundaries of the string are acceptable.
The other functions operate on character ranges and return bounding boxes relative to the text’s origin (i.e., `textBaseline`/`textAlign` is taken into account).
`getSelectionRects()` returns the set of rectangles that the UA would render as the selection background when a particular character range is selected.
`getActualBoundingBox()` returns the equivalent of `TextMetrics.actualBoundingBox` restricted to the given range. That is, the bounding rectangle for the drawing of that range. Notice that this can be (and usually is) different from the selection rect, as the latter is about the flow and advance of the text. A font that is particularly slanted or whose accents go beyond the flow of text will have a different paint bounding box. For example: if you select this: W you may see that the end of the W is outside the selection highlight, which would be covered by the paint (actual bounding box) area.
`getTextClusters()` provides the ability to render minimal grapheme clusters (in conjunction with a new method for the canvas rendering context, more on that later). That is, for the character range given as input, it returns the minimal logical units of text, each of which can be rendered, along with their corresponding positional data. The position is calculated with the original anchor point for the text as reference, while the `text_align` and `text_baseline` parameters determine the desired alignment of each cluster.
To render these clusters on the screen, a new method for the rendering context is proposed: `fillTextCluster()`. It renders the cluster with the `text_align` and `text_baseline` stored in the object, ignoring the values set in the context. Additionally, to guarantee that the rendered cluster is accurate with the measured text, the rest of the `CanvasTextDrawingStyles` must be applied as they were when `ctx.measureText()` was called, regardless of any changes in these values on the context since. Note that to guarantee that the shaping of each cluster is indeed the same as it was when measured, it's necessary to use the whole string as context when rendering each cluster.
For `text_align` specifically, the position is calculated with regard to the advance of said grapheme cluster in the text. For example: if the `text_align` passed to the function is `center`, for the letter T in the string Test, the position returned will not be exactly in the middle of the T. This is because the advance is reduced by the kerning between the first two letters, making it less than the width of a T rendered on its own.
Anything else?
A very minimalist editor built on this functionality is at https://blogs.igalia.com/schenney/html/editing-canvas-demo.html
See https://blogs.igalia.com/schenney/canvas-text-editing/ for details on which browser version and flags are required.