Consider allowing indirection to retrieve bounding box from different glyf glyph

googlefonts / colr-gradients-spec

63 stars 8 forks source link

Consider allowing indirection to retrieve bounding box from different glyf glyph #251

Closed drott closed 3 years ago

drott commented 3 years ago

In a conversation with @bungeman Ben points at a certain restriction when using a font as an outline font and COLRv1 font at the same time. I.e. displaying a COLRv1 font in an environment that does not understand COLRv1.

If the bounding box is always retrieved from the base glyph's entry in the glyf table, this fallback contour glyph would always need to be at least as wide as its COLRv1 rendering result (potentially widened by invisible points if needed).

Font makers would have more flexibility with the fallback glyf glyphs if we had a way to point to a dedicated bounding box glyph separately, other than always referring to the base glyph's glyf entry.

anthrotype commented 3 years ago

fallback contour glyph would always need to be at least as wide as its COLRv1 rendering result (potentially widened by invisible points if needed)

why would that be a problem? Why would one insist on wanting to have a tighter (than the color glyph) bounding box for the fallback glyph?

drott commented 3 years ago

One might say there's a trade-off between space-saving and "clean-ness" here where the placeholder contour points go - do they belong with a potential fallback glyf glyph even if the glyf glyph is different from what the font paints in COLRv1. Further clarification from the discussion with Ben: It's more important to have accurate (mostly sufficiently large) bounding boxes, particularly in the variable case, compare https://github.com/fonttools/fonttools/issues/2187

bungeman commented 3 years ago

Also, it isn't really correct to think of the 'glyf' outline as a 'fallback' but as a different presentation form. For example, using Unicode one can select between text and emoji presentation form (see https://unicode.org/reports/tr51/#Emoji_Variation_Sequences ). Inside the font there may be many unrelated visual representations for a glyph and it is up to the application to select one. These representations should not be tied together in some arbitrary manner. Indeed, it's a bit of a stretch that these different representations should share the same base advances, though these would be a bit more difficult to tweak.

anthrotype commented 3 years ago

so, IIUC, you are suggesting we add a mechanism that would allow a COLRv1 base glyph record to reference a different glyph than the one it's normally associated with, just for the purpose of retrieving that other glyph's bounding box?

Another option is we define this bounding box inside the COLR table itself, instead of grabbing it from a glyf glyph. It could be a new GlyphBoundingBox structure, inlined at the end of a BaseGlyphV1Record, or pointed to by an (optional) offset.

Lorp commented 3 years ago

@bungeman I believe b/w fallback has always been a high priority in COLR, and will likely remain so. Metrics must not change depending on a device’s colour capability, and various presentation forms should not all be forced to have identical metrics, should they?

Fake bboxes in 'glyf' (union of b/w and COLR bboxes) and fake points in 'glyf' (as @drott suggests above) seem like ugly hacks, compared with COLR glyphs storing their own accurate bounding boxes as proposed by @anthrotype. Composite glyf glyphs store their own bboxes, and so should the “composite” glyphs of COLR if different from 'glyf'. The rule "If COLR bbox is not present, use 'glyf' bbox" seems workable. In practice I guess that glyf and COLR glyphs will very often have identical bboxes so there need not be a significant extra data.

vlevantovsky commented 3 years ago

For example, using Unicode one can select between text and emoji presentation form (see https://unicode.org/reports/tr51/#Emoji_Variation_Sequences ). Inside the font there may be many unrelated visual representations for a glyph and it is up to the application to select one.

Not sure if I understand this correctly, I think what you're saying is that inside the font there may be many different glyphs that are mapped to the same character codepoint, and they may be selected using variation sequences. In this case, each glyph would offer a different visual representation for a codepoint, but they are still unique, separated glyphs.

Also, I am not sure I understand what the is the problem we'd be trying to solve if we allowed using different glyphs for bounding box definitions. There is no requirement for a glyph bounding box to match its outline boundaries - many times they do match, but the bounding box can be defined independently of the glyph outline itself. (For example, when WOFF2 applies glyph preprocessing, it has a flag that indicates whether the bbox of a glyph matches the extrema points of a glyph outline. If it does, the bbox data can be dropped and recalculated during the woff2 decompression; if it doesn't - the bbox data is encoded explicitly in a separate data stream.)

So, if the design of a particular COLR glyph needs a bounding box that is larger than the fallback b/w/ glyph itself, the bounding box of a fallback glyph can be modified to accommodate this - this is purely a design decision. Why would we need a separate glyph just to change the bounding box?

PeterConstable commented 3 years ago

it isn't really correct to think of the 'glyf' outline as a 'fallback' but as a different presentation form...

That's valid. However...

Metrics must not change depending on a device’s colour capability...

There is validity to this as well, and that I think has greater import: if the metrics are different depending on whether the presentation is b/w or COLR, then which metrics are used must be known much earlier in processing: before GPOS processing, line breaking, or any other layout processing that depends on metrics. That would complicate implementations, and seems like it would be more costly in processing. In the design of COLR v0, it seems that use of the base glyph to determine metrics was done precisely to avoid that complication.

drott commented 3 years ago

@Lorp wrote:

In practice I guess that glyf and COLR glyphs will very often have identical bboxes so there need not be a significant extra data.

Ben points out that there is already relationship between the glyf representation of a glyph and its COLR v0 or COLR v1 representation, as they do share the same advance.

Speculatively, in many cases in a font that is aimed at compatibility for a non COLR v1 aware renderer, we would probably expect the non-color contour to somewhat mirror the look of the COLR v1 appearance and follow similar space requirements. Of course, that does not have to be the case and an entirely different representation in glyf is possible.

It is a design decision (and I am saying that explicitly without a measure of ugliness/cleanliness here as I see pros and cons to both). We can place them as extra points in the contour table entry (glyf, CFF, CFF2) of the base glyph for a COLRv1 base glyph. No extra points are needed if there is a contour at the base glyph slot and that one has the same extents as the COLR v1 glyph.

Producing fonts which choose to draw something entirely different in glyf or CFF or CFF2 than in COLR v1 (is that likely?) means that tools need to insert sufficiently widely placed contour points into outline contour tables to cover COLR v1 extents.

In response to @vlevantovsky 's comment:

Also, I am not sure I understand what the is the problem we'd be trying to solve if we allowed using different glyphs for bounding box definitions.

I think the main consideration is: Is the base glyph contour the right place to store bounding box information for a COLR v1 glyph that does draw something not necessarily related to what it's in the base glyph contour definition. And in that case - is it still the right place?

@vlevantovsky The TrueType glyph bounding box considerations, i.e. bounding box not necessarily being equal to cbox of points, apply for glyf table entries - but we do not use those for the definition of what a COLR v1 drawing surface bounding box is.

Lorp commented 3 years ago

Indeed the matching metrics between COLR and glyf are not just advance width, but all metrics (except those that relate to the bbox, i.e. LSB, RSB), including kerning and layout, so it makes sense not to allow COLR glyphs to specify advance at all.

This is actually fine, as glyf does not specify advance either – both COLR and glyf externalize their text flow metrics in exactly the same place: advance width from hmtx and layout from GPOS, indexed by the glyphId shared by COLR and glyf.

Regarding adding extra points:

It’s always been a de facto rule but the spec needs to be explicit that isolated points are to be ignored for rendering, see https://github.com/MicrosoftDocs/typography-issues/issues/720
It would be nice if COLR fonts can exist without necessarily having visible glyphs in the glyf slots of the b/w fallback glyphs (of couse all the “components” used still need to be defined in glyf)
Why is it a bad thing if the bbox in a glyf glyph is larger than necessary for its glyf data?
How important is bbox data?

drott commented 3 years ago

It would be nice if COLR fonts can exist without necessarily having visible glyphs in the glyf slots of the b/w fallback glyphs (of couse all the “components” used still need to be defined in glyf)

That's what the current fonts we produce in https://github.com/googlefonts/color-fonts are doing - they only have degenerate two or four point contours in glyf, which are not visible. Is that what you mean?

How important is bbox data?

For COLRv1 it's important because running the graph drawing algorithm is a slow way to compute what bounding box is needed to draw the glyph. It's also a bit of a chicken and egg problem: The graphics library needs to know upfront what size of backing storage to allocate to be able to draw the glyph accurately. To determine the bounding box correctly using the graphics library, one would need to reserve a potentially over allocated area, execute the graph drawing, then determine the bounding box (or do it with a different implementation that does not actually draw but is only made for bounding box analytics).

rsheeter commented 3 years ago

Marking v1 because we need to resolve in v1. My instinct is to close, no spec change required, current behavior is OK.

PeterConstable commented 3 years ago

it isn't really correct to think of the 'glyf' outline as a 'fallback' but as a different presentation form

I think we need to think of the 'glyf' and 'colr' glyphs as different presentation forms of the same abstract glyph and having the same metrics. (See my earlier comments about layout, which happens earlier, needing to know the metrics.) If a font designer wants to provide two significantly different presentations that have different metrics, they can do that now using OTL features in the GSUB table.

My instinct is to close, no spec change required, current behavior is OK.

rsheeter commented 3 years ago

@bungeman do you think this is ok as is?

bungeman commented 3 years ago

First note that my objection is entirely from the standpoint of bounds. Advances being the same is a completely separate issue which I don't really have a strong implementation opinion on. Most of the following is explaining that glyf and COLRv1 (and COLRv0, CFF, SVG, sbix, EBDT, any future COLRv2, etc) presentations are drawing different things and so may have different bounds and the bounds should go with the presentation encoding (and not bleed over into a different presentation form's encoding). These other presentation forms are already the "same abstract glyph" (they have the same glyph id and same advances) but may not have the same bounds.

First objection is that until there exists an actual implementation that can compute the cached bounds for COLRv1 in sane way (including all variation additions), the idea of having cached bounds anywhere is suspect and should not be referred to by the specification. First write the sufficiently smart compiler. The first line of the pseudocode "Allocate a bitmap for the glyph according to extents of base glyph contours for gid" asks for something which does not yet exist and hasn't been shown to be possible (especially with variable rotation, though in theory this can be handled with HOI). Even if possible it will be complicated enough that an explicit algorithm should be specified for computing the values and associated variation tuples.

Second objection is that while it would be nice to have the glyf and COLR match up, if the actual bounding boxes are different (and it seems really easy for them to be different, like just adding an outline around the color to prevent clashing) forcing the base glyf to have information only useful to the COLRv1 (extra points which may inflate the bbox) is quite awkward. It is of course possible to work around this for glyf by making sure not to include single points in the bounding box (and arguably this should already be done, both in the glyf box and any computed box) but it's just another bit of trivia that isn't obvious. Especially since it seems that one would need to know to exclude single points for the glyf bounding box but leave them in for COLRv1 and sbix bounding boxes (since sbix already relies on single points in glyf). Also, using the glyf bounds this way (instead of referencing two or more points from the COLRv1 Version-1 additions) is that now they're kinda used up (and already overloaded with respect to glyf anyway) and can't be used with some future COLRv2 since that may introduce another presentation with yet different bounds. In other words, using the glyf bounds in relation to the over all COLRv1 bounds sets off my "high coupling, low cohesion" alarms.

I will say that "they can do that now using OTL features in the GSUB table" made me think of a COLR feature which one could use to state that there was intent to use specialized COLR presentation instead of glyf and compatible COLR presentation. Though I think maybe this was referring to using variation selectors to substitute in a different underlying glyphid, though that really doesn't address the mechanics of actual rendering.

Backing up, the reason we want fairly quick metrics including bounding box is for shaping, line breaking, rejection, layering, and damage. Shaping occasionally needs some sort of bounds when guessing at placement points and the like, but often does not need all of them (and in a well written font may not ever really need them). Sometimes bounds are considered in line breaking, though many systems these days do not and allow arbitrary overflow (partially due to the cost of computing the boxes). Both shaping and line breaking need tight bounds when using bounds. Rejection really needs bounds though, since often an entire long document will need to be laid out and drawn and most glyphs will be clipped out, and the faster this can be done the better. Rejection works best with tighter bounds but can also benefit from very inexpensive loose bounds (like "this is the biggest any glyph can be"). Also, bounds tight enough to detect a lack of overlap can be greatly beneficial, allowing for out-of-order glyph batching when drawing. Layering refers to knowing how big of an off-screen to use when compositing a multi-layer glyph before it is composited onto the existing surface, which ideally has fairly tight bounds. Damage can be computed while drawing (can't damage something that hasn't been drawn once yet) so isn't a big deal, but it is important to get it right (e.g. don't use the raw glyf bounds when drawing sbix).

The biggest issues (to me) are rejection and layering. The reason for having a pre-optimized cached bounds function is to get good enough (but never too small) numbers for these operations. Obtaining these numbers should be faster than doing a dry-run drawing (computing everything up to, but not including, rasterization). I don't mind even quite loose bounds for rejection but layering really benefits from tight bounds. On the other hand when layering we've already done rejection, so a dry-run to get bounds isn't too bad. All this to say that generally it seems really fast conservative bounds usually speed things up more than slower tight bounds, but both have a purpose.

PeterConstable commented 3 years ago

First note that my objection is entirely from the standpoint of bounds...

Helpful, thanks. As pointed out, the issue raised is bigger than COLRv1 since bounds for SVG, sbix etc. color glyphs can be different from the glyf glyph.

and can't be used with some future COLRv2 since that may introduce another presentation

While v2 is anticipated, it would add capabilities but would not allow for another presentation: a v0 and v1/v2 presentation could co-exist, but not separate v1 and v2 presentations.

they can do that now using OTL features in the GSUB table...

I meant substitution of a different abstract glyph, e.g. a swash form, that could have very different advance and bounds.

Backing up, the reason we want fairly quick metrics including bounding box is for shaping, line breaking, rejection, layering, and damage.

Not clear to me what you mean by rejection or by damage.

Shaping occasionally needs some sort of bounds when guessing at placement points

GSUB and GPOS lookup processing doesn't use bbox information. But layout operations like justification or evaluation of overhang at line ends (which might well trigger application of GSUB or GPOS features) certainly might.

But should shaping and layout need to know whether presentation will be done using glyf or sbix or SVG or COLRv0 or COLRv1 data? I suspect a lot of implementations currently assume No.

raphlinus commented 3 years ago

I'm trying to get my head around this issue so I can (hopefully) offer helpful analysis. I think there are two questions going on. The first is how bboxes are computed and used for variable fonts in general (not specific to COLR), and the second is basically the relation between the bboxes when the non-color and color presentations differ.

I looked at a few implementations and did not find evidence that bounding boxes encoded in the glyf table are used for the sorts of things that @bungeman cites - preallocating a buffer for rasterization or doing damage calculations. Both FreeType and a Rust font parser (ttf-parser) use the glyf's bbox in the non-vf case, but fall back to traversing the outline in the variable case. It's entirely possible there are other codebases that do, and that's part of why I'm writing this comment, to see if anybody has information on that.

It's not clear to me that the glyf's bbox is usable for these purposes. The left phantom point (reflecting lsb) could track the inked bounding box of the glyph, but unless I'm missing something, the right phantom point is committed to adjusting the advance width of the glyph. The relation of advance width to bounding box (rsb) is not guaranteed, so I don't see how you can infer a reliable bounding box.

There's a bit of other discussion of phantom points in https://typedrawers.com/discussion/2295/otvar-exact-definition-of-bounding-box-in-a-variable-font , without a clear resolution.

There are other things you could do here, including calculating an enclosing bbox over the entire parameter space (so the phantom points don't come into play), but that is very unsatisfying when there's a width axis with wide range (which happens for example with Inconsolata).

I have other thoughts on the second half (color presentation vs non-color) but will wait to see if I get a response on these questions before trying to synthesize that.

PeterConstable commented 3 years ago

It's not clear to me that the glyf's bbox is usable for these purposes. The left phantom point (reflecting lsb) could track the inked bounding box of the glyph, but unless I'm missing something, the right phantom point is committed to adjusting the advance width of the glyph. The relation of advance width to bounding box (rsb) is not guaranteed, so I don't see how you can infer a reliable bounding box.

But are the phantom points even relevant at all? By definition, the phantom points are generated by the rasterizer as it processes glyph data but do not exist in the glyph data itself. As currently spec'd, the bbox used for COLR is obtained directly from the glyph data, not from the rasterizer's processing of the glyph data. (See #256.)

I had forgotten about the issue Renzhi raised in that TypeDrawers thread, but IIUC his concern is about computing TSB for vertical layout. I'm not sure it's relevant to this topic.

PeterConstable commented 3 years ago

I'm looking at @rsheeter 's PR #321. I still haven't seen a good explanation for why bbox data separate from 'glyf' is necessary. I do think @bungeman has made the point that it's conceptually cleaner, and in principle better not to overload 'glyf' entries given that there might be alternate color glyph descriptions (sbix, SVG, COLR) each having slightly different bboxes. But since the primary need for bbox info for color glyphs is to allocate memory, then if a font has different color glyph descriptions it's not clear why points added to a glyf entry couldn't be the extrema across those different descriptions.

Since this is only about bboxes and not advances, I don't have a strong objection to Rod's proposed addition. I'd only suggest that we should allow for fonts to use 'glyf' data for bbox info if that is sufficient (e.g., a non-variable font with "fallback" b/w glyphs that have bboxes that satisfy the colour glyph requirements) since that can save 12 bytes per colour glyph.

Note that this does not provide an issue solution for the issue raised by Ben regarding bboxes impacted by variable rotation. If a variable glyph has a rotating component, there may be some portions of the variation space in which that component determines (in part) the bbox, but other portions of the variation space in which it does not. And for those portions in which it is relevant, current HOI would be the best way to closely approximate the effects of the rotation. (Maybe in the future we can enhance variations with other basis functions.)

PeterConstable commented 3 years ago

OK, I do see now there's a bit more of an issue for variable fonts. I was focusing on the rotation aspect, and that's not easily solvable (there's no easy way apart from HOI to describe a varying bbox if the extrema are determined by components that can rotate). But @raphlinus pointed out,

Both FreeType and a Rust font parser (ttf-parser) use the glyf's bbox in the non-vf case, but fall back to traversing the outline in the variable case.

That's required because the 'gvar' table doesn't provide deltas to adjust the cached bbox values (xMin etc. in the glyph entry). Evidently for purposes of the 'glyf' table and TT rasterizer, it wasn't important to provide deltas for these cached values, but sufficient to compute the instance outline then find the extrema.

But for COLR, invoking the TT (or CFF) rasterizer to get back extreme in order to allocate memory for the bitmap would be problematic. So, that means that, if the xMin etc. values are to be used, they would need to be provide the extrema for all instances of the outline across the font's entire variation space. I'm starting to understand now why an alternate approach might be good.

behdad commented 3 years ago

But for COLR, invoking the TT (or CFF) rasterizer to get back extreme in order to allocate memory for the bitmap would be problematic.

Why?

behdad commented 3 years ago

Copying my comments from #321 here:

I think merging this was premature.

What happens when the paint tries to paint outside the bbox? I don't see that specified. And if not specified, we are bound to see fonts that have bogus bounding box info.

What bothers me most is that this is not even optional. The byte cost cannot be avoided by an implementation that wouldn't need it.

What should have been done instead was to address Ben's concerns by specing out how to calculate bounding box on the fly. This is a very well-known problem. You simply have to run the paint graph / draw twice: first to calculate the bounds so you can allocate, second to do the drawing.

What has happened here is typical of when implementors and spec writers are too close. Things go into the spec without being fully justified. Variations, indeed, as have been pointed out multiple times, make it impossible to even specify bounding boxes.

Even before PaintRotate, the tight bounding box of an OpenType 1.8 Variable Font was not expressible as a variable value using the same variable model. It was possible to specify a non-tight box easily-enough though.

With PaintRotate, even that is not doable. So, to address a concern about rasterization time, one that is well-understood and easily solvable, you have now encoded requirements for the compiler, that we have no idea how to implement.

Indeed, it was fine closing #289 because an implementation did NOT have to use the encoded bbox. But now we are explicitly forcing compilers to compute that, something we don't have a known way to do, short of walking every glyph outline point and calculating it's max bounds under the PaintGraph.

In short, you've replaced a tractable problem with an intractable one.

The same thing happened at OTVar1.8 time: MS insisted that bbox variations must be encoded explicitly and forced it through in HVAR/VVAR tables despite objections; only later it was realized that those cannot be encoded given the existing variation model. It's still hanging over our head. Why is that issue being ignored, even in the face of the more problematic PaintRotate?

behdad commented 3 years ago

Furthermore, CFF doesn't specify bounding boxes and is variable, yet no one said it can't be implemented. SVG has no bounding boxes and has rotations I suppose, no one said it can't be implemented.

The problem Ben / Dominik bring up is justified. The correct solution, IMO, is to say COLR glyphs have no specified bounding box, and then spec how the bounding box is to be calculated. Problem solved.

Spec'ing bytes we don't even understand how to produce, and we know are not mathematically producible, is unacceptable to me.

rsheeter commented 3 years ago

Reopening. Previously we had punted computation of bbox to compiler and I thought we were OK with that. Evidently not, we should talk more :)

IF we want a precomputed bbox I do think having it live with COLR is correct. I was under the impression there was a performance concern with having to calculate the bbox at runtime, leading to the original language (use base glyphs bbox). @drott is that a figment of my imagination?

Thinking "aloud", I suppose another option is to offer only a non-variable bbox which is the precomputed bbox for the default location in designspace and for any other location you have to compute it. That feels ugly but might be useful for static COLR fonts if the runtime computation is expensive.

PeterConstable commented 3 years ago

But for COLR, invoking the TT (or CFF) rasterizer to get back extreme in order to allocate memory for the bitmap would be problematic.

Why?

Ok, I was assuming (perhaps wrongly) it would be a concern for (some) implementations. When rasterizing b/w glyphs, the (TT or CFF2) rasterizer is already involved, and internally it can do what it needs to allocate memory or whatever bbox info is needed for. For COLR (or for SVG), in a non-variable case, the rendering implementation might not need to invoke the rasterizer (unless it is doing that to get the clipping region from a PaintGlyph as a bitmap). So, for a variable font, should the implementation be forced to invoke the rasterizer to compute a bbox?

Or, is the suggestion that the glyf entry contain points for the default instance that provide bbox extrema across the entire design space?

What happens when the paint tries to paint outside the bbox?

That's discussed in 5.7.11.1.8.2: it implies no guarantees and that the color glyph could be clipped.

What bothers me most is that this is not even optional.

I had suggested that the bbox should be optional, allowing it to be derived from the base glyph. I didn't have a strong argument to make the bbox required, though.

rsheeter commented 3 years ago

If it's not a performance issue we can simply drop bbox. If it IS a performance issue we have a problem; COLR v1 does need to be fast enough to replace alternatives.

I think due to COLR being it's own presentation we are correct to not say "use base glyph." If a bbox is to be given it shoudl live on COLR. I think that leaves a couple of options:

No bbox, it's not a perf issue, just delete the damn thing
bbox only for default location in designspace, have to compute under variation
- Delete VarBBox and add a note about bbox being for default location only
- You could continue and say you can give precomputed bboxes for arbitrary locations in designspace, one could easily imagine precomputing for a constellation of positions (hundred-weights, etc)
Variable bbox is allowed but optional, if it's not there you have to compute under variation
- Let the compiler decide if it thinks it has a good enough solution
Required variable bbox is fine
- For context, my thought was that while a general purpose correct variable bbox may indeed be infeasible a heuristic sufficient for most typical font uses - notoably it is typically not the case that a small shift in designspace dramatically alters the bbox - is achievable.

Unfortunately I don't have data at hand to say whether lack of precomputed bbox is a significant performance issue or not. On our side we'll have to chat with @drott next week.

PeterConstable commented 3 years ago

The same thing happened at OTVar1.8 time...

Are you referring to the mappings for lsb and rsb variations?

(I don't think MS was alone in wanting HVAR for advance, lsb and rsb, but I don't recall now for certain. In any case, this issue should be evaluated on its own merits.)

PeterConstable commented 3 years ago

bbox only for default location in designspace, have to compute under variation

I'm not sure what benefit that would add. I know that @bungeman was suggesting that COLR vs. glyf (or CFF(2)) are, in principle different presentations, so could in principle have different extrema. But that issue has existed before now for COLR v0, SVG, sbix and CBDT. The only thing that's new for COLR v1 is the need for a 2D graphics implementation to allocated memory to the drawing surface when the COLR v1 description doesn't provide that (unless we add it now).

SVG is similar to COLR v1 in that a 2D graphics implementation needs to allocate a drawing surface up front, and is expected to not clip. Per the spec, the em square is the default viewport. From the OT-SVG spec:

An SVG glyph description in the SVG table is an alternate to the corresponding glyph description with the same glyph ID in the 'glyf', 'CFF ' or CFF2 table. The SVG glyph description must provide a depiction of the same abstract glyph as the corresponding TrueType/CFF glyph description. ...

Glyph advance widths or heights are the same for SVG glyphs as for TrueType/CFF glyphs, though there may be small differences in glyph ink bounding boxes. Because advances are the same, switching between SVG and non-SVG rendering should not require re-layout of lines unless the line layout depends on bounding boxes. ...

The size of the initial viewport for the SVG document is the em square: height and width both equal to head.unitsPerEm. If a viewBox attribute is specified on the

Although the initial viewport size is the em square, the viewport must not be clipped. ...

As with CFF glyphs, no explicit glyph bounding boxes are recorded. ... The “ink” bounding box of the rendered SVG glyph should be used if a bounding box is desired; this box may be different for animated versus static renderings of the glyph.

The issue with animations for SVG is analogous to variations for COLR v1: there is a viewport, something can affect the position of visual elements and, potentially, move elements outside the viewport, but the implementation must not clip. (I don't know if there's any OT-SVG implementation that supports animation, though.) So, an implementation can start with the em square size for a default drawing surface, and probably that will work for most SVG glyphs, but then it can take other steps if, in the course of parsing the SVG, it discovers the colour glyph requires a larger surface.

It seems to me option 3 might be a reasonable compromise, but I don't have any data to inform that, and I'm also not working on compiler or runtime implementations to inform that.

drott commented 3 years ago

To answer @PeterConstable's earlier question:

Not clear to me what you mean by rejection or by damage.

IIUC, by rejection Ben means a graphics library's first analysis pass of which part of an image (as a sequence of drawing commands) actually need to be drawn and are visible and affecting the current viewport. Other parts of the image/drawing command sequence can be rejected.

Damage, I believe, similarly means: For redrawing (due to "damage" i.e. overpainting or otherwise dirty'd rectangle) part of an image / sequence of drawing commands: Which of them affect the damaged region, which of them do need to be re-executed.

IF we want a precomputed bbox I do think having it live with COLR is correct. I was under the impression there was a performance concern with having to calculate the bbox at runtime, leading to the original language (use base glyphs bbox).

In my benchmarking of COLRv1 vs SVG as posted on webkit-dev, I measured drawing of 1369 emoji glyphs from COLRv1 or SVG, on two different machines. The mean time for the COLRv1 drawing broken down to per glyph on this benchmark amounts to 0.83 to 2.11ms depending on machine type, but both machines' processor were rather high end.

In Skia's implementation, @bungeman correct me if I am wrong, there is a first stage when setting up the glyph cache to get metrics, including the bounding box for the used glyphs. From this pass of retrieving metrics, the bitmap mask is allocated, which is then used in a second stage to generate the actual pixels.

If the first stage of metrics retrieval had to do a duplicate run of the graph traversal, this would have a performance impact on the costs per glyph, definitely compared to retrieving a static number from somewhere (glyf or even (Var)BBox), and while not full rasterisation may be needed (color depth could be reduced, or resolution could be reduced, other optimisations), it may be too much overhead to implement such a "fast dry run mode".

So I tend to think: If due to the design of the font, a bounding box can be correctly computed or declared up front and included in the font, it is a worthwhile optimisation. If the glyph contains Paint* operations that make it impossible to express the bounding box dimensions correctly in a variable bounding box, there should be an option to force the implementation to compute it based on a dry run.

Of course with both options, we risk that one of the code paths gets less tested or that fonts may have incorrect bounding boxes if the information ends up not being used by a popular implementation.

rsheeter commented 3 years ago

Re: testing we can mitigate the risk somewhat by providing test fonts with and without bboxes.

rsheeter commented 3 years ago

Per discussion w/@behdad and @drott suggested update which I'm happy to draft:

Make a separate datastream, e.g. pointer to an array not per-paint pointers
In each entry, assign a range of glyphs and the box
Call it clip box to indicate that it clips and that it doesn't have to be tight (though that may be desirable)
Make clip box optional; if it's missing you have to walk the graph and compute the box with variations applied

Note from chat with @raphlinus : need to define how variation is applied, e.g. round "out" (bigger boxward)

Note from IM w @behdad: If a clip box is provided the graph is bounded, need to amend boundedness rules

apodtele commented 3 years ago

As a hint for memory allocation, ClipBox might be Ok. Any kind of imposed clipping will result in significant associated rendering costs, especially if the clipping region is transformed. So I would rather not use ClipBox at all. Walking through the layers is faster than clipped rendering.

drott commented 3 years ago

@apodtele: To recap some of the discussion: In rendering color glyphs, most graphics libraries will need to know an allocation size at the beginning. In COLRv0, this allocation size is the union of glyf layers (at least in the Skia implementation), IIRC the spec says the top level contour/layer is the allocation size. Some implementations even use information from the glyf table for bitmap font allocation sizes upfront.

In COLRv1, we initially opted for using the bounding box around the glyf table entry for the given glyph id as the allocation size. Ben pointed out that that approach is semantically questionable as the glyf entry in a hybrid font that has COLRv1 and glyf glyphs should be decoupled from the representation in COLRv1 to allow for those glyphs to look different.

Additional observations influencing our choice for introducing ClipBox: a) Emoji fonts, which will likely be an important use case for COLRv1, usually have equally sized glyphs. It would be wasteful to traverse the potentially complex COLRv1 graph for each glyph of an emoji font if it is known upfront that a square sized box is needed. b) When mixing in variations, and using primitives like PaintRotate with variable rotation angles, a bounding box cannot be effectively computed in advance, even as a variable bounding box - as the rotation angle causes non linear changes to the bounding box.

That's why we decided to a) decouple the bounding box for COLRv1 glyphs from representation in glyf or CFF2 and b) Make it optional so that in complex variable rotation scenarios, the bounding box is computed by the implementation.

This is how we arrive, all things considered, at putting in an extra ClipList / ClipBox option into COLRv1 to accelerate allocation without graph traversal as an optional tool.

Regarding your concerns regarding costs of downstream carrying of clipping and transforms: This is in any case an important part of the powerful primitives we make available with CORLv1: The PaintGlyph primitive can be implemented as path-based clip for downstream drawing operations: combining PaintTransform with a PaintGlyph below describes the scenario you're pointing out. PaintGlyph can be followed by a PaintSolid, Paint(Linear|Radial|Sweep)Gradient in which case it's a simpler path fill. But PaintGlyph can be followed by arbitrary paints such as another PaintGlyph or PaintColorGlyph in which it means: Clip out by path everything that's drawn below.

This is intentional and at the heart of COLRv1: Allowing powerful graphics primitives for an efficient and compact color font format.