w3c / csswg-drafts

CSS Working Group Editor Drafts
https://drafts.csswg.org/

[css-inline-3] Ascent and Descent Metrics in vertical flow #5381

Open kojiishi opened 3 years ago

kojiishi commented 3 years ago

A follow up for #5312.

Currently IIUC Ascent and Descent Metrics is used only for Canvas TextMetrics and vertical flow is not supported in Canvas, but I'm raising this because we might need to define it for CSS and for the Font Metrics API.

When text-orientation: mixed, glyphs may be rendered upright or rotated sideways. Should Ascent and Descent Metrics use horizontal metrics for rotated sideways glyphs, or vertical metrics for upright glyphs?

/cc @fantasai @litherum @jfkthame @bfgeek @macnmm

fantasai commented 3 years ago

Currently IIUC Ascent and Descent Metrics is used only for Canvas TextMetrics

No, they are used in CSS layout also, e.g. for line height calculation. em-top/em-bottom are only used for Canvas, though.

When text-orientation: mixed, glyphs may be rendered upright or rotated sideways. Should Ascent and Descent Metrics use horizontal metrics for rotated sideways glyphs, or vertical metrics for upright glyphs?

@kojiishi oh, that's a tricky question. In general we use the appropriate metrics per run based on orientation. But you're right that for getting the metrics of a box, we have to decide whether mixed should be considered upright or sideways.

I think I'm leaning towards using vertical metrics, unless they're missing, then use rotated horizontal metrics. I think that will work best for CJK, but I'm unsure about Mongolian--I think internally it's treated as rotated even though such characters are "upright" for the reader? That used to be the case in the past, I think; I'm unsure about current practices.
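A minimal sketch of the fallback described above, for illustration only. The `Metrics` container and the `box_metrics_for_mixed` function are hypothetical names, not part of any CSS or Canvas API, and the sketch assumes a font either has a complete set of vertical metrics or none at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metrics:
    """Ascent and descent in em units (hypothetical container, not a real API)."""
    ascent: float
    descent: float


def box_metrics_for_mixed(horizontal: Metrics, vertical: Optional[Metrics]) -> Metrics:
    """Pick box metrics in vertical flow with text-orientation: mixed.

    Prefer the font's vertical metrics. If the font has none (None here),
    fall back to the horizontal metrics, applied as for sideways-rotated text.
    """
    if vertical is not None:
        return vertical
    return horizontal
```

Under this sketch a CJK font that ships vertical metrics would use them for the box, while a Latin-only font without them would be measured as rotated text. It does not settle the Mongolian case.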

macnmm commented 3 years ago

In InDesign, there is the concept of the line's metrics (always single-direction), which are taken from the largest run and are dictated to be either Latin ascent/descent or ideographic embox according to paragraph attributes (e.g. CJK composer = embox, other composers = ascent/descent). Note that this applies regardless of the script(s) in the line. Using ideographic embox metrics to set text can apply to non-Japanese text, for example; you are just setting the non-Japanese text in the most Japanese-harmonious way (e.g. by computing ideographic embox metrics for non-Japanese fonts and placing glyphs to balance well with ideographs) and following Japanese convention (e.g. for full justification).

TCY has internal metrics during composition but those are limited (e.g. different baseline alignments in the horizontal direction are not supported), and the TCY is in the end treated as a single CJK vertical upright glyph with a height and width of 1 em (of the tallest TCY character), on a center vertical baseline, that can bleed outside its typographic boundaries if longer than a few characters (so as not to disturb the line height or leading rhythm).

Mixed-script text introduces further complexity, but I still feel strongly that the user needs a way to set paragraphs to use either an ideographic embox metrics model or a Latin ascent/descent metrics model, as the two cannot be mixed without compromising behavior and adherence to traditional practice (Japanese or Latin) when dealing with edge cases.

We are working on improving our Korean typography now, having met several influential type layout experts who are demonstrating how Korean layout is evolving away from Japanese-centric traditions of the past, informed by Latin typography and growing into an interesting and unique native hybrid.

lianghai commented 3 years ago

Mongolian is implemented in current mainstream fonts as a LTR horizontal script, and the glyphs expect 90-degree clockwise rotation in vertical lines. This is because the Mongolian script’s de facto fallback form in horizontal lines is expected to be 90-degree counterclockwise rotated from its normal vertical form, which is different from CJK’s expected behavior.

macnmm commented 3 years ago

Mongolian is implemented in current mainstream fonts as a LTR horizontal script, and the glyphs expect 90-degree clockwise rotation in vertical lines. This is because the Mongolian script’s de facto fallback form in horizontal lines is expected to be 90-degree counterclockwise rotated from its normal vertical form, which is different from CJK’s expected behavior.

This would imply a glyph origin at the left side, on the Roman baseline. Is there a preferred way to calculate the baseline offset from origin in vertical layout? The CJK baseline for Japanese vertical being embox center means layout engines calculate the center baseline as an offset from the glyph origin in order to properly position glyph runs of different sizes in vertical. What are the conventions for Mongolian?
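To make the CJK case concrete, here is a rough sketch of that center-baseline calculation. The function names are hypothetical, and the default embox extents (0.88 em above and 0.12 em below the Roman baseline) are only an assumed, commonly seen split, not values taken from any specification or from InDesign.

```python
def center_baseline_offset(em_top: float, em_bottom: float, font_size: float) -> float:
    """Distance from the glyph origin (on the Roman baseline) to the embox center.

    em_top and em_bottom are signed positions in em units relative to the
    Roman baseline, positive toward the ascent side.
    """
    return (em_top + em_bottom) / 2.0 * font_size


def shift_to_align_centers(run_size: float, line_size: float,
                           em_top: float = 0.88, em_bottom: float = -0.12) -> float:
    """How far to move a run's origin toward the ascent side so that its
    embox center coincides with the embox center of the line's dominant run.

    The default embox extents are assumptions for illustration.
    """
    line_center = center_baseline_offset(em_top, em_bottom, line_size)
    run_center = center_baseline_offset(em_top, em_bottom, run_size)
    return line_center - run_center
```

The point of the sketch is that the offset scales with font size, so runs of different sizes need different shifts to share one center line. Nothing comparable is obviously available for Mongolian, which is what the question is about.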

lianghai commented 3 years ago

Is there a preferred way to calculate the baseline offset from origin in vertical layout? The CJK baseline for Japanese vertical being embox center means layout engines calculate the center baseline as an offset from the glyph origin in order to properly position glyph runs of different sizes in vertical. What are the conventions for Mongolian?

There isn’t a non-heuristic way. The whole idea of a “baseline for aligning glyphs of different font sizes” is a rather marginal case for any script, and it generally ends in arbitrary decisions made to fulfill the layout architecture’s needs. In theory, Mongolian vertical runs would like to be center-aligned as well, as CJK ones are, but because Mongolian glyphs in vertical lines are not horizontally symmetrical the way CJK ones are, there isn’t a straightforward way to calculate the alignment line.

Mongolian’s connected stem (the continuous vertical stroke in vertical lines, which could be used to calculate an alignment line in some way) is placed away from the Roman baseline, so that a whole glyph (not the connected stem) aligns visually with both Roman and CJK glyphs. And it’s a pain to make sure Mongolian glyphs align with CJK glyphs in vertical lines, because there isn’t a standard for layout engines to place the Roman baseline in relation to CJK em-boxes, which further contributes to the lack of convention for where Mongolian’s connected stem goes in a glyph.

Btw, I find the term “baseline” confusing and very awkward when it’s used to refer to various kinds of alignment lines in a layout architecture context. There’s really nothing about the so-called “CJK baseline” that is comparable to the Roman baseline, besides both being considered the default alignment line. And the so-called “hanging baseline” for Indic scripts is just a bizarre idea, when those scripts already have a typographic baseline highly comparable to the Roman baseline, while the headstroke only suggests a not-necessarily-stronger preference for alignment.