w3c / svgwg

SVG Working Group specifications

How to lay out ruby-annotations in SVG 2? #870

Open therahedwig opened 2 years ago

therahedwig commented 2 years ago

Hi, we (Krita project) have been looking at the SVG 2 text spec, and there was one thing we were wondering about: Ruby annotations.

So, as a quick summary: ruby annotations consist of small text positioned above CJK characters, giving the pronunciation of those characters. They should not be confused with the programming language. There's some support for them in HTML, and a W3C spec that explains them far better than I could.

In theory, we could use the dy and dx positioning of the non-autowrapped text to position text in any way we want. However, a note in the spec says that these are to be ignored for auto-wrapped/flowed text. Just positioning by itself is also not super-ideal, as it prevents outsiders from understanding that this positioning is happening because it is a Ruby annotation.

There is also an OpenType way of doing it. My problem with this is that it is per-font, so if certain pronunciations are missing, you'd have to tell someone 'go edit the font', which, licensing issues aside, seems really mean. For example, this project seems to be adding Bopomofo ruby to fonts with some programming (and they're literally annotating every character). They're not even using the ruby feature tag, but rather stylistic sets and Unicode variation selectors, meaning that the problem is a little more complicated than just hardcoding a given annotation for a given character.

So, I've been wondering if someone has an idea of how to approach this?


Tavmjong commented 2 years ago

The problem with using 'dx' and 'dy' in flowed text is that it moves characters (actually clusters) after they have been positioned by flowing into a shape. The ruby text could be moved with 'dx' and 'dy' above other characters but that would leave holes.

therahedwig commented 2 years ago

Tavmjong wrote: "The problem with using 'dx' and 'dy' in flowed text is that it moves characters (actually clusters) after they have been positioned by flowing into a shape. The ruby text could be moved with 'dx' and 'dy' above other characters but that would leave holes."

Oh, wow, I hadn't even realized that... So we need a completely different solution?

therahedwig commented 2 years ago

Reading through the CSS Ruby spec, it seems that there might be a solution in using specific values of the 'display' property on a tspan.

"For document languages (such as XML applications) that do not have pre-defined ruby elements, authors must map document language elements to ruby elements; this is done with the display property."

This could look like...

<text>
  Here is some <tspan style="display:ruby"><tspan style="display:ruby-base">text</tspan><tspan style="display:ruby-text">annotation</tspan></tspan> with annotation.
</text>

The only thing I am missing here is a way to define <rp> elements, that is, the fallback parentheses for ruby annotations in HTML. The example above, "Here is some text(annotation) with annotation", would ideally be rendered as "Here is some text (annotation) with annotation." when the implementation doesn't understand the ruby display values. We might need to raise this with the authors of CSS Ruby Annotation Layout Module Level 1.
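To make the desired fallback concrete, here is a minimal Python sketch of what a consumer without ruby support could do if it at least recognised 'display:ruby-text': flatten the text and wrap each annotation in parentheses. The helper names (display_of, flatten_with_parentheses) are hypothetical and only inline style attributes are inspected; this is an illustration, not something any spec or renderer defines.

```python
import xml.etree.ElementTree as ET

# The example from the comment above, without namespaces for brevity.
SVG_SNIPPET = (
    '<text>Here is some <tspan style="display:ruby">'
    '<tspan style="display:ruby-base">text</tspan>'
    '<tspan style="display:ruby-text">annotation</tspan>'
    '</tspan> with annotation.</text>'
)

def display_of(elem):
    """Return the 'display' value from an inline style attribute, else 'inline'."""
    for declaration in elem.get("style", "").split(";"):
        name, _, value = declaration.partition(":")
        if name.strip() == "display":
            return value.strip()
    return "inline"

def flatten_with_parentheses(elem):
    """Concatenate all text, wrapping ruby-text content in ' (' and ')'."""
    parts = [elem.text or ""]
    for child in elem:
        inner = flatten_with_parentheses(child)
        if display_of(child) == "ruby-text":
            # Assumed fallback style: a space plus parentheses, as <rp> would give in HTML.
            inner = " (" + inner + ")"
        parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)

fallback = flatten_with_parentheses(ET.fromstring(SVG_SNIPPET))
print(fallback)  # Here is some text (annotation) with annotation.
```

Note that this only works for a consumer that already knows the ruby display values, which is exactly the gap: a renderer that knows none of them has no markup to key the parentheses on.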

There might be more issues with the spec. I think a good next step would be to write a script that consumes tspans with the ruby CSS values and outputs SVG 1.1 (or non-autowrapped) text, as a prototype.
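A rough idea of what such a prototype might look like, as a Python sketch under heavy assumptions: every glyph advances by a fixed fraction of the font size (ADVANCE, a crude stand-in for real font metrics), annotations are half the base size (RUBY_SCALE), and only the simple case of one base and one annotation per ruby is handled. All names here are made up for illustration, and this is not the script attached later in the thread.

```python
import xml.etree.ElementTree as ET

FONT_SIZE = 20.0   # assumed base font size, in user units
ADVANCE = 0.6      # assumed advance per character, as a fraction of font size
RUBY_SCALE = 0.5   # assumed annotation size relative to the base

SVG_SNIPPET = (
    '<text>Here is some <tspan style="display:ruby">'
    '<tspan style="display:ruby-base">text</tspan>'
    '<tspan style="display:ruby-text">annotation</tspan>'
    '</tspan> with annotation.</text>'
)

def display_of(elem):
    """Return the 'display' value from an inline style attribute, else 'inline'."""
    for declaration in elem.get("style", "").split(";"):
        name, _, value = declaration.partition(":")
        if name.strip() == "display":
            return value.strip()
    return "inline"

def width(string, size):
    return len(string) * size * ADVANCE

def layout_runs(text_elem, baseline=40.0):
    """Return (string, x, y, font_size) runs with absolute positions."""
    runs, pen = [], 0.0

    def place(string, size, x, y):
        if string:
            runs.append((string, round(x, 2), round(y, 2), size))

    place(text_elem.text or "", FONT_SIZE, pen, baseline)
    pen += width(text_elem.text or "", FONT_SIZE)
    for child in text_elem:
        if display_of(child) == "ruby":
            base = "".join(c.text or "" for c in child if display_of(c) == "ruby-base")
            note = "".join(c.text or "" for c in child if display_of(c) == "ruby-text")
            note_size = FONT_SIZE * RUBY_SCALE
            # The ruby box is as wide as the wider of base and annotation;
            # both are centred inside it, and the annotation sits one line up.
            box = max(width(base, FONT_SIZE), width(note, note_size))
            place(base, FONT_SIZE, pen + (box - width(base, FONT_SIZE)) / 2, baseline)
            place(note, note_size, pen + (box - width(note, note_size)) / 2, baseline - FONT_SIZE)
            pen += box
        else:
            plain = "".join(child.itertext())
            place(plain, FONT_SIZE, pen, baseline)
            pen += width(plain, FONT_SIZE)
        place(child.tail or "", FONT_SIZE, pen, baseline)
        pen += width(child.tail or "", FONT_SIZE)
    return runs

def to_svg11(runs):
    """Serialise the runs as a <text> with absolutely positioned tspans."""
    root = ET.Element("text")
    for string, x, y, size in runs:
        span = ET.SubElement(root, "tspan", x=str(x), y=str(y))
        span.set("font-size", str(size))
        span.text = string
    return ET.tostring(root, encoding="unicode")

runs = layout_runs(ET.fromstring(SVG_SNIPPET))
svg11 = to_svg11(runs)
print(svg11)
```

The output keeps DOM order only because base and annotation are adjacent here; a real implementation would need actual glyph metrics, line breaking, and vertical writing modes, none of which this sketch attempts.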

therahedwig commented 2 years ago

Ok, I managed to get parsing of most of the typical cases done: ruby_in_svg_basic_parsing.zip

Some problems:

  1. While parsing the display property went fine, positioning them with x, dx, dy and y while keeping the DOM order was very tricky, and I gave up on the interleaved case. I wouldn't wish this on anyone, and I hope we can use these display values so that positioning is only of concern to the renderer and the text-selection mechanism.
  2. That said, there's no ruby-parenthesis value, because the ruby working group decided that 'display:none' (item 3) would be sufficient, probably thinking of a case where it's combined with specific XML elements. The problem is that an SVG renderer might know 'display:none' but none of these 'display:ruby-' values, making this useless for marking up fallback parentheses. Relatedly, SVG doesn't have the 'content' property or the ::before and ::after pseudo-elements, so dynamically generating parentheses for certain situations, as suggested by the ruby spec, doesn't work either. The lack of a ruby-parenthesis display value could be solved either by requesting this property value or by defining ruby XML elements for SVG.
  3. Rule 2 of the anonymous box generation is kinda scary. It would mean that a span with 'ruby-base' parented by a span with 'ruby-text-container' would result in the 'ruby-base' span being wrapped in a 'ruby' container, leading to nested ruby. That is acceptable in HTML right now, but for SVG I'd much rather interpret unexpected display values as 'inline' to avoid having to puzzle out the mysteries of nested tables.
  4. This one's kinda minor, but there's no way to offset ruby annotations from their base. It seems the spec authors decided that margin/padding would be best for this, but these cannot be used in SVG, which doesn't have the CSS box model.
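The stricter behaviour suggested in item 3 could be sketched like this in Python: a ruby-internal display value that appears under an unexpected parent is simply treated as 'inline' instead of triggering anonymous box generation. The table of expected parents is my reading of the CSS Ruby box structure and the function name is hypothetical; this is the proposed SVG simplification, not what CSS Ruby itself specifies.

```python
# Parents under which each ruby-internal display value is expected
# (an assumption based on the CSS Ruby box structure).
EXPECTED_PARENTS = {
    "ruby-base": {"ruby", "ruby-base-container"},
    "ruby-text": {"ruby", "ruby-text-container"},
    "ruby-base-container": {"ruby"},
    "ruby-text-container": {"ruby"},
}

def effective_display(display, parent_display):
    """Downgrade misplaced ruby-internal display values to 'inline'."""
    allowed = EXPECTED_PARENTS.get(display)
    if allowed is not None and parent_display not in allowed:
        return "inline"
    return display

# The scary case from item 3: 'ruby-base' inside 'ruby-text-container'.
print(effective_display("ruby-base", "ruby-text-container"))  # inline
print(effective_display("ruby-base", "ruby"))                 # ruby-base
```

With a rule like this, nested ruby can never be created by accident, at the cost of diverging from what an HTML/CSS engine would do with the same markup.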

Other notes:

  1. Similar to superscript, the display values don't indicate a default font size. Authoring tools should probably look at the default classes and offer those to users.
  2. Note that text emphasis marks interact with ruby, but the ruby spec doesn't mention this. I've had real-world conversations where the two were confused, so I'd recommend looking at it too.
  3. A significant portion of the ruby spec talks about spanning annotations. These only happen in XHTML, and the ruby spec doesn't define a CSS property for them, so we shouldn't have to worry about them.
  4. Bopomofo handling is still very experimental. The tone-mark handling in particular is semi-unsolved. This seems to be compounded by the fact that apparently it's not just tone marks, but also what Unicode refers to as 'final' letters (named 'dialectal checked tones' in CLReq). It's still up in the air whether the text layout or the font is responsible for positioning these marks correctly, though there are already fonts that can place them.
  5. Related to that, SVG officially cannot change writing mode in the middle of a text because it only has inline text. Inter-character ruby is treated as inline-block, so one might need to double-check how things interact with inline-block when handling inter-character ruby. For example, for letter-spacing an 'inline-block' is treated as a single typographical character.

Anyway, please check out the samples and whether they're acceptable. I'll hold off on providing feedback to the CSS issue tracker until people have had a chance to think about this here.