go-text / typesetting

High quality text shaping in pure Go.

How to render vertical texts #111

Closed hajimehoshi closed 6 months ago

hajimehoshi commented 7 months ago

Hi, I'm using this library for my game engine. This is awesome!

I have several questions about vertical texts:

(The bottom-left texts are in Mongolian. This rendering result is without Vertical_Orientation) image

(This rendering result is with Vertical_Orientation) image

Thanks!

hajimehoshi commented 7 months ago

By the way, rendering Mongolian text in a horizontal direction seems fine.

image

benoitkugler commented 7 months ago

Hi ! Thank you for your interest in this package.

I think rendering vertical text has not been tested thoroughly yet, but it is definitely an area we would like to improve! I'll dig deeper into your issue; thanks for the detailed problem statement.

PS: As an aside, I've skimmed your project's issue tracker and found https://github.com/hajimehoshi/ebiten/issues/788. Perhaps you will be interested in the fontscan package (and its FontMap type), which you could use to access system fonts.

hajimehoshi commented 7 months ago

Thank you for your quick response!

Perhaps you will be interested in the fontscan package (and its FontMap type), which you could use to access system fonts.

Thanks, I'll take a look!

benoitkugler commented 7 months ago

From some initial research, I've understood that:

So basically, go-text/harfbuzz should do all the work.

Could you paste the Mongolian and Japanese strings you have used so that I can reproduce and hopefully fix the issue ?

hajimehoshi commented 7 months ago

Thank you for taking a look!

Could you paste the Mongolian and Japanese strings you have used so that I can reproduce and hopefully fix the issue ?

Here you are:

"ᠬᠦᠮᠦᠨ ᠪᠦᠷ ᠲᠥᠷᠥᠵᠦ ᠮᠡᠨᠳᠡᠯᠡᠬᠦ\nᠡᠷᠬᠡ ᠴᠢᠯᠥᠭᠡ ᠲᠡᠢ᠂ ᠠᠳᠠᠯᠢᠬᠠᠨ"
"あのイーハトーヴォの\nすきとおった風、\n夏でも底に冷たさを\nもつ青いそら…"

andydotxyz commented 7 months ago

I will have to learn about this to update the go-text/render output as well, as I think it is just horizontal...

benoitkugler commented 7 months ago

Here is what I found after more digging.

There is an ambiguity when we say "vertical text": it may be that we want to render glyphs upright, on top of each other; or that we want to rotate glyphs (that is, basically, to render them as in horizontal mode, and then rotate the whole line). Following the CSS spec, we can use the wording upright for the former and sideways for the latter.

Depending on the context and script, both ways may be "acceptable", or sometimes one way is more natural than the other. Here is what Firefox renders:

Screenshot 2023-11-26 at 16-56-40 Screenshot

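HTML source (table contents; each column applies a different orientation to the same text):

| Script | Horizontal | Upright | Sideways | Browser default |
| --- | --- | --- | --- | --- |
| Latin | ABCD | ABCD | ABCD | ABCD |
| Mongolian | ᠬᠦᠮᠦᠨ ᠪᠦᠷ | ᠬᠦᠮᠦᠨ ᠪᠦᠷ | ᠬᠦᠮᠦᠨ ᠪᠦᠷ | ᠬᠦᠮᠦᠨ ᠪᠦᠷ |
| Japanese | もつ青いそら… | もつ青いそら… | もつ青いそら… | もつ青いそら… |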

Now, notice that Harfbuzz has no precise notion of this difference : using a direction of TopToBottom will generate the upright form, never the sideways one.

So my first conclusion is that we should support both ways. The upright mode is already covered by Harfbuzz. A sketch of an implementation for the sideways one would be to 1) shape horizontally and 2) rotate the whole output. This means that we will have to add a flag to glyphs indicating that their content should be rotated when rendering (so that go-text/render and other consumers may properly rotate the glyph).
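The two-step sideways idea can be sketched as follows. This is only an illustration under stated assumptions: the `Glyph` struct, its `Sideways` flag and the `toSideways` helper are hypothetical names, not the go-text/typesetting API, and the sign of the vertical advance assumes a Y axis that grows upward.

```go
package main

import "fmt"

// Glyph is a hypothetical, simplified shaper output record; it is not the
// type used by go-text/typesetting.
type Glyph struct {
	ID       uint32
	XAdvance int  // advance along the horizontal axis, in font units
	YAdvance int  // advance along the vertical axis, in font units
	Sideways bool // true when the renderer must rotate the outline by 90 degrees
}

// toSideways converts a horizontally shaped run into a vertical run: each
// horizontal advance is moved onto the vertical axis and the glyph is flagged
// for rotation. The input slice is left untouched.
func toSideways(run []Glyph) []Glyph {
	out := make([]Glyph, len(run))
	for i, g := range run {
		out[i] = Glyph{
			ID:       g.ID,
			XAdvance: 0,
			// Assumption: Y grows upward, so top-to-bottom text advances negatively.
			YAdvance: -g.XAdvance,
			Sideways: true,
		}
	}
	return out
}

func main() {
	// Pretend this is the result of shaping "AB" horizontally.
	horizontal := []Glyph{{ID: 36, XAdvance: 600}, {ID: 37, XAdvance: 650}}
	for _, g := range toSideways(horizontal) {
		fmt.Printf("%+v\n", g)
	}
}
```

A consumer such as a renderer would then only need to check the flag to know whether the outline has to be rotated before drawing.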

A second step would be to select the "best way" to render vertical text, given an input text, that is, matching what web browsers do by default. This is where the famous Unicode Vertical_Orientation table is useful: if I'm not mistaken, it gives us a way to segment the text according to the U or Tu (=upright) / R or Tr (=sideways) alternative.
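As a rough sketch of that segmentation step, the code below splits a string into runs of constant orientation. The `segment` function and the `Run` type are hypothetical, and `stubLookup` is only a stand-in for the real Vertical_Orientation data: it knows that ASCII letters and U+2026 are R and treats everything else as U.

```go
package main

import "fmt"

// Orientation collapses the four Vertical_Orientation values into the two
// layout modes discussed above: U and Tu map to Upright, R and Tr to Sideways.
type Orientation int

const (
	Upright Orientation = iota
	Sideways
)

// Run is a maximal stretch of text sharing one orientation.
type Run struct {
	Text        string
	Orientation Orientation
}

// segment splits text into runs, using lookup to classify each rune.
func segment(text string, lookup func(rune) Orientation) []Run {
	var runs []Run
	start := 0
	var current Orientation
	for i, r := range text {
		o := lookup(r)
		if i == 0 {
			current = o
			continue
		}
		if o != current {
			runs = append(runs, Run{text[start:i], current})
			start, current = i, o
		}
	}
	if start < len(text) {
		runs = append(runs, Run{text[start:], current})
	}
	return runs
}

// stubLookup is NOT the Unicode table, only a tiny placeholder for it.
func stubLookup(r rune) Orientation {
	switch {
	case r >= 'A' && r <= 'Z', r >= 'a' && r <= 'z', r == '\u2026':
		return Sideways
	default:
		return Upright
	}
}

func main() {
	for _, run := range segment("ABCDもつ青いそら…", stubLookup) {
		fmt.Printf("%q sideways=%v\n", run.Text, run.Orientation == Sideways)
	}
}
```

Each run could then be shaped with the matching mode (upright via TopToBottom, sideways via horizontal shaping plus rotation).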

Side note : the upright rendering of the Mongolian text using Harfbuzz seems indeed "broken". I've checked the NotoSansMongolian font and found that the vertical advance values are constant. It means that the font is not designed for this rendering. By the way, you can see that Firefox doesn't even support it !

hajimehoshi commented 7 months ago

Thank you for taking a look!

"…" (U+2026) does not seem to be rendered as expected in any of the figures above, at least to me. The browser is probably behaving correctly, since it prefers English fonts for the "…" glyph. If a Japanese font is used, it should have its own special vertical glyph for "…", as I showed in the first comment. "…" is classified as "R" in the Unicode spec, but it should NOT be rotated when a Japanese glyph is used. If the CSS font-family prefers a Japanese font over the others, the result should change. This is complicated!

Yeah, I feel like the second step is the best. As I said above, some glyphs should not be rotated, contrary to the Unicode spec, so

func shouldRotateGlyph() bool {
    // This is for "…" for example
    if theGlyphIsAlreadySpecialVerticalGlyph {
        return false
    }
    if theUnicodePropertyIsROrTr {
        return true
    }
    return false
}

would be what we need. What do you think?
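For reference, the pseudo-code above can be made runnable by passing the two conditions in explicitly. The `GlyphInfo` type and its field names are hypothetical; in particular, how `IsVerticalAlternate` would be computed is exactly the open question of this thread.

```go
package main

import "fmt"

// GlyphInfo bundles the two facts the pseudo-code relies on. Both fields are
// hypothetical inputs supplied by the caller.
type GlyphInfo struct {
	IsVerticalAlternate bool // the font already supplied a vertical form (e.g. for "…")
	OrientationIsROrTr  bool // Vertical_Orientation of the source character is R or Tr
}

// shouldRotateGlyph reports whether the renderer should rotate the glyph.
func shouldRotateGlyph(g GlyphInfo) bool {
	if g.IsVerticalAlternate {
		return false
	}
	return g.OrientationIsROrTr
}

func main() {
	ellipsisInJapaneseFont := GlyphInfo{IsVerticalAlternate: true, OrientationIsROrTr: true}
	latinLetter := GlyphInfo{OrientationIsROrTr: true}
	hiragana := GlyphInfo{}

	fmt.Println(shouldRotateGlyph(ellipsisInJapaneseFont))
	fmt.Println(shouldRotateGlyph(latinLetter))
	fmt.Println(shouldRotateGlyph(hiragana))
}
```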

Side note : the upright rendering of the Mongolian text using Harfbuzz seems indeed "broken". I've checked the NotoSansMongolian font and found that the vertical advance values are constant. It means that the font is not designed for this rendering. By the way, you can see that Firefox doesn't even support it !

When rotating glyphs, shouldn't horizontal advances be used for them instead of vertical ones? It makes sense that the Mongolian font doesn't have vertical advance info, as the font has only horizontal glyphs.

benoitkugler commented 7 months ago

When rotating glyphs, shouldn't horizontal advances be used for them instead of vertical ones? It makes sense that the Mongolian font doesn't have vertical advance info, as the font has only horizontal glyphs.

That will be the case using the "sideways" mode (like browsers do), but the "upright" mode will stay broken, and I think the real solution would be for the font author to include proper vertical metrics ('vmtx' table).

benoitkugler commented 7 months ago

"…" (U+2026) does not seem to be rendered as expected in any of the figures above, at least to me. The browser is probably behaving correctly, since it prefers English fonts for the "…" glyph. If a Japanese font is used, it should have its own special vertical glyph for "…", as I showed in the first comment. "…" is classified as "R" in the Unicode spec, but it should NOT be rotated when a Japanese glyph is used. If the CSS font-family prefers a Japanese font over the others, the result should change. This is complicated!

Yeah, I feel like the second step is the best. As I said above, some glyphs should not be rotated, contrary to the Unicode spec, so

func shouldRotateGlyph() bool {
    // This is for "…" for example
    if theGlyphIsAlreadySpecialVerticalGlyph {
        return false
    }
    if theUnicodePropertyIsROrTr {
        return true
    }
    return false
}

would be what we need. What do you think?

Yes, you have nicely summed up the issue, and overall I agree with your pseudo-code solution. However, I don't think there is a proper way to detect theGlyphIsAlreadySpecialVerticalGlyph (without shaping early on), so we will instead use the context. In your example, we would associate the ... with the Japanese script, and thus use "upright" mode. This means we can't rely only on the Vertical_Orientation table, and we will need some other heuristics. Since there are not too many scripts using vertical layout, maybe we can hard-code some behaviors per script.
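One possible shape for such a context heuristic is sketched below. Everything here is an assumption made for illustration: the `Char` type, the `uprightScripts` table and the rule "a Common character inherits the script of the preceding text" are hypothetical, not a decision taken in this thread. Scripts are named by their ISO 15924 codes.

```go
package main

import "fmt"

// common is the ISO 15924 code for characters shared between scripts, such as "…".
const common = "Zyyy"

// Char is a hypothetical pre-analysed character.
type Char struct {
	R        rune
	Script   string // ISO 15924 code, e.g. "Latn", "Hira", "Hani", "Zyyy"
	Sideways bool   // true when Vertical_Orientation is R or Tr
}

// uprightScripts is a hard-coded, illustrative list of scripts whose
// surrounding punctuation should stay upright.
var uprightScripts = map[string]bool{"Hani": true, "Hira": true, "Kana": true}

// rotations decides, per character, whether it should be rotated. Common
// characters take the script of the closest preceding non-Common character.
func rotations(chars []Char) []bool {
	out := make([]bool, len(chars))
	context := ""
	for i, c := range chars {
		if c.Script != common {
			context = c.Script
		}
		script := c.Script
		if script == common && context != "" {
			script = context
		}
		if uprightScripts[script] {
			out[i] = false // script override: keep upright
		} else {
			out[i] = c.Sideways // fall back to Vertical_Orientation
		}
	}
	return out
}

func main() {
	text := []Char{
		{'A', "Latn", true},
		{'ら', "Hira", false},
		{'…', common, true},
	}
	fmt.Println(rotations(text))
}
```

With this rule the Latin letter is still rotated, while the "…" following Japanese text stays upright even though its Vertical_Orientation value is R.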

hajimehoshi commented 7 months ago

I tested the HTML with a Japanese font (Noto Sans CJK JP), and the default rendering worked well in Chrome. In Firefox, the results seem a little suspicious. In Chrome, the browser default looks as I expect, since Latin letters are rotated and a special vertical glyph is used for "…". I don't know how browsers work, but as you explained, emulating this behavior is pretty difficult, right?

Chrome: image

Firefox: image

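HTML (table contents; each column applies a different orientation to the same text):

| Script | Horizontal | Upright | Sideways | Browser default |
| --- | --- | --- | --- | --- |
| Latin | ABCD | ABCD | ABCD | ABCD |
| Mongolian | ᠬᠦᠮᠦᠨ ᠪᠦᠷ | ᠬᠦᠮᠦᠨ ᠪᠦᠷ | ᠬᠦᠮᠦᠨ ᠪᠦᠷ | ᠬᠦᠮᠦᠨ ᠪᠦᠷ |
| Japanese | ABCDもつ青いそら… | ABCDもつ青いそら… | ABCD青いそら… | ABCD青いそら… |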

That will be the case using the "sideways" mode (like browsers do), but the "upright" mode will stay broken, and I think the real solution would be for the font author to include proper vertical metrics ('vmtx' table).

I see. I don't think I need "upright" mode for Mongolian script in any environment, so I am OK.

Yes, you have nicely summed up the issue, and overall I agree with your pseudo-code solution. However, I don't think there is a proper way to detect theGlyphIsAlreadySpecialVerticalGlyph (without shaping early on), so we will instead use the context. In your example, we would associate the ... with the Japanese script, and thus use "upright" mode. This means we can't rely only on the Vertical_Orientation table, and we will need some other heuristics. Since there are not too many scripts using vertical layout, maybe we can hard-code some behaviors per script.

Interesting. From the above result, I thought Chrome had some special magic that does theGlyphIsAlreadySpecialVerticalGlyph, but I am not sure. If we don't follow the Chrome way, maybe this is the compromise?

func shouldRotateGlyph() bool {
    if theFontUsesVertFeature {
        return false
    }
    if theUnicodePropertyIsROrTr {
        return true
    }
    return false
}

In this case, "ABCD" in the Japanese text in the above example's "browser default" would not be rotated. Well, I think that's fine, though that's not the best...
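A runnable version of this font-level compromise makes the trade-off visible. The function name and both parameters are hypothetical inputs; the only facts relied on are that 'A' and "…" are both R in Vertical_Orientation, as discussed above.

```go
package main

import "fmt"

// shouldRotateFontLevel is the compromise above: the decision is taken once
// per font instead of once per glyph.
func shouldRotateFontLevel(fontUsesVert, orientationIsROrTr bool) bool {
	if fontUsesVert {
		return false
	}
	return orientationIsROrTr
}

func main() {
	// 'A' and '…' are both R, so they always get the same answer here.
	for _, fontUsesVert := range []bool{true, false} {
		fmt.Printf("font uses vert=%v: rotate 'A'=%v, rotate '…'=%v\n",
			fontUsesVert,
			shouldRotateFontLevel(fontUsesVert, true),
			shouldRotateFontLevel(fontUsesVert, true))
	}
}
```

With a font that uses the vert feature, neither 'A' nor "…" is rotated, which is the limitation described above; without it, both are rotated.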

khaledhosny commented 7 months ago

You can check if the vert feature applies to the glyph of a Tr character and not rotate it; here is how it is done in LibreOffice: https://git.libreoffice.org/core/+/refs/heads/master/vcl/source/gdi/CommonSalLayout.cxx#178.
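One way to read this suggestion is sketched below. It is not a transcription of the LibreOffice code: the `Font` type is a hypothetical stand-in that models the vert feature as a plain glyph-to-glyph map, so contextual or multi-glyph substitutions are ignored, and the caller is assumed to supply the Vertical_Orientation value.

```go
package main

import "fmt"

type GlyphID uint32

// VO mirrors the four Vertical_Orientation values.
type VO int

const (
	U VO = iota
	Tu
	R
	Tr
)

// Font is a hypothetical stand-in holding only what this sketch needs.
type Font struct {
	Cmap map[rune]GlyphID    // nominal glyph for each character
	Vert map[GlyphID]GlyphID // single substitutions of the 'vert' feature
}

// vertApplies reports whether the vert feature would replace the nominal
// glyph of r.
func (f Font) vertApplies(r rune) bool {
	gid, ok := f.Cmap[r]
	if !ok {
		return false
	}
	_, ok = f.Vert[gid]
	return ok
}

// shouldRotate rotates R characters, rotates Tr characters only when the font
// offers no vertical alternate, and leaves U and Tu upright.
func shouldRotate(f Font, r rune, vo VO) bool {
	switch vo {
	case R:
		return true
	case Tr:
		return !f.vertApplies(r)
	default:
		return false
	}
}

func main() {
	withVert := Font{
		Cmap: map[rune]GlyphID{'ー': 10},
		Vert: map[GlyphID]GlyphID{10: 11},
	}
	withoutVert := Font{Cmap: map[rune]GlyphID{'ー': 10}}

	fmt.Println(shouldRotate(withVert, 'ー', Tr))
	fmt.Println(shouldRotate(withoutVert, 'ー', Tr))
}
```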

benoitkugler commented 7 months ago

You can check if the vert feature applies to the glyph of a Tr character and not rotate it; here is how it is done in LibreOffice: https://git.libreoffice.org/core/+/refs/heads/master/vcl/source/gdi/CommonSalLayout.cxx#178.

Thank you for the nice pointer! At first glance, I'm a bit reluctant to go down this road, because I'm worried about the possible substitutions, and also because Harfbuzz may synthesize its own replacement if the font has no 'vert' feature. But perhaps it does not matter in practice?

(Also, generally speaking, I would like to avoid interfering with Harfbuzz and avoid guessing what it will do or not... )