wez / wezterm

A GPU-accelerated cross-platform terminal emulator and multiplexer written by @wez and implemented in Rust
https://wezfurlong.org/wezterm/

Support for Devanagari / displaying text in Bangla #1333

Open wez opened 2 years ago

wez commented 2 years ago

Discussed in https://github.com/wez/wezterm/discussions/1332

Originally posted by **frogtile** November 21, 2021

How should I configure Wezterm to be able to view text in Bangla with proper rendering/layout?

Problem:

echo বাংলা ভাষা

produces incorrect spacing and some of the glyphs appear to the right of the cursor, so there is a mismatch between the terminal model and how the text renders.

I'm not an expert on this script/language, but it appears to use ligatures and half-width spacing. If I force the render routines to use the x advance metrics from HarfBuzz directly, then it looks better in wezterm, but there are still issues with cursor positioning and rendering.

This needs more analysis to figure out the path forwards!

MuhammedZakir commented 2 years ago

All Devanagari-related fonts don't render properly (at least, the ones I have tried don't). I use some of them, so feel free to ping me if you need a tester.

https://en.wikipedia.org/wiki/Devanagari_(Unicode_block)

P.S. When searching for solutions, using "devanagari" may give better results.

poetaman commented 2 years ago

AFAIK, there are no monospace fonts for Indic scripts; correct me if I am wrong. Vim doesn't support them either, because of the difficulties involved. Here's a repo I found that aims to provide Indic monospace fonts that work correctly in terminals, but Bengali script is not yet supported: https://github.com/monotty/fonts

wez commented 2 years ago

FWIW, neither the macOS Terminal.app nor iTerm2 is able to render this any better than wezterm.

image

This is definitely a tricky one. In the screenshot you can see shaping information for the text in the OP. On the right-hand side you can see wezterm running with an internal mode that positions glyphs by pixel; that shows that the information in the font is sufficient to render the input text as intended.

The difficulty is this: that input text divides into 4 graphemes, with a space in the middle. Except for the space, those graphemes count as single wide according to the way that we compute the column width, giving a total of 5 columns in width.

The glyph information shows that the space has x_advance=8 (so my cells are 8 pixels wide), but the total pixel width of that sequence is 7+4+6+10+4+8+10+4+8+4 = 65, which is a little over 8 cells wide (65 / 8 ≈ 8.1); the text renders about 3 cells wider than we think it should according to our established monospace cell width rules.
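To make the arithmetic above concrete, here is a minimal sketch in Python. The advances, the cell width and the column count are copied from the comment; the variable names are made up for illustration and are not wezterm identifiers.

```python
import math

# Per-glyph x advances in pixels, as listed in the comment above.
advances = [7, 4, 6, 10, 4, 8, 10, 4, 8, 4]
cell_width = 8      # x_advance of the space glyph, i.e. one terminal cell
model_columns = 5   # 4 graphemes + 1 space, each counted as one column

total_px = sum(advances)                             # shaped width of the run
rendered_cells = total_px / cell_width               # width in (fractional) cells
cells_touched = math.ceil(rendered_cells)            # cells the run overlaps
overflow_px = total_px - model_columns * cell_width  # pixels past the model width

print(total_px, rendered_cells, cells_touched, overflow_px)
```

The shaped run is 25 pixels wider than the 5 columns the terminal model reserves for it, which is why glyphs end up to the right of the cursor.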

More analysis is needed to understand whether and how our width algorithm differs from eg: glibc's wcwidth and related functions--those are used by eg: editors, so if those disagree with the terminal emulator, that's another source of visual artifacts.
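As a rough illustration of how two width rules can disagree on the same text, the sketch below compares a per-grapheme count with a simplified per-code-point rule. Both rules are assumptions made for the example: the per-code-point rule (zero width only for categories Mn and Me, two columns for East Asian wide characters) is in the spirit of wcwidth but is not claimed to match glibc's tables, and the grapheme clusters are split by hand to match the 4 graphemes plus space described above.

```python
import unicodedata

# "বাংলা ভাষা", split by hand into the clusters described above (assumption).
clusters = [
    "\u09ac\u09be\u0982",  # ba + vowel sign aa + anusvara
    "\u09b2\u09be",        # la + vowel sign aa
    " ",
    "\u09ad\u09be",        # bha + vowel sign aa
    "\u09b7\u09be",        # ssa + vowel sign aa
]

def codepoint_width(ch):
    """Simplified wcwidth-style rule (hypothetical, not glibc's actual table)."""
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0  # non-spacing and enclosing marks take no column
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters take two columns
    return 1

per_cluster = len(clusters)  # one column per grapheme cluster
per_codepoint = sum(codepoint_width(ch) for c in clusters for ch in c)

print(per_cluster, per_codepoint)
```

The dependent vowel signs and the anusvara in this text are spacing marks (category Mc), so a rule that only zeroes out Mn and Me counts them as full columns. An application using such a rule would place the cursor at a different column than a terminal that counts one column per grapheme.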

Then when we're sure we understand those expectations, we can try to find a way to map to it. It may require scaling down the size of some of these runs of text to fit in the terminal cells, or it may be that we expose a mode where we explicitly skip trying to map the glyphs to the terminal cell model (as shown in the picture above), but the consequence will be more discrepancies in terms of columns not lining up and so on.
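The two options mentioned above can be sketched as follows, reusing the advances from the earlier comment. The function names `scale_to_fit` and `pen_positions` are hypothetical and only illustrate the idea; this is not how wezterm implements either mode.

```python
from itertools import accumulate

advances = [7, 4, 6, 10, 4, 8, 10, 4, 8, 4]  # pixels, from the earlier comment
cell_width = 8
model_columns = 5

def scale_to_fit(advances, columns, cell_width):
    """Option 1: shrink the run so its total advance equals the model width."""
    factor = (columns * cell_width) / sum(advances)
    return factor, [a * factor for a in advances]

def pen_positions(advances):
    """Option 2: ignore the cell grid; each glyph starts at the running offset."""
    return [0] + list(accumulate(advances))[:-1]

factor, scaled = scale_to_fit(advances, model_columns, cell_width)
print(round(factor, 3))         # horizontal scale applied to the whole run
print(pen_positions(advances))  # x offset in pixels of each glyph
```

With these numbers, option 1 would squeeze the run to roughly 60% of its natural width, which may hurt legibility, while option 2 keeps the glyphs at their natural size but lets the run extend past the 40 pixels the cell model reserves for it.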

Is there an example of a terminal emulator that does an excellent job with devanagari?

pranphy commented 1 year ago

> Is there an example of a terminal emulator that does an excellent job with devanagari?

I have tried quite a few terminal emulators in search of good support for Devanagari script. The only one I am happy with is the Konsole terminal emulator.

Here is an example of the text displayed in the terminal. image

The text rendering is almost perfect (apart from a few small caveats).

But I have noticed that the rendering works only with certain fonts/variations. For example, this is the Oblique variation of the Iosevka Term font.

Here is the same file with Iosevka Term Regular image

You can see that the first word of the last line didn't render properly in the Regular variation. I suppose that has to do with the font implementation rather than with the terminal rendering it.

Here is wezterm rendering the same text with the same font (Iosevka Term Regular). image

shreevatsa commented 2 months ago

There are no fixed-width fonts for Devanagari or other Indic scripts (at least, none that I'm aware of: let's just say none in widespread use that can be expected to be available on typical users' systems). It is impossible to render Indic scripts properly while trying to stay aligned to an integer number of cells. The only sane solution (what we see in the konsole rendering examples above) is to abandon cells (widths that are integer multiples of some fixed width), and simply render as per the font. (IMO for anyone trying to read the text, columns not lining up is a less severe problem than the text itself being mangled.)

shreevatsa commented 2 months ago

Regarding the question of whether other terminals do better: I just made a quick post collecting some screenshots here: https://shreevatsa.net/post/terminal-indic/ and all the ones I tried on macOS are terrible. From other screenshots (like the ones above), it appears that konsole, gnome-terminal and (even better) mlterm do better on Linux.