aligrudi / neatvi

A small vi/ex editor for editing UTF-8 text
http://litcave.rudi.ir/

Hebrew diacritical marks do not render #58

Closed: mcookly closed 1 year ago

mcookly commented 1 year ago

Even with the patch recommended in issue #1, NeatVI renders Hebrew diacritical marks as �, so

קֵן לַצִּפּוֹר

renders in NeatVI as

ק�ן ל�צ��פ�ו�ר

The terminal font (Cousine) can render these diacritical marks perfectly. I've also tested this in various terminals (Kitty, Alacritty, Terminal.app), and they all exhibit this issue. (I'm testing with the bicon [test file](https://github.com/behdad/bicon/blob/master/testtext/kan zipor).)

Thanks for the awesome work!

aligrudi commented 1 year ago

Neatvi renders zero-width and combining characters as �. To override that, add those characters to the placeholders[] array in conf.h. For instance, the following entry replaces every occurrence of character x with character y, and the width of y on the screen is 1.

    .. } placeholders[] = {
        ...
        {"x", "y", 1},
    };
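As a rough illustration of how such a table could be consulted, here is a minimal, self-contained sketch. The struct field names (`src`, `dst`, `wid`) and the helper `placeholder_find()` are assumptions made for this example and are not taken from conf.h; only the `{"x", "y", 1}` entry comes from the description above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical mirror of the table described above: a source character,
 * the text shown in its place, and that text's width in screen columns.
 * The field names are assumptions, not copied from conf.h.
 */
struct placeholder {
	const char *src;	/* UTF-8 character to replace */
	const char *dst;	/* UTF-8 text shown instead */
	int wid;		/* screen width of dst, in columns */
};

static const struct placeholder placeholders[] = {
	{"x", "y", 1},		/* the example entry from the comment above */
};

/* Return the entry whose source is exactly the n-byte character at s, or NULL. */
static const struct placeholder *placeholder_find(const char *s, size_t n)
{
	size_t i;
	for (i = 0; i < sizeof(placeholders) / sizeof(placeholders[0]); i++)
		if (strlen(placeholders[i].src) == n &&
				!memcmp(placeholders[i].src, s, n))
			return &placeholders[i];
	return NULL;
}
```

A renderer built this way would call `placeholder_find()` once per character and, on a match, draw `dst` and advance the cursor by `wid` columns.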

Ali

mcookly commented 1 year ago

Thanks for the quick response! Diacritical marks are now rendered, but they are offset to the left by one character.

[Screenshot: offset_issue]

Currently, a Hebrew diacritical mark placeholder looks like this in my code:

{"֑", "֑", 1}, // U+0591

Am I missing some character to fix the offset?

aligrudi commented 1 year ago

The third field of this struct must match the width of the character on the screen; since it combines with its preceding character, its width must be zero. However, neatvi does not show zero-width characters. For Arabic, we place a tatweel ـ or dotted circle ◌ before the combining character (see the default entries of placeholders[]).

The easiest fix is to do the same for Hebrew. However, there is a terminal issue (bug?) that makes the combining character appear on the wrong character (see issue #54).
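To make the dotted-circle approach concrete for Hebrew, the sketch below generates one table entry per Hebrew combining mark, pairing each mark with a preceding ◌ (U+25CC) and declaring a width of 1 for the pair. The helper names (`utf8_put`, `hebrew_mark`, `hebrew_entry`) are hypothetical, and the set of code points is the nonspacing marks (general category Mn) of the Unicode Hebrew block as I understand it; it is worth checking against the Unicode data before relying on it. The terminal issue mentioned above may still affect how the result is displayed.

```c
#include <stdio.h>

#define DOTTED_CIRCLE	0x25cc	/* U+25CC, used as a visible base character */

/* Encode code point c (below U+10000) as UTF-8 into d; return the byte count. */
static int utf8_put(char *d, unsigned c)
{
	if (c < 0x80) {
		d[0] = (char) c;
		return 1;
	}
	if (c < 0x800) {
		d[0] = (char) (0xc0 | (c >> 6));
		d[1] = (char) (0x80 | (c & 0x3f));
		return 2;
	}
	d[0] = (char) (0xe0 | (c >> 12));
	d[1] = (char) (0x80 | ((c >> 6) & 0x3f));
	d[2] = (char) (0x80 | (c & 0x3f));
	return 3;
}

/* Hebrew accents and points that combine with the preceding letter. */
static int hebrew_mark(unsigned c)
{
	return (c >= 0x0591 && c <= 0x05bd) || c == 0x05bf ||
		c == 0x05c1 || c == 0x05c2 ||
		c == 0x05c4 || c == 0x05c5 || c == 0x05c7;
}

/* Write one table entry for mark c into buf; return snprintf()'s result. */
static int hebrew_entry(char *buf, size_t len, unsigned c)
{
	char src[8], dst[8];
	int n = utf8_put(src, c);		/* the bare combining mark */
	int m = utf8_put(dst, DOTTED_CIRCLE);	/* dotted circle first ... */
	src[n] = '\0';
	m += utf8_put(dst + m, c);		/* ... then the mark itself */
	dst[m] = '\0';
	return snprintf(buf, len, "\t{\"%s\", \"%s\", 1},\t/* U+%04X */\n",
			src, dst, c);
}
```

Looping over U+0591 to U+05C7, skipping code points for which `hebrew_mark()` returns 0, and printing each `hebrew_entry()` result yields lines that can be pasted into the placeholders[] array.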

When writing Neatvi, I decided not to render two characters on the same screen position (like combining characters or ligatures). One solution is to modify led_render() to allow multiple characters on the same screen position. I do not like it though, because it would make editing a text with such characters more difficult.
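The trade-off can be seen with a toy column calculation. This is only an illustration of the one-character-per-position idea, not neatvi's led_render(); the function name `columns()` and the width arrays are made up for the example.

```c
/*
 * Assign a starting screen column to each of n characters, given the
 * width of each one. A character of width 0 ends up on the same column
 * as the character that follows it.
 */
static void columns(const int *wid, int n, int *col)
{
	int i, c = 0;
	for (i = 0; i < n; i++) {
		col[i] = c;
		c += wid[i];
	}
}
```

For a letter, a combining mark, and another letter, widths of {1, 0, 1} give columns {0, 1, 1}, so two characters share a position and the cursor cannot rest on the mark alone. With a width-1 placeholder for the mark, widths of {1, 1, 1} give columns {0, 1, 2}, and every character can be addressed separately.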

Ali

mcookly commented 1 year ago

Implementing the changes in issue #54 does correct the position of diacritical marks in Alacritty (which does not support ligatures by default), but terminals with proper diacritical support still display the marks on the wrong character. I'll have to look more into this at some point.

I'll close this since my question was answered. If I find something else, I'll drop a comment or reopen. Thanks again!