aligrudi / neatvi

A small vi/ex editor for editing UTF-8 text
http://litcave.rudi.ir/

Poor Arabic Support. #54

Open LaithOsama opened 1 year ago

LaithOsama commented 1 year ago

Hello. As an Arabic user, I have to use some special Arabic symbols such as ( ْ ِ ٍ ٌ ُ ً َ ). For some reason, Neatvi renders them as (_). Example:

دَعوني أُوَفّي السَيفَ في الحَربِ حَقَّهُ وَأَشرَبُ مِن كاسِ المَنِيَّةِ صافِيا وَمَن قالَ إِنّي سَيِّدٌ وَاِبنُ سَيِّدٍ فَسَيفي وَهَذا الرُمحُ عَمّي وَخالِيا

In Neatvi it is rendered as: image or:

ـﻪــﻘـﺣ ـبﺮـﺤﻟا ﻲﻓ ـﻒﻴـﺴﻟا ﻲـﻓـوـأ ﻲﻧﻮﻋـد ﺎﻴـﻓﺎﺻ ـﺔــﻴـﻨـﻤﻟا ـسﺎﻛ ﻦـﻣ ـبـﺮﺷـأـو ـﺪــﻴـﺳ ـﻦﺑـاـو ـﺪــﻴـﺳ ﻲـﻧـإ ـلﺎﻗ ﻦـﻣـو ﺎﻴـﻟﺎﺧـو ﻲـﻤـﻋ ـﺢﻣـﺮﻟا اﺬـﻫـو ﻲﻔﻴـﺴـﻓ

Sadly, I don't know C, so I can't figure out what the problem is.

aligrudi commented 1 year ago

Hi,


Neatvi uses placeholders for some characters (especially those that cannot be printed); the user can specify a list of such characters (see the placeholders array in conf.h). For Arabic combining characters (vowel marks), Neatvi inserts a Keshide (Tatweel) character just before each of them so that they are easier to search for, add, or remove (without it, multiple characters are rendered in the same screen position).

They are probably not rendered correctly in your terminal. Do you get the same result in other terminals? Note that you may need to redefine LNPREF (see conf.h).

Ali

LaithOsama commented 1 year ago

I just tested it in some fancy terminals that support ligatures, such as alacritty. The result is better: image

I found the placeholders array in conf.h, but I can't figure out the format for specifying the list of characters. Can you give me an example? Anyway, you did amazing work, bro; this is the best vi clone that has ever existed, and with bidi support, a dream come true for many of us. God bless you.

aligrudi commented 1 year ago

Laith @.***> wrote:

I just tested it in some fancy terminals that support ligatures, such as alacritty. The result is better: image

This does not seem entirely correct, though. The terminal is rendering the combining characters in the wrong place. They should have been placed on the Keshide (Tatweel) characters, not on the characters preceding them. I am not sure if we can fix this, though.

I found the placeholders array in conf.h, but I can't figure out the format for specifying the list of characters. Can you give me an example? Anyway, you did amazing work, bro; this is the best vi clone that has ever existed, and with bidi support, a dream come true for many of us.

The first field is the UTF-8 character. The second is its replacement. The third field is the length of the replacement on the screen. For instance, after adding the following entry, → is rendered as ->.

{"→", "->", 2},
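To make the three fields concrete, here is a minimal sketch of how such entries could be declared and looked up. The struct name, the field names (src, dst, wid), and the helper find_placeholder() are assumptions made for illustration only; the actual declaration is the one in conf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of one entry; the real declaration is in conf.h. */
struct placeholder {
	const char *src;	/* the UTF-8 character to replace */
	const char *dst;	/* what is printed instead */
	int wid;		/* screen columns used by dst */
};

static const struct placeholder placeholders[] = {
	{"→", "->", 2},		/* the entry from the reply above */
};

/* Return the entry whose src starts the string s, or NULL if there is none. */
static const struct placeholder *find_placeholder(const char *s)
{
	size_t i;
	assert(s != NULL);
	for (i = 0; i < sizeof(placeholders) / sizeof(placeholders[0]); i++) {
		size_t n = strlen(placeholders[i].src);
		if (strncmp(s, placeholders[i].src, n) == 0)
			return &placeholders[i];
	}
	return NULL;
}
```

With this sketch, find_placeholder("→") yields the entry whose replacement is -> and whose width is 2, while a character that is not listed yields NULL.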

God bless you.

Thanks.

Ali

LaithOsama commented 1 year ago

I understand what you mean, but I have tried it in other terminals and got the same result. I can try it in VTE-based terminals that support bidi if you want.

Also, I don't think specifying a list of characters in the placeholders array can help in my case; I don't know, I probably did it wrong. Anyway, I am also not sure if we can fix this. Should I close this issue, or is there something else to do? Thank you.

aligrudi commented 1 year ago

Laith @.***> wrote:

I understand what you mean, but I have tried it in other terminals and got the same result. I can try it in VTE-based terminals that support bidi if you want.

This is just how they render it: if the combining character comes after a Keshide, it is rendered on the character that precedes the Keshide.

Also, I don't think specifying a list of characters in the placeholders array can help in my case; I don't know, I probably did it wrong. Anyway, I am also not sure if we can fix this. Should I close this issue, or is there something else to do?

The function ren_placeholder() in ren.c is where the Keshide character is inserted before combining characters (have a look at the if block with the uc_iscomb(s) condition). So if this condition is removed and the combining character is not listed in the placeholders array, "�" is printed instead (Neatvi renders nonprintable and zero-width characters as "�").
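The behaviour described above can be sketched roughly as follows. This is not the code from ren.c: utf8_len(), utf8_code(), is_arabic_mark() and sketch_placeholder() are hypothetical names, is_arabic_mark() is only a stand-in for uc_iscomb() that covers the twelve marks mentioned in this thread, and the with_keshide flag merely models "the condition is present" versus "the condition is removed".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 sequence starting at s (assumes well-formed input). */
static int utf8_len(const char *s)
{
	unsigned char c = (unsigned char) s[0];
	if ((c & 0xe0) == 0xc0)
		return 2;
	if ((c & 0xf0) == 0xe0)
		return 3;
	if ((c & 0xf8) == 0xf0)
		return 4;
	return 1;
}

/* Code point of the UTF-8 sequence starting at s (assumes well-formed input). */
static long utf8_code(const char *s)
{
	const unsigned char *u = (const unsigned char *) s;
	switch (utf8_len(s)) {
	case 2:
		return ((u[0] & 0x1f) << 6) | (u[1] & 0x3f);
	case 3:
		return ((u[0] & 0x0f) << 12) | ((u[1] & 0x3f) << 6) | (u[2] & 0x3f);
	case 4:
		return ((long) (u[0] & 0x07) << 18) | ((u[1] & 0x3f) << 12) |
			((u[2] & 0x3f) << 6) | (u[3] & 0x3f);
	}
	return u[0];
}

/* Stand-in for uc_iscomb(): only the twelve marks listed in this thread. */
static int is_arabic_mark(long c)
{
	return (c >= 0x064b && c <= 0x0655) || c == 0x0670;
}

/* Hypothetical model of the step described above, not the real function. */
static const char *sketch_placeholder(const char *s, int with_keshide)
{
	static char buf[8];
	int n;
	assert(s != NULL);
	if (!is_arabic_mark(utf8_code(s)))
		return NULL;		/* not a mark: no placeholder in this sketch */
	if (!with_keshide)
		return "\xef\xbf\xbd";	/* U+FFFD, the fallback mentioned above */
	n = utf8_len(s);		/* always 2 for these marks */
	memcpy(buf, "\xd9\x80", 2);	/* U+0640, Keshide (Tatweel) */
	memcpy(buf + 2, s, n);
	buf[2 + n] = '\0';
	return buf;
}
```

The sketch leaves out the lookup in the placeholders array, which, according to the explanation above, takes effect when a character is listed there.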

It is probably more transparent if we remove the condition in ren_placeholder(); we can add entries to the placeholders array instead. Something like this (this is what is done now; Neatvi places a Keshide before each combining character):

{"ّ", "ـّ", 1},
{"َ", "ـَ", 1},
...

Any suggestions for alternatives that look better?

Ali

aligrudi commented 1 year ago

Laith @.***> wrote:

I understand what you mean, but I have tried it in other terminals and got the same result. I can try it in VTE-based terminals that support bidi if you want.

I just checked. The behaviour of these terminals is inconsistent with graphical frameworks: graphical frameworks render the combining character on the Tatweel, not on the character preceding it. See how your browser renders سـَه. I have not checked which one is correct according to the standard, though.

Also, I don't think specifying a list of characters in the placeholders array can help in my case; I don't know, I probably did it wrong. Anyway, I am also not sure if we can fix this. Should I close this issue, or is there something else to do?

Another alternative is using ◌ instead of _. You can test it by adding the following entries to the placeholders array:

{"ً", "◌ً", 1},
{"ٌ", "◌ٌ", 1},
{"ٍ", "◌ٍ", 1},
{"َ", "◌َ", 1},
{"ُ", "◌ُ", 1},
{"ِ", "◌ِ", 1},
{"ّ", "◌ّ", 1},
{"ْ", "◌ْ", 1},
{"ٓ", "◌ٓ", 1},
{"ٔ", "◌ٔ", 1},
{"ٕ", "◌ٕ", 1},
{"ٰ", "◌ٰ", 1},

However, the terminal that I tried (and probably others too) renders the combining character on the character that precedes ◌, probably by mistake.

Ali

LaithOsama commented 1 year ago

Sorry for my late reply; I had some exams to complete. Removing that condition in ren_placeholder() and specifying a list of entries in the placeholders array instead let me write combining characters and have them rendered correctly, sadly without Keshide (Tatweel). The result: image

Placeholder array:

{"ً", "ً", 1},
{"ٌ", "ٌ", 1},
{"ٍ", "ٍ", 1},
{"َ", "َ", 1},
{"ُ", "ُ", 1},
{"ِ", "ِ", 1},
{"ّ", "ّ", 1},
{"ْ", "ْ", 1},
{"ٓ", "ٓ", 1},
{"ٔ", "ٔ", 1},
{"ٕ", "ٕ", 1},
{"ٰ", "ٰ", 1},

I tried out both "_" and "◌" (the alternatives you suggested), but neither of them helped. I tested them in several terminals, such as st, alacritty and fbpad, with no luck. In fact, the result above is what I wanted, so feel free to close the issue. God (Allah) bless you, brother. Thank you very much.

LaithOsama commented 1 year ago

Update: I tested it further today and it seems to insert spaces at the beginning of a line every time I type those characters.

aligrudi commented 1 year ago

Laith @.***> wrote:

Update: I tested it further today and it seems to insert spaces at the beginning of a line every time I type those characters.

Sorry for my delayed answer.

The problem is that the on-screen width of each of these placeholders is zero, but you specified one (the third field of the struct); this field specifies how many screen columns the placeholder occupies.

If you specify zero as their length, these characters would not be visible at all.
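A rough way to see the mismatch is to count the columns that would be accounted for under each width. The sketch below is only an illustration under simplifying assumptions: is_mark() and counted_columns() are hypothetical names, only the marks from this thread are recognised, and every other character is assumed to take exactly one column.

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero if the two bytes at s encode one of the marks discussed here
 * (U+064B..U+0655 or U+0670); in UTF-8 all of them start with the byte 0xD9. */
static int is_mark(const unsigned char *s)
{
	return s[0] == 0xd9 && ((s[1] >= 0x8b && s[1] <= 0x95) || s[1] == 0xb0);
}

/* Columns accounted for: mark_wid per mark, one per any other character. */
static int counted_columns(const char *str, int mark_wid)
{
	const unsigned char *s = (const unsigned char *) str;
	int cols = 0;
	assert(str != NULL);
	while (*s) {
		if (is_mark(s)) {
			cols += mark_wid;
			s += 2;
			continue;
		}
		cols++;
		s++;
		while ((*s & 0xc0) == 0x80)	/* skip UTF-8 continuation bytes */
			s++;
	}
	return cols;
}
```

For the two-character string "\xd8\xb3\xd9\x8e" (Seen followed by Fatha), counted_columns() gives 2 with a width of one and 1 with a width of zero. A terminal that draws the mark on top of the letter uses a single column, so a width of one leaves one extra column per mark, which would be consistent with the extra spaces reported above.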

Ali