aligrudi / neatvi

A small vi/ex editor for editing UTF-8 text
http://litcave.rudi.ir/
305 stars 25 forks source link

ren_placeholder() is bugged with combining characters #42

Closed kyx0r closed 2 years ago

kyx0r commented 2 years ago

Hello Ali, this question has been on my mind for a very long time parlially because I can't understand why it happens. To be fair this likely isn't a problem for the general user, that being a rare edge case where the text rendered by led_printparts() gets printed on the next line if there is some kind of special CR2L character in the line. The cursor drops down a line. I can't understand what is causing this, maybe the -1 row parameter to led_print() does not do what's expected if there is some kind of special character in the line?

define LNPREF does not help, and it makes sense cause I don't use terminal like that

To reproduce goto conf.h, find the line that says CR2L, enter insert at the 0 col, insert a bunch of tabs until the line goes over the screen.

The result should look like this:

https://kyryl.tk/neatvi_led_print.png https://kyryl.tk/nextvi_led_print.png https://kyryl.tk/if_keep_inserting_tabs.png

Nextvi does string rendering in many ways differently (as it's more efficient) than neatvi but as you can see it also happens there. It probably doesn't need a fix as this edge case seems to only be caused by that CR2L string only, I just want to know why it happens and If can be easily fixed?

kyx0r commented 2 years ago

I am not currently using fbpad, but I tested it there some time and same thing happens.

kyx0r commented 2 years ago

Okay yes I know kinda understand what the problem is, the last visible character is combining, simply omitting the last character solves the problem, so actually this is a bug where vi tries to render a few bytes more than there is on the screen. I think Nextvi will have this bug fixed soon, working on it now that I actually found the cause.

kyx0r commented 2 years ago

Ok, why is the combining character implemented like that? ren_placeholder() is bugged and will return more characters than needed into term_commit() resulting in this incorrect string rendering.

Take a look at my commit 8f788fede224a1e60d0abf7eabf2565b74f0d679 with a proposed fix, it works just fine. When we try to render the last possible comb character we must return only 1 byte, so that there is no excess spill.

kyx0r commented 2 years ago

Ok well, it fixed the problem on one terminal but still does not work in tty, it seems like the terminal decides to render the combining characters however the hell it wants? Like in tty it thinks that the special character hexed 0640 is the width of 2 chars? As a result it completely throws off the result and cursor position mismatches the actual character.

kyx0r commented 2 years ago

Ok, I think it makes sense better now. It's like that because tty does not actually even try to combine the characters, so it's valid that it ends up being 2 chars. Nonetheless this is still utterly wrong. Is there a way to figure out if terminal can combine the chars?

kyx0r commented 2 years ago

I've just checked fbpad again and it works correctly without a patch there. That just tells me that st terminal simply does not render combining chars correctly if it gets a combining char on the boundary, ie the last visible col. The patch worked because nextvi did essentially what the terminal should of done.

What should I do Ali? Report it to Hiltjo in hopes it gets fixed in the terminal or keep the patch in vi?

kyx0r commented 2 years ago

I think we can make an ex option to disable the dialectical mark, so that users from tty don't get all the junk extra characters, or even just test using istty and set the variable automatically?

aligrudi commented 2 years ago

Hi,

Kyryl @.***> wrote:

I think we can make an ex option to disable the dialectical mark, so that users from tty don't get all the junk extra characters, or even just test using istty and set the variable automatically?

The width of a combining character is zero when rendered, and zero-width characters are not very convenient in an editor. If a character has a combining character attached to it, then it is not possible to remove or replace the combining character alone. In general, for any zero-width and combining characters, Neatvi shows the unknown character mark (for example à), unless it is listed in placeholders array. For Arabic combining characters it places a Keshideh character before them, so that they can be edited individually. If the terminal does not support the correct rendering of combining characters, this would not be rendered correctly. If you want to disable that, you can comment the "if (uc_iscomb(s))" block in ren_placeholder.

Ali
kyx0r commented 2 years ago

Ali Gholami Rudi @.***> wrote:

Hi,

Kyryl @.***> wrote:

I think we can make an ex option to disable the dialectical mark, so that users from tty don't get all the junk extra characters, or even just test using istty and set the variable automatically?

The width of a combining character is zero when rendered, and zero-width characters are not very convenient in an editor. If a character has a combining character attached to it, then it is not possible to remove or replace the combining character alone. In general, for any zero-width and combining characters, Neatvi shows the unknown character mark (for example a*), unless it is listed in placeholders array. For Arabic combining characters it places a Keshideh character before them, so that they can be edited individually. If the terminal does not support the correct rendering of combining characters, this would not be rendered correctly. If you want to disable that, you can comment the "if (uc_iscomb(s))" block in ren_placeholder.

Ali

Hmm, I figured it out yesterday. What do you think, should I report a bug with st terminal though? It seems like st terminal does not stop combining the character at the last visible col, resulting in displaying corrupted lines. Almost like it's trying to split the excess part somehow, but it's just wrong. I won't think it was intentional. But also I can just keep the vi workaround as it's simple cause it's just 1 if statement. I highly doubt suckless would try to fix the bug even if you pin it right in their face, it be too difficult to explain anyways.

aligrudi commented 2 years ago

Kyryl @.***> wrote:

Hmm, I figured it out yesterday. What do you think, should I report a bug with st terminal though? It seems like st terminal does not stop combining the character at the last visible col, resulting in displaying corrupted lines. Almost like it's trying to split the excess part somehow, but it's just wrong. I won't think it was intentional. But also I can just keep the vi workaround as it's simple cause it's just 1 if statement. I highly doubt suckless would try to fix the bug even if you pin it right in their face, it be too difficult to explain anyways.

Do combining diacritical marks work in st for the case you mentioned? You can fill a line with à to test that.

Ali