xtermjs / xterm.js

A terminal for the web
https://xtermjs.org/
MIT License
17.46k stars 1.62k forks source link

Incorrect reflow when a CJK character is after line wrap and a null cell before line wrap #3097

Open mmis1000 opened 4 years ago

mmis1000 commented 4 years ago

Details

Steps to reproduce

  1. run echo -e "中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文\x1b[1C中文" under bash (not on windows, because windows seems handle reflow itself, and it did it correctly)
  2. trim the terminal width until it is short enough and wrap the last 中文 into next line and the previous line is on the end of terminal
  3. widen the terminal to original size

Expect

You see 中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文 中文 There is a null cell in the line, because it is what you print

Actually

You get 中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文中文 The null cell is gone.

Note

It seems the BufferLine didn't record how long until the line end when the wrap happened. A chinese character is 2 cell width, so it can start the wrap at 79 instead of 80. Xterm.js didn't record that. Make wrap at col 80 with a empty cell before it and wrap at col 79 have identical output, thus the issue.

Tyriar commented 4 years ago

Well this is unfortunate, currently there are 2 "null cells"; the one after a wide char and one where nothing exists, not to be confused with a regular whitespace cell. This doesn't match any of them so it seems like we might need another one, but it's an obscure edge case and I think it would end up costing us an additional bit in the buffer which I don't know if we can afford.

mmis1000 commented 4 years ago

I have a idea.

Instead of implicitly assume last null cell of previous line is caused by CJK. We can add a field to the BufferLine to explicitly declare how many cell did not count as previous line. Probably wrapPosition: 1 or so ? (0 most of time, 1 cause by CJK wrap berfore line end)

And the reflow logic can make use of it to decide whether last cell of previous line count or not.

This may also simplify the process of join wrapped lines back to original line, because you no longer need to guess whether the last cell count or not. Just drop wrapPosition cell and concat it will give you the correct result.

Currently, there is a complex function exist only to guess this. https://github.com/xtermjs/xterm.js/blob/master/src/common/buffer/BufferReflow.ts#L206-L220

This can be removed completely if we already has it recorded.

jerch commented 4 years ago

We basically lost the information about early wrap-around cells at that stage, and TerminalHandler.print simply clears those "excess" cells as NULL with a width of 1. This currently only happens for wide chars, but would get even more problematic once we would support grapheme clusters.

Imho a proper fix to that would be another null cell type, that explicitly marks excess cells from early wraparound as overwritable by reflow, while normal null cells in wrapped lines can be accounted as with width of 1.

@mmis1000 Saving the cursor position on the line where the wrap happened, can work around that issue in this particular case, but will introduce alot of line state handling issues later on (that state has to be lifted somehow later on again, otherwise reflow on recycled lines will go bonkers).

jerch commented 1 year ago

Note: I had the same issue in the weblinks rewrite, currently also forgetting a NULL cell on purpose in last cell on wrapped lines.

Not yet really addressing the issue, but I think we could solve it by introducing an explicit bit in Content for the special NULL meaning due to early wrapped wide chars. (Should have low to no impact on overall processing speed, but needs to get applied to several actions like reflow, serialize, link matching, search&highlight.)