microsoft / terminal

The new Windows Terminal and the original Windows console host, all in the same place!
MIT License
95.41k stars 8.3k forks source link

The text layout code is expecting the quantity of glyphs is equal to the cells allocated for each line #3546

Closed reli-msft closed 3 months ago

reli-msft commented 4 years ago

The issue is found in CustomTextLayout:

Instead of carefully process the clusters input, CustomTextLayout drops nearly all the information from clusters, and only keeping how many columns a cluster occupy, and later it is expecting that the glyph quantity equals to the quantity of clusters. This may lead to incorrect rendering result for many cases (complex script, ligatures, etc.)

Bug #610, #696 may be caused by this.

reli-msft commented 4 years ago

Running the following script in Node.JS shows a lot of problems in current text layout code:

console.log(`
Combining Diacritics    | [a\u0300\u0304]
Polytonic Greek         | [ᾊ] [Α\u0313\u0300\u0345]
Ideographic Variation   | [東京都葛\u{E0101}飾区] [奈良県葛\u{E0100}城市]
Composite Hangul        | [\u1100\u116A\u11B3]
Combining Dakuten       | [か\u309A]
Emoji vs Text           | [\u2602] [\u2602\uFE0E] [\u2602\uFE0F]
Emoji                   | [\u{1F469}]
Emoji with Skin Tone    | [\u{1F469}\u{1F3FB}]
Emoji Variation and ZWJ | [\u{1F469}\u{1F3FB}\u200D\u{1F4BB}]
`)

Terminal results: image

Chrome Dev Tools results: image

Edge F12 results: image

Word results: image

skyline75489 commented 4 years ago

Possibly also #2191

skyline75489 commented 4 years ago

Let me add Firefox result here, so that people don't forget that there's a browser out there called "Firefox":

图片

DHowett-MSFT commented 4 years ago

@skyline75489 my hero

skyline75489 commented 4 years ago

@DHowett-MSFT my man. I'm seeing regression on current master https://github.com/microsoft/terminal/commit/306e75163929161b130872dcae890686da57790d

图片

skyline75489 commented 4 years ago

I've used git bisect and found that ddcc06e911a18250c8544c7ea83011451e8256a2 is the first bad commit. I've got no idea about what #3474 even is..

skyline75489 commented 4 years ago

I've tried this on my laptop and sadly the regression is the same:

图片

Emoji is noticeably broken. Japanese characters are different, too.

zadjii-msft commented 4 years ago

@skyline75489 Thanks for the report! I'm going to move that regression to it's own thread (#3553), considering you've narrowed the regression down to a particular commit. Broadly our text rendering needs help with M chars->N glyphs scenarios like this issue describes, but we probably shouldn't have things getting worse on us.

reli-msft commented 4 years ago

This is heavily related to this issue: https://github.com/microsoft/terminal/issues/780 There's WriteCharsLegacy and Terminal::_WriteBuffer, and they should be unified into one function that writes char stream to the back buffer with proper itemization.

skyline75489 commented 4 years ago

Without #780 we can still improve the current implementation like #3578 . I'm not sure whether that's worth the effort, though. It's more like scratching the surface.

reli-msft commented 4 years ago

@skyline75489 We need to change the way we write buffers to perform correct line wrap and cursor movement. For WT there are two buffers being used: one for CONHOST, which is the "Ground Truth", and one for WT, and the write subroutine is written differently.

So what's wrong with WriteCharsLegacy? Well...

And to provide a proper text layout, cells should...

skyline75489 commented 4 years ago

@reli-msft Gotta admit, the whole Unicode experience is harder to fix than I initially anticipated. I'm not sure the exact plan Console team have. But I'm positive that we are heading to the right direction.

reli-msft commented 4 years ago

@skyline75489 Everybody are underestimating how complex text could be.

Let me count all the features that a proper rich text box should support:

And what they can choose to support:

skyline75489 commented 4 years ago

@reli-msft I'd trust you with the whole Unicode thingy myself. I'm a real rookie on this. You can tell from my PR ... Anyway let @DHowett-MSFT take us to where we should eventually be.

DHowett-MSFT commented 4 years ago

Latest results from master

image

reli-msft commented 4 years ago

@DHowett-MSFT Looks better... Sort of. Still needs to update our knowledge about Hangul Jamos' width

DHowett commented 3 months ago

You know, with the completion of #16916 I believe we've finally cracked this nut.

Image