microsoft / terminal

The new Windows Terminal and the original Windows console host, all in the same place!
MIT License

CSI 58 (undercurl color) sequence misbehaves when in "legacy ANSI" format #17426

Open tranzystorekk opened 4 months ago

tranzystorekk commented 4 months ago

Windows Terminal version

N/A

Windows build number

10.0.22631.0

Other Software

neovim 0.10.0, zellij 0.40.1, wezterm 20240520-135708-b8f94c47

Steps to reproduce

More details are described in https://github.com/wez/wezterm/issues/5450

A short overview

In the wezterm issue mentioned above, I came across conpty handling the CSI 58 (undercurl color) ANSI sequence in a way that causes visual bugs for zellij users on modern Windows terminals.

In short, the Windows implementation seems to accept only the format:

ESC[58:2::234:105:98m (colons, iterm-compatible)

If instead a proxy like zellij emits the more common ANSI form:

ESC[58;2;234;105;98m (semicolons)

the terminal changes the text background color.
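
As a rough illustration of why the semicolon form can end up touching the background, here is a minimal Python sketch of a hypothetical SGR handler that does not know code 58. The function names and the small code table are assumptions made for illustration; this is not conpty's actual parser.

```python
from typing import List, Optional

# Tiny, illustrative subset of single-number SGR codes.
SIMPLE = {0: "reset", 2: "faint", 4: "underline", 5: "blink"}


def split_sgr(body: str) -> List[List[str]]:
    """Split an SGR body into parameters (';') and sub-parameters (':')."""
    return [param.split(":") for param in body.split(";")]


def describe(code: int) -> Optional[str]:
    """Name the effect of a single SGR code, or None if it is unknown."""
    if code in SIMPLE:
        return SIMPLE[code]
    if 30 <= code <= 37:
        return f"foreground {code - 30}"
    if 40 <= code <= 47:
        return f"background {code - 40}"
    if 90 <= code <= 97:
        return f"bright foreground {code - 90}"
    if 100 <= code <= 107:
        return f"bright background {code - 100}"
    return None


def naive_effects(body: str) -> List[str]:
    """Effects applied by a hypothetical terminal without SGR 58 support."""
    effects = []
    for param in split_sgr(body):
        if len(param) > 1:
            # Colon-grouped unit: this sketch knows no such selector,
            # so the whole unit is dropped in one go.
            continue
        if not param[0].isdigit():
            continue  # empty or malformed parameter: skipped in this sketch
        name = describe(int(param[0]))
        if name is not None:
            effects.append(name)
    return effects


print(naive_effects("58;2;234;105;98"))   # semicolon form
print(naive_effects("58:2::234:105:98"))  # colon form
```

With semicolons, the unknown 58 is skipped and the remaining numbers are read as independent attributes (2 as faint, 105 as a bright background color). With colons, the whole group is one parameter and is dropped together.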

Expected Behavior

I wanted to discuss a good way forward to fix this between the zellij project and the terminal implementations used on Windows. Some thoughts and options:

Actual Behavior

N/A

DHowett commented 4 months ago

@tusharsnx had some good rationale for only supporting the ITU T.416 format for underline colors - Tushar, do you remember what that was?

DHowett commented 4 months ago

Notes from #15795

This is a requirement for the implementation to avoid problems with VT clients that don't support sub parameters.

To my understanding, we explicitly do not support or emit ; because of the potential for confusing parsers that fail to handle the leading 58.

Zellij should probably do the same for the same reason.

tusharsnx commented 4 months ago

@tusharsnx had some good rationale for only supporting the ITU T.416 format for underline colors

I remember blocking the sending of CSI-58 with ; because it can have unexpected outcomes for unsupported terminals, but I'm not 100% sure if parsing of such sequences is something we want to block 🤔

If a console application sends us CSI-58 that's a strong signal that the intention is to color the underlines regardless of if it's followed by ; or :, given that there is no secondary meaning to CSI-58 other than underline colors. I think I need @j4james help to confirm this.

If we'd like to enable parsing of CSI-58 (like we do for CSI-38 and CSI-48), then there's a possibility, but only if it doesn't affect unsupported terminals. We will only emit CSI-58 with : to the terminal (Conpty client) though. This will eliminate the unexpected outcomes in case an application sends CSI-58 with ; since that will be translated to CSI-58 with : and unsupported terminals should just eat it without doing anything 🙂
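
The translation described here could look roughly like the following Python sketch. It assumes the SGR body has already been extracted from the escape sequence, and `normalize_sgr58` is a made-up name for illustration, not a function from the Windows Terminal code base.

```python
def normalize_sgr58(body: str) -> str:
    """Rewrite semicolon-form SGR 58 colors in an SGR body to the colon form.

    '58;2;R;G;B' becomes '58:2::R:G:B' and '58;5;N' becomes '58:5:N'.
    Every other parameter, including colon-form input, passes through as-is.
    """
    params = body.split(";")
    out = []
    i = 0
    while i < len(params):
        if params[i] == "58" and i + 1 < len(params):
            mode = params[i + 1]
            if mode == "2" and i + 4 < len(params):
                r, g, b = params[i + 2:i + 5]
                # The empty field after '2' is the colorspace identifier slot.
                out.append(f"58:2::{r}:{g}:{b}")
                i += 5
                continue
            if mode == "5" and i + 2 < len(params):
                out.append(f"58:5:{params[i + 2]}")
                i += 3
                continue
        # Not a complete semicolon-form SGR 58: leave the parameter untouched.
        out.append(params[i])
        i += 1
    return ";".join(out)


print(normalize_sgr58("4;58;2;234;105;98"))
```

An incomplete sequence such as `58;2;1;2` is left unchanged in this sketch; a real implementation would need to decide how to treat it.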

tranzystorekk commented 4 months ago

If we'd like to enable parsing of CSI-58 (like we do for CSI-38 and CSI-48), then there's a possibility, but only if it doesn't affect unsupported terminals. We will only emit CSI-58 with : to the terminal (Conpty client) though. This will eliminate the unexpected outcomes in case an application sends CSI-58 with ; since that will be translated to CSI-58 with : and unsupported terminals should just eat it without doing anything 🙂

I like this solution because as @imsnif mentioned in the wezterm PR, zellij would like to stay uniform and backward-compatible in its implementation, and as a terminal multiplexer it probably shouldn't be its responsibility to handle diverging parser behaviors between platforms.

j4james commented 4 months ago

For app devs (which includes zellij), I would recommend against using ITU sequences in any format unless you've somehow verified that the terminal actually supports those sequences. Both variants have the potential to break terminals that don't understand them. And if you use a DECRQSS query to check for compatibility, that should also tell you which variant the terminal supports.

That said, on our side I think it would be preferable if we supported both semicolons and colons for SGR 58, the same way we do for SGR 38 and 48. Although when it comes to passing the sequence through to conpty, I think we decided that semicolons are best for 38 and 48, since they're more widely supported, while colons were best for 58, since all terminals that support underline colors also support colons, but not all of them support semicolons (we obviously don't).
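
A sketch of what inspecting such a DECRQSS reply might look like, in Python. The idea is that the app sets an underline color, sends the SGR status query, and looks at how the terminal reports the attribute back. This assumes the reply format documented for xterm (`DCS 1 $ r ... m ST` for a valid request); the function name is hypothetical, and reading the reply from the terminal is left out.

```python
import re
from typing import Optional

# DECRQSS query for the current SGR state: DCS $ q m ST
DECRQSS_SGR_QUERY = "\x1bP$qm\x1b\\"

# Assumed reply shape (xterm convention): DCS 1 $ r <params> m ST
_REPLY = re.compile(r"\x1bP([01])\$r(.*?)m\x1b\\", re.S)


def underline_color_variant(reply: str) -> Optional[str]:
    """Guess how a terminal reports SGR 58 in a DECRQSS reply.

    Returns 'colon', 'semicolon', or None when no underline color is
    reported. Assumes the app set only an underline color before querying,
    so a bare '58' parameter cannot be a palette index of another color.
    """
    match = _REPLY.search(reply)
    if match is None or match.group(1) != "1":
        return None  # no reply found, or the request was reported invalid
    params = match.group(2).split(";")
    if any(p.startswith("58:") for p in params):
        return "colon"
    if "58" in params:
        return "semicolon"
    return None


print(underline_color_variant("\x1bP1$r0;4;58:2::234:105:98m\x1b\\"))
```

A `None` result is the signal to avoid emitting SGR 58 at all, in line with the recommendation above.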

In an ideal world we should have first been testing the conpty client in the same way an app or multiplexer would, but if we have conpty passthrough soon that won't be necessary.

tusharsnx commented 4 months ago

For app devs (which includes zellij), I would recommend against using ITU sequences in any format unless you've somehow verified that the terminal actually supports those sequences.

I agree with this, since even if we enable "SGR 58 with semicolon" parsing, most terminals on Windows won't support that out of the box (for like.. another few years). AFAIK, only Windows Terminal and WezTerm would get the support because they consume the most up-to-date Conpty implementation (from this repo), while others rely on the inbox conhost.exe (that sits inside the system32 dir). So, zellij should by default emit SGR 58 with colons, and use semicolons on a case-by-case basis.

imsnif commented 4 months ago

@tusharsnx - My reading of what @j4james wrote is that Zellij should interpret both colons and semicolons and only emit semicolons (which is what we do). And that Windows Terminal (apologies if I'm not using the right terminology, I do not know this ecosystem well) should interpret both semicolons and colons as well.

Did I misread or misunderstand you @j4james ?

j4james commented 4 months ago

My reading of what @j4james wrote is that Zellij should [...] only emit semicolons

No. I was saying Zellij should check the capabilities of the terminal before emitting anything. But if that is too much effort, then it should emit semicolons for 38 and 48, and colons for 58.
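
For an application that skips capability detection, that fallback might be sketched like this in Python. The helper name `sgr_rgb` is made up for illustration, and only the direct-color (RGB) case is covered.

```python
def sgr_rgb(selector: int, r: int, g: int, b: int) -> str:
    """Build an RGB color sequence using the fallback separators.

    38 (foreground) and 48 (background) use semicolons, the most widely
    supported form. 58 (underline color) uses colons, with an empty
    colorspace identifier field after the '2'.
    """
    if selector in (38, 48):
        return f"\x1b[{selector};2;{r};{g};{b}m"
    if selector == 58:
        return f"\x1b[58:2::{r}:{g}:{b}m"
    raise ValueError(f"not a color selector: {selector}")


# repr() shows the escape sequence instead of sending it to the terminal.
print(repr(sgr_rgb(58, 234, 105, 98)))
```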

imsnif commented 4 months ago

Alright. I'm willing to issue the fix on our side since, as you said, terminals that support underline color necessarily support the colon format. It would still be nice if the fix were also issued here (I guess Windows Terminal has more working hands and the issue really, as far as I know at least, doesn't happen anywhere else).

j4james commented 4 months ago

the issue really, as far as I know at least, doesn't happen anywhere else

@imsnif Just FYI, Mintty behaves the same way as Windows Terminal, i.e. SGR 38 and 48 work with both colons and semicolons, but SGR 58 only works with colons. You can see the docs here:

https://github.com/mintty/mintty/blob/a79e7dd5acf30bf6fd7790990cbf5ffa535553ca/wiki/Tips.md#text-attributes-and-rendering

imsnif commented 4 months ago

Thanks for the correction @j4james. My expectation expressed above still stands.

DHowett commented 3 months ago

Thanks for confirming @j4james!

Since Mintty behaves the same as we do, and we're currently rewriting the ConPTY subsystem in #17510 to pass data through unchanged, we're not planning on making further changes here.

Thanks all!

imsnif commented 3 months ago

Thanks for chiming in @DHowett !

I'm happy to hear you're making this step in ConPTY.

As I mentioned in https://github.com/zellij-org/zellij/pull/3440, the "fix" on the Zellij side is a workaround for this very issue in Windows Terminal (and its associated software). I don't think other software having the same issue (Mintty) is relevant to the matter at hand. This workaround can potentially cause problems with other terminal emulator components (existing or future) that interpret the protocol to the letter in a different (yet still legitimate) way than those present in this discussion interpret it. In such a case, we'll have to roll back the change and we'll be back to square one.

The way you manage issues in this repository is of course your discretion, but I feel it important to mention this fact. This issue is not resolved, it was merely worked around in the other (non-commercial, I might add) software involved.

I do not intend to participate further in this discussion and wish everyone a nice weekend and a good summer.

DHowett commented 3 months ago

Thanks for your input!

I'm not averse to reopening this issue, but I am worried I've misunderstood your position. I thought that Zellij was taking well-formed input (with colons) from connected clients and transforming it into not-well-formed output (with semicolons, and no colorspace identifier for RGB colors). It doesn't seem like a Windows Terminal-specific workaround to make it emit only standards-compliant output--rather, it is ultimately correct to do so. In so doing, Zellij takes all input whether or not it is well-formed and makes it so.

If there are terminal emulators that require semicolons for SGR 58, they're surely the outliers... right?

j4james commented 3 months ago

@DHowett If Zellij decides to go back to using the incorrect format for SGR 58 sequences, they'll definitely be in the wrong, but that doesn't mean we shouldn't try and deal with the issue ourselves. They probably won't be the only app that decides to use that format for these sequences, so I think it makes sense to be more lenient in what we're willing to accept.

And note that #17510 doesn't solve this problem. It'll fix the issue for some third party terminals, like WezTerm, which already accept both formats, but Windows Terminal will still be broken. The only advantage of passing everything through is that we no longer have to make the decision as to what format to use when forwarding these attributes.

imsnif commented 3 months ago

I'm not averse to reopening this issue, but I am worried I've misunderstood your position.

[...]

If Zellij decides to go back to using the incorrect format for SGR 58 sequences, they'll definitely be in the wrong

I'm going to keep replying here even though I said I won't. There is a small hope in me that enough relevant people would read this so as to effect a bit of a paradigm shift in an issue that is a severe pet peeve of mine. Thanks for your patience with my sometimes strong wording. I do not mean to offend or misrepresent anyone and hope it is not taken that way.

So, to recap a bit of history so we're all on the same page: the SGR notation to set the foreground and background colors for text was added to xterm in 1999[1]. These were based on the specification in ECMA-48[2] and later ISO-8613-6[3]. Due to a misinterpretation of the language of the latter (parameter vs. parameter element), xterm implemented the SGR notation with semicolons instead of colons.

This had the theoretical effect that some parsers could misinterpret this syntax, because the meaning of a parameter now depended on its position (e.g. 48;5;6m would be interpreted differently if the parser does not know what 48 means, merrily going on to interpret the 5 and 6 independently).

In practice, this had no real world effect[4]. To the best of my knowledge (and I deal with terminal compatibility on a daily basis) there is still no real world example of this causing trouble. The one exception being ironically this very issue in the context of CSI 58 (more on this below), which happened out of an adherence to the standard and a benevolent will to do good.

Other terminal emulators copied xterm, using the semicolon notation for their parsers. Even though xterm "fixed" this problem in 2012[5], they still maintained backwards compatibility to the semicolon syntax. To this day the semicolon notation is still the most widespread (no reference for this one, I hope those reading this thread deal enough with ANSI to know this as an unassailable fact - if not, I hope you'll take my word for it).

A terminal emulator author would be out of their mind not to provide support for the semicolon SGR notation in their parser. Nothing will work properly, even though they would be "correct" to do so.

Then comes the issue at hand, the SGR notation in the recent "styled underlines" protocol. Quoting the protocol regarding the 58 notation: "This works exactly like the codes 38, 48 that are used to set foreground and background color respectively."[6]

One could argue that if a terminal emulator accepts the semicolon notation for ANSI 38 and 48 but does not accept it for ANSI notation 58, they're misimplementing very explicit wording in the protocol. I am not arguing that however - my argument is different: this is all hogwash.

We're discussing standards and protocols that have all been written in the previous century. Likely before some people participating in this thread were even alive. These protocols have been misimplemented from day 1 in the context of our ecosystem (like it or not, we're all copying xterm). Their misimplementation is so widespread it is effectively the standard. We all know this.

Honestly? It doesn't matter. The only thing that matters is the experience we give our users. We want their terminal emulators to display colors properly. The standards were there to help us do so, arguably in a time where communicating and reporting issues was far more difficult than it is now - as this thread demonstrates.

In my software, I want to emit rendering instructions that will be correctly interpreted (read: show the user what I want the user to see) in as many terminals as possible. I try to do that - where possible - without relying on terminfo databases and the like, because I also want my rendering instructions to be replayable over time (which is an emerging characteristic of this rendering medium). Semicolons are the lowest common denominator for color rendition, and the fact that they keep being implemented as separators for 58 means that surprisingly enough, not everyone implementing them has a copy of ECMA-48 and ISO-8613-6 lying around. They just keep doing what they do for SGR 38 and 48 - which is what the styled underline protocol tells them to do.

If a terminal emulator doesn't parse them in the name of adhering to a patchy and discontiguous standard whose misinterpretation forms the basis of our modern terminal ecosystem - I would argue this terminal emulator is wrong and behaving incorrectly.

Thanks for reading, apologies for the strong wording and I hope no-one was offended or took this personally.

[1] - https://invisible-island.net/xterm/xterm.log.html#xterm_111
[2] - https://www.ecma-international.org/wp-content/uploads/ECMA-48_2nd_edition_august_1979.pdf
[3] - https://www.iso.org/obp/ui/en/#iso:std:iso-iec:8613:-6:ed-1:v1:en (warning: paywall)
[4] - https://invisible-island.net/xterm/xterm.faq.html#semicolon_vs_colon
[5] - https://invisible-island.net/xterm/xterm.log.html#xterm_282
[6] - https://sw.kovidgoyal.net/kitty/underlines/

j4james commented 3 months ago

@imsnif I'm in complete agreement with you regarding doing what's best for your users rather than strictly following the standard. That's why I've repeatedly recommended the use of semicolons for SGR 38 and SGR 48 sequences. I'm not suggesting you use colons for SGR 58 simply because that's technically correct. I'm recommending colons because I think that's what is best for your users.

In practice, this had no real world effect[4]. To the best of my knowledge (and I deal with terminal compatibility on a daily basis) there is still no real world example of this causing trouble.

I can't help thinking we must be talking about different things, because I find this hard to believe. Here's a simple test case you can try out in Xterm:

printf "\e[58:5:41m\e[4mGREEN UNDERLINE (colons)\e[m\n"

printf "\e[58;5;41m\e[4mGREEN UNDERLINE (semicolons)\e[m\n"

And here's what I'm seeing:

[image: screenshot of the Xterm output for the two test lines]

Xterm doesn't support underline colors, so the colon version should just be ignored, and you get a regular underline, but the semicolon version changes the background to red and forces the text to blink. Surely that classifies as an example of "causing trouble"?

And it's not just Xterm, or Mintty, or Windows Terminal. I've seen the same or similar behavior in probably 20 or so different terminals. Sending the semicolon version of SGR 58 to terminals that don't support it is guaranteed to cause problems for a lot of people.

That said, I also know of terminals that will have problems with either version, which is why I've recommended not using these sequences at all if you haven't first verified the terminal supports them. But if you aren't going to do that, using the colon version seems like the least worst option.

So my question for you is, what terminals have you encountered that handle SGR 58 with semicolons correctly, but which cause trouble with colons? Because if you haven't, I simply can't understand how you think that semicolons are the better option for your users (specifically in reference to SGR 58).

And just to be clear, I don't actually care what you do in Zellij. I'm not a user, so it doesn't affect me. I'm just trying to get you to understand what I'm actually recommending, because you seem to be misrepresenting my position as an argument for "standards compliance" over real world usage, and that's absolutely not the case.

imsnif commented 3 months ago

I'm just trying to get you to understand what I'm actually recommending, because you seem to be misrepresenting my position as an argument for "standards compliance" over real world usage, and that's absolutely not the case.

I'm sorry you feel I was misrepresenting your position. That was absolutely not my intention. I intentionally did not name anyone in my responses, because I was replying to the general ideas and comments presented in this thread. The part about standards was mentioned by others, and you can scroll up and read it if you'd like to see what I was talking about. I do not want to link to it myself or tag the relevant people because I am doing my best not to make this personal or about anyone's opinions or expertise. Just about the relationship between our two pieces of software.

As for understanding your point: I fully do. I did immediately as you clarified it to my question above. Which is why I took your advice and Zellij will now emit these sequences with colons rather than semicolons: this is the avenue of least damage done to all our users that I as an application developer have control over.

I still see this as a workaround for a Windows Terminal bug as I explained above, and as I also explained above - the fact that this bug exists in other terminals as well does not mean it's not still an issue here.

Surely that classifies as an example of "causing trouble"?

It definitely does. It just does not qualify (in my book) as a "real world example". Every infrastructure has its limitations and design flaws (just open a console in your browser and type in NaN == NaN). Developers find them during development or initial testing and work around them. If they can't work around them easily (in this example make sure the terminal supports styled underlines), and end up with software that causes user-visible bugs, this would qualify as a real world example of a design flaw causing trouble*.

The example being this very issue (as I mentioned in my comment), which I will say again - seems to have been caused on the Windows Terminal side because of strict adherence to standards. I say "seems" because I was not there when this decision was made and did not go through previous PRs to find out. This is simply the justification that I interpreted as being given to me here above. If Windows Terminal (and associated software components) has other reasons for only interpreting colons and asking software using semicolons to fall in line to fix the issues of their joint users, I'd be happy to hear it and try to accommodate in my designs.

*As a side-note: while I understand you have a lot of experience testing the behavior of all sorts of different terminals and we could probably have a very interesting discussion where we analyze different examples and edge-cases that may or may not be relevant here, my personal preference would be not to go down this avenue. The statement I gave was from my own experience (as I mentioned) and that of the xterm maintainers (that I linked to). I feel that going down this path would divert attention from the matter at hand. If you feel differently though, I will try to oblige in the name of fairness to what is slowly becoming a full-blown debate (hopefully a friendly one though).

lhecker commented 3 weeks ago

[...] we're currently rewriting the ConPTY subsystem in https://github.com/microsoft/terminal/pull/17510 to pass data through unchanged, we're not planning on making further changes here.

That work was recently completed and shipped in Windows Terminal Preview 1.22. It should now be a lot more robust when it comes to VT / UNIX.

If you're following this issue, you may have noticed that we reopened it. We had a brief internal debate and felt that we should be more forthcoming to zellij and other applications and be more liberal in what we accept. I'll be adding support for the semicolon version before the next version (1.23) gets released. 🙂