Open alabuzhev opened 3 months ago
The recent canary builds include #17510 as an experiment, which is a complete rewrite of how ConPTY works. It now doesn't modify VT output in any way anymore and the translation from console APIs to VT is now direct and strictly "algorithmic" within the console API call handler. This fixes all those "VT output is broken" bugs, and also improves performance ~15x. (If you had performance issues before, you probably
I removed support for surrogate pairs for the console APIs intentionally, as this allows us to move to an architecture that's simpler, more robust, and more performant in the long term. This is because if we restrict console APIs (the non-VT ones) to UCS-2 we can use the CHAR_INFO
based APIs as the basis with which we can abstract all other console calls. ScrollConsoleScreenBuffer
for instance is basically nothing but a series of ReadConsoleOutput
and WriteConsoleOutput
calls.
Since VT doesn't have multiple buffers like the console API, I've written SetConsoleActiveScreenBuffer
as a call to ReadConsoleOutput
followed by a serialization of the CHAR_INFO
structs to VT (which we need anyway to support WriteConsoleOutput
). I've assumed that this is alright since historically the console APIs have only supported UCS-2 until very recently.
It's definitely possible to restore support for surrogate pairs, but I'd strongly prefer if we didn't have to. Otherwise, we would need to translate "CHAR_INFO_UTF32" structs to CHAR_INFO
ones on input/output. Support for grapheme clusters would be even more difficult, but that's what Windows Terminal now supports.
How much does this change impact you? Does this break anything in Far? If you have any feedback, etc., please let me know!
I've just added an early-return to SetConsoleActiveScreenBuffer
that will avoid doing anything if the given handle is identical to the currently active one: https://github.com/microsoft/terminal/commit/21b13ca4615a1407758e5df8bdcde7f7d16792d4
I think your intention was to show the general issue with SetConsoleActiveScreenBuffer
, but thank you so much anyway for showing me that I forgot such an early return!
This will unfortunately break your example code though and it would need to be updated by switching between two different buffers.
I've written
SetConsoleActiveScreenBuffer
as a call toReadConsoleOutput
calls.
Waitaminute... Aren't that officially legacy and "no longer a part of ecosystem roadmap" and "not recommended in new products", according to MSDN banners? 😄
I understand the "implementing legacy in terms of other legacy" logic, but, notably, there are no CHAR_INFO
s anywhere in SetConsoleActiveScreenBuffer
's signature, so from the outside POV it's rather unexpected that it ought to suffer from ReadConsoleOutput
's limitations.
It's definitely possible to restore support for surrogate pairs, but I'd strongly prefer if we didn't have to
I'd argue that at least the case "a single non-BMP character occupying exactly two cells can be losslessly read into a pair" is worth supporting - it covers a huge range of CJK and something is better than nothing, but that's a different topic. Also, it won't help with extended colors and styles, which are also broken by switching buffers now.
Does this break anything in Far?
That's... why I'm here.
Technically yes. Say, a user could choose to have RGB colors and files with non-BMP names on panels. We also have a log sink that sinks logs into a new buffer that's available on demand (via SetConsoleActiveScreenBuffer
). Switching there and back again would uglify everything until explicitly redrawn. Having said that, extra buffers in WT are borderline unusable, so I won't claim that the impact is huge.
I think your intention was to show the general issue with
SetConsoleActiveScreenBuffer
, but thank you so much anyway for showing me that I forgot such an early return!
Thanks, it's a totally reasonable optimization and I haven't suggested it myself exactly not to draw attention away from a bigger issue 😄 This also mostly fixes the issue I initially observed:
ReadConsoleOutput
to take snapshots of the output, produced by external apps, to make it available on demand in the UI.ReadConsoleOutput
s limitations: surrogates, colors, etc.dir
in a folder full of files with funny names, rejoiced, committed and pushed.It turns out that we call SetConsoleActiveScreenBuffer
after executing external apps - an app can switch to some other buffer for reasons and, say, crash before restoring the status quo, potentially leaving the shell in an unusable state, so we do it just in case. In other words, it is probably no-op 99.999% of the time.
Thinking of it, can't SetConsoleActiveScreenBuffer
use something like DECCRA
internally too instead of ReadConsoleOutput
?
@lhecker
Aren't that officially legacy and "no longer a part of ecosystem roadmap" and "not recommended in new products", according to MSDN banners? 😄
To be fair, the same banner exists on the SetConsoleActiveScreenBuffer
page, so I feel like it's fine if using that API results in a worse (modern) Unicode correctness.
I understand the "implementing legacy in terms of other legacy" logic, but, notably, there are no
CHAR_INFO
s anywhere inSetConsoleActiveScreenBuffer
's signature, so from the outside POV it's rather unexpected that it ought to suffer fromReadConsoleOutput
's limitations.
That's true and we could make this work. However, I'd prefer if we could avoid doing that for a year or two, until after I completed some architectural changes for conhost.¹ Would this be alright with you, now that SetConsoleActiveScreenBuffer
will be a no-op when the active buffer hasn't changed? I'd of course make it work earlier if this becomes a blocking/major issue for you or anyone else.
Thinking of it, can't
SetConsoleActiveScreenBuffer
use something likeDECCRA
internally too instead ofReadConsoleOutput
?
I think we could try that long term, but not initially. ConPTY needs to also work with terminals that don't support paging operations (PPA, PPR, etc.) after all. I think it's important that we gain some experience with this "fallback code" first.
¹ The changes are explained in detail in #17387. But basically, my goal is to make Windows Terminal its own proper console server, just like conhost is now.
To do so I'm creating an API that can represent all console APIs without loss of any information. SetConsoleActiveScreenBuffer
for instance would not be mapped to VT with this API. This would allow Windows Terminal to properly support all console APIs, including switching screen buffers and not losing any complex Unicode, setting the 16 base colors with SetConsoleScreenBufferInfoEx
, and so on.
ConPTY would also be implemented on top of this API, but it would necessarily lose some of the information, because it's output will still be strictly VT based. However, since such a ConPTY would only use its text buffer to serve ReadConsoleOutput
calls, its text buffer could theoretically be a simple CHAR_INFO[height][width]
matrix (= well over 10k lines less code needed). I'm not entirely sure if we'd ever do that (= 2 buffer implementations = also bad), but I think it'd be nice to have that option. I figured the best way to discover if we have the option is by breaking it now and seeing if it results in any issues. It did in your case, but as it turns out we could solve it with the no-op early return. So, if possible, I'd like to wait again and see if there's another issue.
Would this be alright with you, now that
SetConsoleActiveScreenBuffer
will be a no-op when the active buffer hasn't changed?
Of course - as I mentioned, I noticed only because it wasn't a no-op and because I was intentionally checking the correctness of a piece, not supported by ReadConsoleOutput
.
In normal circumstances it shouldn't matter that much.
This will unfortunately break your example code though and it would need to be updated by switching between two different buffers
Thanks, updated it.
Windows Terminal version
1.22.1972 and newer
Windows build number
10.0.19045.4412
Other Software
No
Steps to reproduce
int main() { const auto Output = GetStdHandle(STD_OUTPUT_HANDLE); DWORD n; WriteConsole(Output, L"𠜎𠝹𠱓𠱸𠲖𠳏𠳕𠴕𠵼𠵿𠸎\n", 23, &n, {}); WriteConsole(Output, L"💖💫🐶🐇🌎🏃\n", 13, &n, {}); MessageBox({}, L"Press OK", {}, 0); const auto NewBuffer = CreateConsoleScreenBuffer(GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, {}, CONSOLE_TEXTMODE_BUFFER, {}); SetConsoleActiveScreenBuffer(NewBuffer); SetConsoleActiveScreenBuffer(Output); MessageBox({}, L"See?", {}, 0); }