Open alabuzhev opened 3 years ago
@miniksa for the response
Okay, so here's what I said during triage that the team told me to put here:
ReadConsoleOutput*
family of functions. We believe that the "readback" scenario, being not translatable into VT sequences, will inhibit us from our long-term goals for command-line applications and hosts/terminals on Windows. Those goals primarily center around moving the Windows platform toward as much of VT sequences as possible (translating older apps when necessary for compatibility purposes on their behalf) and systematically trying to cut out the designs of the legacy conhost platform that prevent us from achieving our performance and reliability goals (like the services conhost was providing to Windows CLIs such as automatic history, aliases, and mutable buffers).CHAR_INFO
as a single grid position with a fixed-size character data and fixed-size color data. Of course as we've advanced the platform in the last bunch of years, we've realized that both of those constraints are unsuitable for modern terminal applications (due to surrogates or complex unicode combining sequences not fitting in a single cell on the character side and full RGB color data not fitting in a fixed-size WORD
attribute). Remarks
section on docs.microsoft.com
that explained how a 80-cell wide buffer that used to be ReadConsoleOutputW
-able by allocating 80*sizeof(CHAR_INFO)
and passing it in... may now have to detect something and give us MORE than 80 cells to account for surrogate pairs. Or read some overloaded indicator to then go read more data another way. Or write a ReadConsoleOutputExW
method which wouldn't be backported downlevel.ReadConsoleOutputCharacter
API. I think we could pretty safely say something like "When in Codepage 65001 for UTF-8... (which was never supported properly anyway in the grand history of things...), use this API in a double-call format. Provide a null buffer pointer on lpCharacter
and tell us the start position in dwReadCoord
and the number of cells in nLength
that you wish to read. In lpNumberOfCharsRead
, we will tell you how large the buffer will need to be to hold all the UTF-8 characters. Then call again and we'll fill it up. That doesn't really help you map it to cells yourself, but it's definitely less complicated for us to change and doesn't require publishing new API surfaces or structures (which is a big challenge especially on something we're trying to get away from.) conhost
is in this repository.Thank you for the explanation, I agree that it's a PITA from a dev perspective.
The ReadConsoleOutputCharacter
API doesn't sound like a viable alternative for our workflow: we need not only the text, but the whole picture: text, its positioning, colours, style attributes etc.
We need it to provide a terminal session look & feel inside a file manager. To see what I mean:
Obviously, to be able to bring back that output by demand we need to read it from the console once the command finishes executing. Once we have that snapshot we can draw the UI controls (e.g. panels) on top of it and continue. Later, if the user wishes to see it, we flush the saved output back to console. Once the user executes something else we take a new snapshot.
This design pattern (instant access to the output) has been around since Norton Commander / the 80s and is ridiculously convenient.
Now compare it with a system that historically doesn't have any "readback" notion:
Without readback you can have either the UI or the terminal, but not both simultaneously. Of course, it's not the end of the world, it's still usable, but way less convenient.
As I mentioned earlier, we don't really care what ReadConsoleOutput
returns: we don't do anything with those CHAR_INFOs, only save all of them and write back later on demand.
Given that, the ideal API for this particular scenario would've probably been:
HANDLE TakeShapshot(SMALL_RECT Region)
- take a full snapshot of the screen buffer and return an opaque handle to it. It would be ideal if it could work with an arbitrary Region
within the buffer, including the scrollback, but the viewport only is fine if that's not possible for some reason.void PutSnapshot(HANDLE Snapshot, COORD Position)
- write the previously taken snapshot back to the buffer, cropping if needed (e.g. if the user changed the buffer size between the calls). Same as above - it would be ideal if it could put it into any Position
within the buffer, including the scrollback, but the viewport only is fine if that's not possible for some reason.void DeleteSnapshot(HANDLE Snapshot)
- delete the snapshot that is no longer needed.Alternatively, TakeShapshot
/ PutSnapshot
/ DeleteSnapshot
concepts could be ESC sequences if that's easier to implement (no need to go through kernel32 etc.).
With this approach it would be possible not only to keep the existing functionality, but also to support all the existing and future extensions (true color, text styles, surrogates etc.) without exposing any new structures or any implementation details.
Aha. That is interesting now to better understand the scenario.
One of the troubles right now I'm having with the "passthrough" mode that I made during our fix-hack-learn week a few weeks ago is that the "popups" that occur in a "Cooked Read" (like when CMD.exe asks conhost to do the edit line on its behalf... as opposed to Powershell that does it itself with the raw reads. Press the F keys in CMD.exe after you've run a few commands to see a popup in progress). We have the exact same problem there. We want to backup a region of the screen, draw over the top of it, and then dismiss it and have what was behind it get restored. I wonder if there's anything in https://invisible-island.net/xterm/ctlseqs/ctlseqs.html that would help us here if we implemented it.
When I briefly discussed this particular problem of having an "overlay" sort of layer that can be non-destructively dismissed... We were thinking it is sort of like requesting a move to an "alternate screen buffer" except one where we copy the main buffer to the alternate on entry, destructively draw over the alternate, then pop back to the main one when the "overlay" is complete. I'm not sure there is a VT sequence for "copy main into alternate while entering" but we should perhaps dig around and see if we can find anything close to this paradigm and if not, engage with some of the other terminals out there to see either how they handle this or if we can all agree on a reasonable extension.
The cropping if the user resizes is also interesting as well. I'm not sure that my solution immediately covers that.
Yes, it's the same idea as those F2/F4/F7/F9 popups.
Also, TakeShapshot
/ PutSnapshot
/ DeleteSnapshot
relatively nicely fit into the existing API:
// Take a snapshot of the current viewport
HANDLE WINAPI CreateConsoleScreenBuffer(
DWORD DesiredAccess, // Ignore
DWORD ShareMode, // Ignore
const SECURITY_ATTRIBUTES *Attributes, // Ignore
DWORD Flags, // CONSOLE_SNAPSHOT_BUFFER
LPVOID ScreenBufferData // Ignore
);
// Write the taken snapshot into the current viewport
BOOL WINAPI SetConsoleActiveScreenBuffer(
HANDLE hConsoleOutput // A handle to a CONSOLE_SNAPSHOT_BUFFER
);
// Discard the allocated snapshot
BOOL CloseHandle(
HANDLE hObject // A handle to a CONSOLE_SNAPSHOT_BUFFER
);
Yes, it's the same idea as those F2/F4/F7/F9 popups.
Also,
TakeShapshot
/PutSnapshot
/DeleteSnapshot
relatively nicely fit into the existing API:// Take a snapshot of the current viewport HANDLE WINAPI CreateConsoleScreenBuffer( DWORD DesiredAccess, // Ignore DWORD ShareMode, // Ignore const SECURITY_ATTRIBUTES *Attributes, // Ignore DWORD Flags, // CONSOLE_SNAPSHOT_BUFFER LPVOID ScreenBufferData // Ignore );
// Write the taken snapshot into the current viewport BOOL WINAPI SetConsoleActiveScreenBuffer( HANDLE hConsoleOutput // A handle to a CONSOLE_SNAPSHOT_BUFFER );
// Discard the allocated snapshot BOOL CloseHandle( HANDLE hObject // A handle to a CONSOLE_SNAPSHOT_BUFFER );
- The external impact is one new SDK constant for the new buffer type.
Dude. I love this. This sounds completely doable. Adding one constant to an existing flag field and not changing the method signatures at all is the level of risk I'll happily take with the existing API surface. We would just also need to figure out how to represent this as VT or make it analogous to VT such that we could co-exist with other Terminals and the Linux world and I think we have a spec.
We want to backup a region of the screen, draw over the top of it, and then dismiss it and have what was behind it get restored. I wonder if there's anything in https://invisible-island.net/xterm/ctlseqs/ctlseqs.html that would help us here if we implemented it.
FYI, there is actually a way to read the screen to some extent using VT sequences - it's just very inconvenient, and for those terminals that do support it, it's often disabled because it's considered a security risk.
Essentially what you do is use the DECRQCRA
command to request a checksum of the screen, but limit it to one cell at a time. I haven't really played with this much, so I'm not sure to what extent it works with Unicode, but in theory you should be able to determine the underlying character from the returned checksum.
As for reading attributes, XTerm added the XTCHECKSUM
sequence to configure the form of checksum that DECRQCRA
produces, and I think that enables it to include the cell attributes in the checksum, which in theory would let you to determine the attributes of a given cell (again I haven't actually tested this).
Obviously you can see this wouldn't be a very efficient way to read the entire screen, so I'm not sure it's a practical solution. But I thought it was worth mentioning, because if DECRQCRA
is considered a security risk, then assumedly any other sequence we invent to read the screen would be just as risky.
Yeah @j4james... I agree both on the efficiency and security points. That's why I'm excited to understand the scenario as a sort of "keep a backup for me, let me draw on top as an overlay, then restore it." I think the better route to follow is to allow for an alternate screen buffer that copies the main one on entry so it can be destructively painted then restored by returning to the main one.
tl;dr for folks coming to the thread late:
We discussed this a bit more yesterday. The "create a screen buffer which is a copy of the current one" seemed like a good consensus. Whether that's with CreateConsoleScreenBuffer(..., CONSOLE_SNAPSHOT_BUFFER, ...)
/SetConsoleActiveScreenBuffer
or some custom VT sequence for "enter the alternate buffer (and initialize the contents of the alt buffer to the contents of the main buffer)" seemed like a minor semantic difference. Maybe something like \x1b[?9049h/l
, since we've already used ?9001h/l
in the past. I honestly don't know which would be easier to prototype, I know I'd probably start with the VT version, if that's something that would work for this.
If that's something that might work for Far, we can try and prototype something here.
@zadjii-msft Thanks for getting back to this.
To summarise, this is what we do now:
βββββββ
βStartβ
ββββ¬βββ
βββββββββββββββββββββββββββββββββββββΊβ€
β βΌ
β βββββββββββ΄ββββββββββ
β β Take a snapshot β
β β of the viewport β
β β(ReadConsoleOutput)β
β βββββββββββ¬ββββββββββ
β βΌ
β βββββββ΄ββββββ
β βDraw the UIβ
β βββββββ¬ββββββ
β ββββββββββββββββββββββββββββββββββββ
β βΌ β
β βββββββββββ΄ββββββββββ β
β ββββββββββββ€Wait for user inputββββββββββ β
β β βββββββββββββββββββββ β β
β βΌ βΌ β
β ββββββββββββ΄βββββββββββ βββββββββββββββββ΄ββββββββββββββ β
β βUser enters a commandβ βUser wants to see (a part of)β β
β ββββββββββββ¬βββββββββββ β the output under the UI β β
β β βββββββββββββββββ¬ββββββββββββββ β
β βΌ βΌ β
β βββββββββββββββ΄βββββββββββββ βββββββββββββββ΄βββββββββββ β
β β Write the saved snapshot β βRearrange the UI layers β β
β β to the console β β internally so that β β
β β Execute the process β β the snapshot is β β
β βThe process writes output β β (partially) visible β β
β β to the console and exits β β Write the merged image β β
β βββββββββββββββ¬βββββββββββββ β to the console β β
β β ββββββββββββ¬ββββββββββββββ β
β β β β
βββββββββββββββββ ββββββββββββββββββββ
This worked nice for decades, but now those CHAR_INFOs can't represent all the information on the screen (because of new colours and text styles) and also there are issues with surrogates like the described above, so reading and writing them back is no longer lossless.
With opaque "snapshots":
βββββββ
βStartβ
ββββ¬βββ
βββββββββββββββββββββββββββββββββββββΊβ€
β βΌ
β βββββββββββ΄ββββββββββ
β β Take a snapshot β
β β of the viewport β
β β(opaque, VT or API)β
β βββββββββββ¬ββββββββββ
β βΌ
β βββββββ΄ββββββ
β βDraw the UIβ
β βββββββ¬ββββββ
β ββββββββββββββββββββββββββββββββββββ
β βΌ β
β βββββββββββ΄ββββββββββ β
β ββββββββββββ€Wait for user inputββββββββββ β
β β βββββββββββββββββββββ β β
β βΌ βΌ β
β ββββββββββββ΄βββββββββββ βββββββββββββββββ΄ββββββββββββββ β
β βUser enters a commandβ βUser wants to see (a part of)β β
β ββββββββββββ¬βββββββββββ β the output under the UI β β
β β βββββββββββββββββ¬ββββββββββββββ β
β βΌ βΌ β
β βββββββββββββββ΄βββββββββββββ βββββββββββββββββ΄βββββββββββββ β
β βRestore the saved snapshotβ β Restore the saved snapshot β β
β β (opaque, VT or API) β β (opaque, VT or API) β β
β β Execute the process β βWrite UI layers on top of itβ β
β βThe process writes output β β but only where needed β β
β β to the console and exits β βββββββββββββββββ¬βββββββββββββ β
β βββββββββββββββ¬βββββββββββββ βββββββββββββββββ
β β
βββββββββββββββββ
Essentially it's almost the same workflow, except for the bottom right corner, describing the case when we want to display both the output and the UI simultaneously (see the screenshot above). It is a little more complicated than the other branch because currently we don't write anything to the console directly, but into a CHAR_INFO-ish memory buffer instead and then flush it when needed, not caring what is inside. With the new approach we will need to do some extra work to preserve the (parts of) the snapshot we've just restored and write only precisely where needed, but that's not rocket science and I'm happy to try it.
I was just doing some research on the rectangular editing operations (DECCRA
, DECFRA
, DECERA
, etc.), and it occurred to me that these functions may provide a solution to the layering problem when combined with the paging APIs discussed in #13892.
The key point to note is that the Copy Rectangular Area operation (DECCRA
) takes a source and destination page number. So if you want to save an area of the screen, and restore it later, you should just be able to copy it to a background page, and then copy it back again when you want it restored.
I need to do some testing on terminals that actually implement these operations to see if they work the way I think they do, but they sound promising.
I've had a chance to test this now on a couple of different terminal emulators, and most work really well. MLTerm had issues restoring the "blank" parts of the screen, if you're tried to use an overlay immediately after startup, but once the first page had scrolled out of view it seemed to work fine.
Here's a short demo from one of the terminals I tested.
https://user-images.githubusercontent.com/4181424/189869500-075cd840-9c4e-43b0-ae46-2ae3c118913f.mp4
Wow that's a GREAT solution. That sounds like the perfect way to save off a section of the buffer and restore it. I suppose some subprocess could hijack the page that you stashed buffer contents into, but meh? That seems like something that's worth at least exploring more. Seems like a way easier way to try to do overlays in a shell.
Windows Terminal version (or Windows build number)
10.0.19043.1110
Other Software
No response
Steps to reproduce
Compile and run the following code:
Expected Behavior
Actual Behavior
Observations
The problem is here: https://github.com/microsoft/terminal/blob/3f5f37d910a2c8fde3bb5123c9516f5f9e38af2b/src/host/directio.cpp#L848
AsCharInfo calls Utf16ToUcs2 for the same string over and over: https://github.com/microsoft/terminal/blob/3f5f37d910a2c8fde3bb5123c9516f5f9e38af2b/src/host/consoleInformation.cpp#L408
which returns UNICODE_REPLACEMENT: https://github.com/microsoft/terminal/blob/3f5f37d910a2c8fde3bb5123c9516f5f9e38af2b/src/types/convert.cpp#L172
Additionally, AsCharInfo incorrectly sets COMMON_LVB_LEADING_BYTE in this case: https://github.com/microsoft/terminal/blob/3f5f37d910a2c8fde3bb5123c9516f5f9e38af2b/src/host/consoleInformation.cpp#L415
I assume that the host might use
Attribute::Leading
andAttribute::Trailing
internally for its own purposes, butCOMMON_LVB_LEADING_BYTE
&COMMON_LVB_TRAILING_BYTE
have a different purpose, not applicable in this case and shouldn't leak into the public interfaces for surrogates.A possible fix
P.S. I remember that there's #8000 that should improve the situation with surrogates in general. This one is about the API issue: when a surrogate pair occupies two columns, there's really no reason to return two replacement characters if we could return the original pair. The situation with a pair that occupies one column is less obvious, but that's a different issue.