microsoft / terminal

The new Windows Terminal and the original Windows console host, all in the same place!
MIT License

Reading extended attributes (RGB colors) from the screen buffer #292

Open alabuzhev opened 5 years ago

alabuzhev commented 5 years ago

TL;DR: How?

Scenario: an app needs to read a part of the screen buffer, output something, and then, when needed, restore the original buffer content.

ReadConsoleOutput always worked seamlessly for that. Now, however, Windows supports 256 colours, true colours and underlined text, while CHAR_INFO.Attributes is still a WORD and obviously can't hold all that luxury.
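To make the limitation concrete, here is a minimal, portable sketch of why a 16-bit attribute word loses true-colour information. The bit values mirror the documented FOREGROUND_* flags, but the helper names (`legacy_index_from_rgb`, `legacy_attributes`) and the thresholds are assumptions made for illustration; this is not the approximation conhost actually performs.

```c
#include <stdint.h>

/* Bit values match the documented FOREGROUND_* console flags. */
enum { FG_BLUE = 0x1, FG_GREEN = 0x2, FG_RED = 0x4, FG_INTENSITY = 0x8 };

/* Hypothetical helper: crude mapping of a 24-bit colour to the 4-bit
 * legacy colour index. The thresholds are arbitrary, for illustration. */
static uint16_t legacy_index_from_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    uint16_t idx = 0;
    if (r >= 0x40) idx |= FG_RED;
    if (g >= 0x40) idx |= FG_GREEN;
    if (b >= 0x40) idx |= FG_BLUE;
    if (r >= 0xC0 || g >= 0xC0 || b >= 0xC0) idx |= FG_INTENSITY;
    return idx;
}

/* Packs foreground (bits 0-3) and background (bits 4-7) into one
 * attribute word, the way the legacy attribute WORD is laid out. */
static uint16_t legacy_attributes(uint16_t fg_index, uint16_t bg_index)
{
    return (uint16_t)((fg_index & 0xF) | ((bg_index & 0xF) << 4));
}
```

With this mapping, two visibly different oranges such as (255, 128, 0) and (255, 100, 10) collapse to the same index, so a snapshot taken through a 16-colour attribute cannot restore the original colours.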

I remember reading somewhere that you don't have plans to extend output APIs (WriteConsoleOutput and friends) because it can be achieved with VT sequences (with limitations, inconvenience, more work, but, technically - yes, it can), but what about reading?

I tried to search first and found this comment, but I don't think that #57 would help here - no 3rd-party terminals involved in this case, we're reading purely static data.


zadjii-msft commented 5 years ago

You're definitely right that this isn't something that is possible with VT sequences alone. In fact, there aren't any VT sequences (that I'm aware of) that allow the client application to query the contents of the buffer. The entire concept of having a query-able buffer is unique to Windows console applications.

We're trying to move the Windows Console environment to be more akin to that of Linux terminals. If you were to write a command-line application that needed to query the buffer contents to work properly, then that application would be pretty much impossible to port to *nix (should the developer ever choose to).

My advice for best practice when using the console API is to only read input to the command-line application and only write output - DON'T be writing input to the console or reading output from the console. If you follow those guidelines, your applications will typically end up simpler and more portable to other platforms.

Is there a reason you need to read part of the buffer to be able to restore it? Couldn't you reconstruct the output from the data you previously output to the console?

oising commented 5 years ago

As an aside, the VT420 had a DECCRA sequence which could copy rectangular areas of the buffer, but I don't think it was defined in the spec - nor ever implemented in any terminals - to keep the colour information.

I could imagine putting a kind of a "event sourcing" buffer in front of conpty, but that would be a lot of work to create aggregates and snapshots. But a special kind of masochistic fun work nonetheless :)

zadjii-msft commented 5 years ago

@oising That is true - though that seems to only be able to copy the buffer contents to another region in the buffer. It doesn't seem to return the contents to the client application in a usable way. There's another function I came across that IIRC calculated a hash of a buffer region, so that a client application could compare two regions to see if they're the same, or if the contents of a region changed, but again, the client app couldn't reverse engineer the actual contents of the buffer.

oising commented 5 years ago

Certain terminals do support printing the contents, like xterm - so somehow it's being done. It's such a pity we're all being held to this ancient stream oriented way of dealing with the screen. With respect to DECCRA, I guess it could be used to copy the visible contents into the scrollback buffer as a kind of poor man's save/restore of a given area, but you're right - there's no way to serialize it back out.
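For reference, a DECCRA request is just a formatted control sequence: `CSI Pts;Pls;Pbs;Prs;Pps;Ptd;Pld;Ppd $ v` (source top, left, bottom, right, source page, destination top, left, destination page). The sketch below only builds that string; `format_deccra` is a hypothetical helper name, both page numbers are fixed to 1 as an assumption, and whether a given terminal honours the sequence or preserves colours is not something this code can tell.

```c
#include <stdio.h>

/* Hypothetical helper: formats a DECCRA (copy rectangular area) request.
 * Coordinates are 1-based. Both page parameters are fixed to 1, which is
 * an assumption of this sketch. Returns snprintf's result. */
static int format_deccra(char *out, size_t out_size,
                         int top, int left, int bottom, int right,
                         int dest_top, int dest_left)
{
    return snprintf(out, out_size, "\x1b[%d;%d;%d;%d;1;%d;%d;1$v",
                    top, left, bottom, right, dest_top, dest_left);
}
```

The resulting string would be written to the terminal like any other output. Nothing comes back to the application, which is the limitation described above.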

oising commented 5 years ago

Btw, technically you could use the hashing to test each cell of an area against a pre-computed "rainbow table" of known cell/attribute values. Disgusting, but feasible.
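A rough sketch of that idea, under heavy assumptions: `toy_cell_checksum` is a made-up stand-in for whatever checksum a terminal would report for a single-cell rectangle (real checksums are implementation-specific and may collide), and the candidate set is limited to printable ASCII with 16 attribute values. All names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FIRST_CHAR 0x20
#define LAST_CHAR  0x7E
#define ATTR_COUNT 16
#define TABLE_SIZE ((LAST_CHAR - FIRST_CHAR + 1) * ATTR_COUNT)

typedef struct {
    uint16_t checksum;
    char     ch;
    uint8_t  attr;
} RainbowEntry;

/* Stand-in for the terminal's per-cell checksum (NOT a real algorithm). */
static uint16_t toy_cell_checksum(char ch, uint8_t attr)
{
    return (uint16_t)((unsigned char)ch * 31u + attr * 257u);
}

/* Precomputes the checksum of every candidate (character, attribute). */
static void build_rainbow_table(RainbowEntry table[TABLE_SIZE])
{
    size_t i = 0;
    for (int ch = FIRST_CHAR; ch <= LAST_CHAR; ++ch) {
        for (int attr = 0; attr < ATTR_COUNT; ++attr) {
            table[i].checksum = toy_cell_checksum((char)ch, (uint8_t)attr);
            table[i].ch = (char)ch;
            table[i].attr = (uint8_t)attr;
            ++i;
        }
    }
}

/* Maps a reported checksum back to a cell; NULL if it is not in the table. */
static const RainbowEntry *lookup_cell(const RainbowEntry table[TABLE_SIZE],
                                       uint16_t checksum)
{
    for (size_t i = 0; i < TABLE_SIZE; ++i) {
        if (table[i].checksum == checksum) {
            return &table[i];
        }
    }
    return NULL;
}
```

In practice this would need one checksum query per cell, so a full screen means thousands of round trips, and the table grows quickly once true colours are included.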

alabuzhev commented 5 years ago

@zadjii-msft , thanks, that's a very good point.

It is gratifying to hear that all these improvements are not just to support more programs and increase market share, but Windows Console team also has a non-NIH, non-vendor-lock-in attitude and is concerned about porting 3rd-party apps to other platforms :)

However, it's a bit too late to stay pure. 30 years too late actually. Reading the buffer has been supported since forever, as well as other unique features of Windows Console (and that's why we love it). Apps use them and you can't drop anything because compatibility.

So it all comes down to this:

> Is there a reason you need to read part of the buffer to be able to restore it? Couldn't you reconstruct the output from the data you previously output to the console?

Ok, here is why we do this and what exactly we do: the app is a classic OFM (orthodox file manager) and can conceptually be considered a command processor with advanced features on top of it (directory listing, text editor, dialogs, etc.). It uses the console buffer to render its interface. At some point the user wants to use the "command processor" feature to launch something:

Now, if the launched program decides to use true colours we still can take a snapshot, and it even might be readable (thanks to the approximation you're doing there), but obviously all the colours will be messed up.

eryksun commented 5 years ago

The comparison to the Windows console is a stretch of the imagination, but the Linux console bears mentioning. It's implemented as virtual consoles named /dev/tty[1-63] that emulate a VT102 terminal. From an X11 session, they're commonly accessed via Ctrl+Alt+F[1-6]. /dev/console and /dev/tty0 reference the current virtual console, which is not necessarily the controlling terminal of the current process (i.e. /dev/tty). Of interest to this issue are the /dev/vcs[a][1-63] devices. A vcs[a] device allows reading the screen buffer of the corresponding virtual console, with or without [a]ttributes.
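As an illustration of the vcsa layout, the sketch below decodes one cell from an in-memory copy of such a device's contents. It assumes the layout described in the vcs(4) man page: a 4-byte header (lines, columns, cursor x, cursor y) followed by one 16-bit value per cell in host byte order, with the character in the low byte and the attribute in the high byte (which holds when a 512-glyph font is not in use). `VcsaHeader` and `vcsa_read_cell` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t rows, cols, cursor_x, cursor_y;
} VcsaHeader;

/* Decodes the cell at (row, col) from a buffer holding vcsa contents.
 * Returns 0 on success, -1 if the position or buffer size is invalid. */
static int vcsa_read_cell(const uint8_t *data, size_t size,
                          unsigned row, unsigned col,
                          VcsaHeader *hdr, uint8_t *ch, uint8_t *attr)
{
    if (size < 4) {
        return -1;
    }
    hdr->rows = data[0];
    hdr->cols = data[1];
    hdr->cursor_x = data[2];
    hdr->cursor_y = data[3];
    if (row >= hdr->rows || col >= hdr->cols) {
        return -1;
    }
    size_t offset = 4 + 2 * ((size_t)row * hdr->cols + col);
    if (offset + 2 > size) {
        return -1;
    }
    uint16_t cell;
    memcpy(&cell, data + offset, sizeof cell); /* host byte order */
    *ch = (uint8_t)(cell & 0xFF);   /* character in the low byte */
    *attr = (uint8_t)(cell >> 8);   /* attribute in the high byte */
    return 0;
}
```

On a real system the buffer would be filled by reading the device file, which usually requires elevated permissions.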

Currently, the Windows console provides access to its input buffer and screen buffers via files on \Device\ConDrv, such as Console (reads from the input buffer, and writes to the active screen buffer), CurrentIn, and CurrentOut. These files are surfaced in the Windows API as the DOS devices CON, CONIN$ and CONOUT$. Special functions are required to write to the input buffer (WriteConsoleInput) and read from the screen buffer (ReadConsoleOutput, ReadConsoleOutputCharacter, ReadConsoleOutputAttribute). I'd like to see an alternative that uses VT streams with generic I/O functions instead of INPUT_RECORD and CHAR_INFO arrays with specialized functions.

Another set of ConDrv files could be added that explicitly support reading and writing of VT sequences as UTF-8 text via ReadFile and WriteFile (but not ReadConsole or WriteConsole). For the sake of discussion, call these files VirtualTerminal (reads from the input buffer, and writes to the screen buffer), VTCurrentIn, and VTCurrentOut. Surface them as the Windows devices TTY, TTYIN, and TTYOUT, respectively, but require the \\.\ or \\?\ local-device prefix rather than adding legacy DOS devices. To read the screen as VT text, open "\\?\TTYOUT" with read access (or read-write) and call ReadFile. Use the lpOverlapped parameter to set an initial offset, else use the current cursor position.

HBelusca commented 5 years ago

@eryksun: Note that this then means that conhost has to maintain in memory an extended representation of the buffer that can be read from (and/or written to??) sequentially, mirroring the internal buffer (which certainly stores the info in a completely different way) that conhost uses to display in a terminal window.

Re. the condrv files, there are some that are "exposed" in the embedded strings of the driver: "\Input", "\Output", "\Display", "\ScreenBuffer"; it would be interesting to hear whether they (in particular the last two) could be used for such a purpose.

alabuzhev commented 5 years ago

> To read the screen as VT text

@eryksun, reading the screen as VT text is probably the last thing I'd like to do here:

If we're unlucky and ReadConsoleOutputEx never happens, I'd prefer ReadFile(console_handle, ...) to return an array of fixed-length structures instead, CHAR_INFO-like (character + attributes).
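A sketch of what such a fixed-length cell could look like, purely as an illustration: `CellInfoEx`, its field layout, and the `read_rect` / `write_rect` helpers are hypothetical and operate on a plain in-memory array, not on any real console API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CHAR_INFO-like cell with room for true colour. */
typedef struct {
    uint16_t ch;      /* UTF-16 code unit, as in CHAR_INFO.Char.UnicodeChar */
    uint16_t flags;   /* hypothetical style bits, see CELL_UNDERLINE */
    uint32_t fg_rgb;  /* 0x00RRGGBB */
    uint32_t bg_rgb;  /* 0x00RRGGBB */
} CellInfoEx;

enum { CELL_UNDERLINE = 0x1 };

/* Copies a w x h rectangle out of a row-major screen array. */
static void read_rect(const CellInfoEx *screen, size_t screen_cols,
                      size_t left, size_t top, size_t w, size_t h,
                      CellInfoEx *out)
{
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            out[y * w + x] = screen[(top + y) * screen_cols + (left + x)];
        }
    }
}

/* Writes a previously saved rectangle back to the same position. */
static void write_rect(CellInfoEx *screen, size_t screen_cols,
                       size_t left, size_t top, size_t w, size_t h,
                       const CellInfoEx *in)
{
    for (size_t y = 0; y < h; ++y) {
        for (size_t x = 0; x < w; ++x) {
            screen[(top + y) * screen_cols + (left + x)] = in[y * w + x];
        }
    }
}
```

Because every cell has the same size, save and restore stay simple array copies, which is the property that makes the existing CHAR_INFO-based workflow convenient.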

HBelusca commented 5 years ago

As an alternative, I don't know whether it would be possible for the (attached) console app to request some kind of shared memory buffer from conhost that exposes the contents of the screen.

oising commented 5 years ago

I suspect Microsoft are going to surface the VT parser/buffer they built for conpty at some point in the future when it is stabilized, and this will make life far easier. It needs to get to at least VT420+ level to cover the majority of uses, imo. Oh, and obviously have .NET bindings, and use spans, pipelines and as much allocation-free code as they can muster.

miniksa commented 5 years ago

@oising, good suspicion.

zadjii-msft commented 1 year ago

horrifying thoughts:

alabuzhev commented 1 year ago

@zadjii-msft please see a related discussion. TL;DR: we read the output only to save it, write something on top and then restore it, similarly to what cmd.exe does on F7 & F9. We don't really need to see the actual characters, colors, styles etc., so fully opaque APIs or sequences to take / restore / drop snapshots would do fine.
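To illustrate the shape of such an opaque interface, here is a minimal sketch over a simulated screen. Everything in it is hypothetical: `FakeScreen`, `Snapshot`, and the `snapshot_take` / `snapshot_restore` / `snapshot_drop` functions are invented for this example and do not correspond to any existing console API or VT sequence.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simulated screen: one opaque 32-bit value per cell. */
typedef struct {
    size_t    cols, rows;
    uint32_t *cells;
} FakeScreen;

/* Callers only ever hold a pointer; the contents are never inspected. */
typedef struct Snapshot Snapshot;
struct Snapshot {
    size_t    cols, rows;
    uint32_t *cells;
};

/* Saves the whole screen; returns NULL on allocation failure. */
static Snapshot *snapshot_take(const FakeScreen *s)
{
    size_t bytes = s->cols * s->rows * sizeof *s->cells;
    Snapshot *snap = malloc(sizeof *snap);
    if (snap == NULL) {
        return NULL;
    }
    snap->cells = malloc(bytes);
    if (snap->cells == NULL) {
        free(snap);
        return NULL;
    }
    memcpy(snap->cells, s->cells, bytes);
    snap->cols = s->cols;
    snap->rows = s->rows;
    return snap;
}

/* Restores a snapshot; returns -1 if the screen size has changed. */
static int snapshot_restore(FakeScreen *s, const Snapshot *snap)
{
    if (s->cols != snap->cols || s->rows != snap->rows) {
        return -1;
    }
    memcpy(s->cells, snap->cells, s->cols * s->rows * sizeof *s->cells);
    return 0;
}

/* Releases a snapshot that is no longer needed. */
static void snapshot_drop(Snapshot *snap)
{
    if (snap != NULL) {
        free(snap->cells);
        free(snap);
    }
}
```

The point of the opaque handle is that the application never needs to understand characters, colours or styles; it only asks for the saved state to be put back.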

zadjii-msft commented 1 year ago

Oh, for sure. I remember that thread, that had some good solutions. I was mostly jotting down quick notes we had that came up during another discussion. There's probably folks out there where the "layers" / rectangular operations might not work. For them, full rgb data returned is still probably useful. And doing so probably would have been a lot easier than we thought 🤦