microsoft / terminal

The new Windows Terminal and the original Windows console host, all in the same place!
MIT License
95.57k stars 8.31k forks source link

ConPTY does not pass Device Control Strings (DCS) to 3rd-party terminals #17313

Closed acarl005 closed 3 months ago

acarl005 commented 5 months ago

Windows Terminal version

1.19.11213.0

Windows build number

10.0.22631.0

Other Software

No response

Steps to reproduce

We're thrilled by the release and development of ConPTY. We're hoping it can support our usage of Device Control Strings.

From any 3rd-party terminal that wishes to "speak ANSI" using ConPTY, run any command that outputs a DCS escape sequence. (I'm using a locally built alacritty)

PowerShell 7.4.2
echo "`eP <some data for terminal> `e\"

Alacritty does not receive the DCS.

Expected Behavior

I expect DCS escape sequences to be output by ConPTY so that 3rd-party terminals may use this to implement more advanced features.

Context

I work for Warp and we'd like to bring Warp Terminal to Windows. Warp depends heavily on DCS strings for its key features.

Alternatives Considered

We tried utilizing a custom OSC sequence instead. We notice that unhandled OSC gets flushed to the terminal. Although with this method Warp is able to receive these data as we'd like, the ordering is not preserved, i.e. if the shell outputs an OSC interleaved with text, when it is read out of ConPTY by the terminal, the OSC is not received in the same order in which it was produced. Warp does depend on that ordering, which is an invariant on UNIX PTYs.

It would be nice if DCS had similar logic where _pfnFlushToTerminal was called like it is for OSC, but where ordering is preserved.

Actual Behavior

Only DCS escape sequences that Windows Terminal recognizes are being dispatched. As we understand it, ConPTY is an interface intended to be used by 3rd-party terminals for optimal compatibility. Therefore, it makes sense to expand support for arbitrary DCS sequences that other terminals might depend on.

github-actions[bot] commented 5 months ago

Hi I'm an AI powered bot that finds similar issues based off the issue title.

Please view the issues below to see if they solve your problem, and if the issue describes your problem please consider closing this one and thumbs upping the other issue to help us prioritize it. Thank you!

Closed similar issues:

Note: You can give me feedback by thumbs upping or thumbs downing this comment.

lhecker commented 5 months ago

Did you use the system ConPTY API? Due to the lengthy release cycle of Windows that API is usually quite outdated. We're currently experimenting with shipping ConPTY as a nupkg. You can find a build here: https://terminalbuilds.blob.core.windows.net/builds/Microsoft.Windows.Console.ConPTY.1.19.240130002.nupkg That's the same build that WezTerm uses. There was a discussion about it here: https://mas.to/@DHowett/111909543700190300 The .nupkg contains a prebuilt conpty.dll (= ConPTY client) as well as a v1.19 build of OpenConsole.exe (= ConPTY server).

To be fair, I haven't tried your repro, so my suggestion may be pointless. πŸ™ˆ However, even if that nupkg doesn't fix the issue, you'd still need to use a bundled OpenConsole.exe version to receive an updated version later on.

But the good news is that I'm currently working on replacing the complex ConPTY setup we have. Afterwards it'll pass VT text from your shell, etc., 1:1 directly to the terminal. The bad news is that I am currently working on it. If the above .nupkg doesn't work, and I'm not sure if you can share your timeline of shipping Warp on Windows publicly, but if it's still a while out, you could consider waiting for me to finish my work on that.

abhishekp106 commented 5 months ago

Hi @lhecker thanks for taking the time to respond! I'm another engineer at Warp, working with @acarl005.

Did you use the system ConPTY API?

We followed the instructions outlined in the Mastodon thread you linked but unfortunately using the newer ConPTY build does not solve the issue. Specifically, I still see that DCS strings are not getting sent from ConPTY and that the custom OSCs we use as an alternative are sent out of order.

However, even if that nupkg doesn't fix the issue, you'd still need to use a bundled OpenConsole.exe version to receive an updated version later on.

Noted! We'll ship with these newer bundles instead of relying on the system API.

Afterwards it'll pass VT text from your shell, etc., 1:1 directly to the terminal.

This is exactly what we need, looking forward to this releasing! πŸ₯³ Many of the features we use would depend on this particular part of the PTY since we rely on it to implement many of our key features, like blocks. We can't meaningfully test most of the core functionality of Warp without this. Do you have a rough timeline (weeks/months/quarters) of when this might get shipped? Our intention is not to rush you but just to plan around it!

lhecker commented 5 months ago

Do you have a rough timeline (weeks/months/quarters) of when this might get shipped?

I think a month is a fair estimate for this. I don't think it'll take multiple months. I'm currently working on a prototype, as well as the design spec based on that. After the design has been approved, I'll have to finish implementing the more complex parts of ConPTY within the new architecture. But if my proposal is rejected, I'll happily implement a workaround for you anyway. πŸ™‚

acarl005 commented 5 months ago

@lhecker Awesome! We're happy to hear someone is working on this. May I ask about order preservation in your planned implementation? As for Warp's requirements, we use DCS for things like delimiting our "blocks" and so ordering is an important for our case, e.g. if a process emits something like...

echo "some output `eP <some data for terminal> `e\ some other output"

...then we depend on some output ESCP <some data for terminal> ESC\ some other output to come out of the PTY.

abhishekp106 commented 5 months ago

@lhecker We'd also love to try out the prototype you are building, if you can easily share it! We can try it from a remote branch, a specific build, or anything else that's convenient.

lhecker commented 5 months ago

May I ask about order preservation in your planned implementation?

If you have ENABLE_VIRTUAL_TERMINAL_PROCESSING enabled (I'm sure you do πŸ˜„), then the text will not be touched by ConPTY at all anymore. Warp will receive the VT output entirely unmodified. (It'll go through UTF-16 <> UTF-8 conversion, because our VT parser uses UTF-16, but I'm sure that's beside the point.)

We'd also love to try out the prototype you are building, if you can easily share it! We can try it from a remote branch, a specific build, or anything else that's convenient.

I'm not at a point yet where you could rely on it, but my WIP branch is here: https://github.com/microsoft/terminal/tree/dev/lhecker/goodbye-vtengine If you'd like to try it and need help building OpenConsole and conpty.dll, let me know.

The design spec is here: https://github.com/microsoft/terminal/blob/dev/lhecker/goodbye-vtengine/doc/specs/%2313000%20-%20In-process%20ConPTY.md This issue and your problem will be solved after Step 1 in that design doc already, so you can ignore the rest if you'd like. (Although you're likely to benefit from the remaining 3 steps, since it would significantly improve your performance.) I do have to say again though that everything within the doc still needs to go through review, nothing is final, and the entire thing or parts of it may be scrapped if we find that it's impossible to achieve.

alokedesai commented 5 months ago

@lhecker It sounds like this is essentially "passthrough mode", is that correct?

Please let us know if Warp engineers can help with this migration in any way. We would likely be able to commit at least one engineer to this effort to help get it launched.

lhecker commented 5 months ago

@alokedesai Not in the original sense.

(Some more context for those who may not be super familiar with ConPTY's architecture:) ConPTY parses any VT output of any application and stores the results in a local buffer, so that later calls to the ReadConsole* family of functions can read that output back. This is for instance used by Far Manager to backup and restore the viewport contents. VtRenderer is currently used to "render" those buffer contents back to VT output. This is why the order of operations is lost: VT output gets parsed, thrown away, and then regenerated.

Passthrough mode used to be the idea that we could stop parsing the VT output of any application that promises to only ever use VT and not call anything except for Read/WriteFile or Read/WriteConsole (= primitive text only). We would thus not need to transform any complex Console API calls and be able to pass the VT output directly through to the terminal. This would preserve the order of operations and increase performance, because output would not need to be parsed nor "rendered".

My upcoming proposal on the other hand is to always parse the output, despite the cost. I've improved the performance of our text buffer by >10x over the last 3 years so it's not as much of an issue anymore. My above branch for instance gets ~100MB/s with cat despite the double-parsing, which is not great, but also not terrible. This will allow us to always serve the ReadConsole* calls. Next, my idea is to transform complex console API calls like WriteConsoleOutputCharacter into a series of VT sequences directly (purely algorithmically right within the console API implementation). Lastly, to help terminals not be bottlenecked by the double-parsing, I'm proposing a nano-COM API for ConPTY where it runs within your process, you give it access to your buffer, and it will provide you with a genuine 0-overhead console server in exchange. Windows Terminal would use such an in-process ConPTY without double-buffering, while something like sshd would continue to use a ConPTY that keeps a buffer (since sshd runs remotely).

alokedesai commented 5 months ago

(Some more context for those who may not be super familiar with ConPTY's architecture:) ConPTY parses any VT output of any application and stores the results in a local buffer, so that later calls to the ReadConsole* family of functions can read that output back. This is for instance used by Far Manager to backup and restore the viewport contents. VtRenderer is currently used to "render" those buffer contents back to VT output. This is why the order of operations is lost: VT output gets parsed, thrown away, and then regenerated.

Passthrough mode used to be the idea that we could stop parsing the VT output of any application that promises to only ever use VT and not call anything except for Read/WriteFile or Read/WriteConsole (= primitive text only). We would thus not need to transform any complex Console API calls and be able to pass the VT output directly through to the terminal. This would preserve the order of operations and increase performance, because output would not need to be parsed nor "rendered".

Gotcha, that makes sense. Agreed that an approach that still supports the traditional Console APIs is strictly better

My upcoming proposal on the other hand is to always parse the output, despite the cost. I've improved the performance of our text buffer by >10x over the last 3 years so it's not as much of an issue anymore. My above branch for instance gets ~100MB/s with cat despite the double-parsing, which is not great, but also not terrible. This will allow us to always serve the ReadConsole* calls. Next, my idea is to transform complex console API calls like WriteConsoleOutputCharacter into a series of VT sequences directly (purely algorithmically right within the console API implementation).

I see. So TLDR; whereas previously the VTEngine would asynchronously output escape sequences based on what changed in the output buffer we'll now synchronously parse the output, update the output buffer, and then forward all of the escape sequences to the backing terminal. Is that correct?

alokedesai commented 5 months ago

@lhecker Really excited to see this happen, if Warp engineers can help out with this in anyway please let us know.

This seems like a massive rearchitecture of ConPTY and we would be able to contribute headcount if it would accelerate the process of launching this.

lhecker commented 5 months ago

Is that correct?

As a tl;dr, absolutely!

This seems like a massive rearchitecture of ConPTY and we would be able to contribute headcount if it would accelerate the process of launching this.

Thanks! I'll definitely keep this in mind. πŸ™‚ For now, however, it needs to be approved by the team and probably also run by our division architects, because all 4 steps in the spec together are a huge change that will likely take >1 year to complete.

If you're massively blocked by this issue, and if you have dev time to spare, you could consider hotfixing this issue yourself. I believe what you need is something like this: https://github.com/microsoft/terminal/blob/baba406a074e792dd4486f4239ac4b3292f95a60/src/terminal/adapter/adaptDispatch.cpp#L3762-L3767

If that's correct, then you then need to call out to _dispatch here: https://github.com/microsoft/terminal/blob/baba406a074e792dd4486f4239ac4b3292f95a60/src/terminal/parser/OutputStateMachineEngine.cpp#L717-L756

and do the same TriggerFlush. I suspect that this will fix the issue.

alokedesai commented 5 months ago

Great thank you, we'll try that out! Will that suggested hotfix also solve the ordering problem? (My assumption is no but just confirming).

And to clarify, is sending DCS to the terminal + fixing the ordering problems still something you expect to take 1 month? I assume productionizing the server and shipping it in windows are what would cause this to take 1 year+?

j4james commented 5 months ago

Just FYI, I made a conscious decision not to pass through unrecognised DCS sequences, precisely because they typically won't work, in the same way that unrecognised OSC sequences don't work. All we'll be doing is introducing another set of bug reports for third party terminals, who'll now think they have support for DCS sequences, but which will inevitably fail in unpredictable ways.

If you follow other terminal bug trackers, you can see the kind of problems this causes. And app devs are affected by the issue as well, because instead of a particular sequence simply not being supported on Windows, it appears to work, but has subtle bugs. And they waste a whole lot of time trying to understand and work around those bugs, when it's really not something they can fix.

So if you want DCS support, and want it to actually work, I'd suggest you try and get OSC sequences to work reliably first (assuming you think that's feasible). Otherwise I think we'll just be making things worse for everyone by introduced broken DCS sequences into the mix.

Another option to consider would be adding a pass-through wrapper sequence like some multiplexers support (e.g. see the tmux FAQ). After all, the current implementation of conpty is essentially a kind of multiplexer.

But really I think the best bet is to wait for lhecker's new architecture. If it turns out that isn't going to work, we can always revisit other options later.

abhishekp106 commented 5 months ago

@lhecker What's the simplest way to build your prototype from dev/lhecker/goodbye-vtengine? I'm assuming that all I need is to successfully build contpy.dll and OpenConsole.exe.

When I run the PowerShell function Invoke-OpenConsoleBuild from Windows Terminal I get a lot of errors, many of them related to tests. I assume that this is because it's trying to build every project that is present. (I imagine I don't need to refactor the test cases to get it to work).

I am able to successfully get the conpty.dll file built by using Visual Studio and only checking some of the Projects that are have conpty in the name. How can I get OpenConsole.exe as well? What is the minimum set of projects I need to include to get what I need? I tried checking off all non-test related projects but I get the following errors:

image

Happy to fix some of the smaller errors like type conversions or an incorrect number of arguments, but I figured I should check on what projects I actually need.

Here's the Visual Studio Build > Manage Configuration modal I'm modifying: image

alokedesai commented 5 months ago

@j4james By "getting OSCs to work reliably" do you mean ensuring they are sent to the terminal in an ordered way?

If so, very much agreed. We tried OSCs when we realized DCSs were swallowed by ConPTY and it didn't work because of the ordering issues. Regardless of which sequence we use, the biggest requirement we have is that we must be able to communicate from shell --> terminal in a synchronous way.

lhecker commented 5 months ago

What's the simplest way to build your prototype from dev/lhecker/goodbye-vtengine?

It's a different branch but I did a quick screen recording for how I do it:

https://github.com/microsoft/terminal/assets/2256941/6d204afe-fbaf-493c-8d28-b70bc0802f88

I tried checking off all non-test related projects but I get the following errors:

When it comes to OpenConsole and conpty.dll: I just pushed a commit that fixes the build error. Instead of return !inPtyMode; it should've just said return true;.

j4james commented 5 months ago

By "getting OSCs to work reliably" do you mean ensuring they are sent to the terminal in an ordered way?

@alokedesai That's one issue. Another problem is that the position of the cursor when you receive the sequence won't necessarily be the position of the cursor when it was sent, so you can't depend on that. And if you change the cursor position yourself in response to a passed-through sequence, that has the potential to misposition any subsequent output. Or if you produce any output yourself in response to a passed-through sequence, that will potentially be overwritten by the conpty renderer at a later stage.

In short, there are very few scenarios that can reliably work with a passed-through sequence if conhost isn't specifically designed to handle that sequence. As a rule of thumb, if you can't make it work in tmux, it's unlikely to work through conpty.

alokedesai commented 5 months ago

@j4james @lhecker Thanks for the additional info.

Warp blocks work by sending a custom DCS from shell --> terminal when a command starts and finishes executing (see a blog post about this here).

Whenever a command finishes, we create a new grid so that contents from each command are fully isolated from each other.

Given that background, I don't think a fork of ConPTY that adds support for a custom OSC (or DCS) would fix the issues we're seeing. This is because ConPTY would have to actually update its own internal data model to reflect the fact that Warp is going to create a new grid, which I think is impossible to do perfectly (but I may be overestimating the complexity of this).

Assuming my understanding is correct (I'd love confirmation of that / whether there's an obvious solution I'm missing) I believe we have to wait for changes outlined in step 1 of that design doc to land. Is ~1 month still accurate?

Happy to Zoom about this if easier, I believe we chatted with some of your coworkers in the past (@DHowett and Carlos Zamora).

acarl005 commented 5 months ago

And if you change the cursor position yourself in response to a passed-through sequence, that has the potential to misposition any subsequent output

Yeah, by starting a new "block" we're basically starting a new grid, which sets the cursor position back to (0, 0). This puts us out of sync with the console output buffer unfortunately.

alokedesai commented 4 months ago

Hi @lhecker, following up here one more time. I noticed that the original design spec you posted no longer exists.

Is this something you're still working on?

DHowett commented 4 months ago

(Replied via e-mail!)

zadjii-msft commented 4 months ago

(for everyone not on the mail thread (me too), the spec is in #17387)

lhecker commented 4 months ago

Yep, exactly, the spec moved over to a PR now, while I continue to implement the new ConPTY. I'm currently working on a new "cooked read" implementation (= Windows' readline-like implementation for scanf and friends). The last remaining API I need to implement is SetConsoleActiveScreenBuffer, and afterwards I need to fix any remaining bugs. The good news is, we're definitely on track. But it's also not an easy change, so I'll unfortunately be largely unreachable for a while longer while I focus on completing this work.

alokedesai commented 3 months ago

Hi @lhecker, thanks for all of your hard work on this. Really exciting to see some of the PRs go by.

I'm trying to better understand the end-state after this work is complete: under what circumstances will ConHost rely on the output buffer and try to read-back what's changed to send VT sequences to the terminal?

I originally was under the impression that this change would completely remove the need for ConHost buffer, but I realize that's not possible because it it would break console functions that require reading the buffer (such as GetConsoleCursorInfo).

This has an impact on Warp because our data model isn't a simple grid, so I'm trying to think through if the new architecture would actually fully solve our requirements.

lhecker commented 3 months ago

ConPTY, the way it works right now (where it emits a pure stream of VT), will continue to exist forever, because applications like sshd don't really have a buffer to read from either.

The new architecture proposed in the spec is a low-level layer on which ConPTY is built on top of. If you give it the ability to read back various info, it gives you a practically zero overhead access to the Windows console API and complete freedom to determine what constitutes a grapheme cluster, etc.

In other words, you really don't need to use it, unless you want to or otherwise have a specific need for it.

Although I'm somewhat surprised you mentioned GetConsoleCursorInfo, since presumably, you should know whether the current cursor is visible or not, right? (The CONSOLE_CURSOR_INFO::dwSize can be hardcoded to 25.)

The 2nd hardest thing in the low-level API is likely implementing the read-back functionality for the buffer contents. But I think it's feasible to implement for most, because the "grid" for reading is conceptually identical to the grid for writing VT. I know that Warp divides output into "Blocks" but I'm assuming it still has support for CUP sequences and the like, so it has support for a VT viewport and its grid too, right? Since reading from the buffer uses the same grid, I think it should generally be possible to implement with your buffer architecture as well.

The hardest API is likely the fact that the console API allows the client to create arbitrarily many off-screen text buffers whose size differs from the window size (CreateConsoleScreenBuffer). The client can switch between the different buffers (SetConsoleActiveScreenBuffer) and ideally the window size would change depending on the active buffer. But even here, I think it's possible to be somewhat "loose" with the implementation of this API, and if you disallow clients to change the window size via SetConsoleActiveScreenBuffer it probably still works with most applications. (For instance, Windows console applications sometimes use such off-screen buffers to calculate the size of Unicode text, because ambiguous width glyphs are... well... ambiguous. πŸ˜…)

All that said, and as I said above, you really don't need to use it if you have no need for it.

lhecker commented 3 months ago

The related PR is now more or less done, although I expect it to receive smaller fixes in the future. If you'd like to experiment with it, you can download our Canary build here: https://github.com/microsoft/terminal/discussions/16121

It currently includes my PR as a test. Theoretically it should be possible for you to replace your current OpenConsole.exe with the one from the Canary build and it should just work (you can find OpenConsole.exe inside the 3 portable ZIPs). I recommend using overlapped IO for optimal performance. If you find any bugs or issues, please let me know!

alokedesai commented 2 months ago

Apologies for the delay in response here. Thanks for the detailed response; we'll test this out locally and file any issues if we see them!

alokedesai commented 2 months ago

Out of curiosity, do you have a sense of when that change would hit preview? Is this the issue I should be following? https://github.com/microsoft/terminal/issues/17643

lhecker commented 2 months ago

We'll likely release this in preview at the end of this month, but there's no fixed date yet. I don't think I'll be able to address every point in #17643 any time soon, so it's not going to be closed by then. It's enough to keep following this issue, because our bot will write here whenever it gets released. πŸ™‚