Closed FrancescElies closed 2 months ago
Regarding that PSEUDOCONSOLE_INHERIT_CURSOR
comment:
In #17510 I added a timeout to cursor inheritance so in the worst case it will only block for 3 seconds. I was about to repurpose this issue to track the remaining problem: A shutdown signal should kill ConPTY instantly and not only after 3 seconds (hence the edits on your message). But then I realized that I also fixed that, so I think this has actually fundamentally been fixed in that PR.
However, then I also realized that this can't cause the freeze on shutdown. I've replaced ConPTY with a build from main
in my local wezterm and it still happened. This indicates that the freeze occurs due to a different issue.
I don't know whether that causes your specific deadlock, but wezterm does definitely (and unfortunately) make the primary mistake when using the ConPTY API: It fails to read from the output pipe when calling ClosePseudoConsole
. This is also documented here: https://learn.microsoft.com/en-us/windows/console/creating-a-pseudoconsole-session#preparing-the-communication-channels
To reproduce the issue, run a child process inside your shell (for instance WSL), print a ton of text in large chunks (for instance with cat
= chunk size 128KiB) and close the tab. The deadlock inside OpenConsole then looks like this:
This is one of the reasons why I've chosen to remove the blocking wait from the ClosePseudoConsole
API. Starting Windows 26100 (Windows 11 24H2) it'll now instantly return without waiting for the PTY to exit. (This reminds me that I need to update the winconpty package to also use that new, safer behavior.)
My suggestion for wezterm is that it reads from the output pipe until ReadFile
either returns FALSE
or lpNumberOfBytesRead
is zero. I recommend against testing for an ERROR_BROKEN_PIPE
error, because for regular pipes that's the same anyway, but for special pipes that support "graceful shutdown" (like TCP's FIN/ACK), ReadFile
will return TRUE
while lpNumberOfBytesRead
is zero. This still means that the other side wants to end things.
With the next version of ConPTY at the end of this month, the new, safer API will be available. It'll allow you to safely call ClosePseudoConsole
on the UI thread and/or in between Read/WriteFile calls. Even better, it'll support overlapped IO, which may be helpful for wezterm, considering that it uses poll()
for UNIX as well, from what I can tell. That way both implementations can use async IO with the PTY.
P.S.: I just noticed that wezterm calls it "psuedo". Not just the flag, but also the entire file ("psuedocon.rs"). π
Thanks for your quick reply.
Iβm a regular Wezterm user, what i noticed is that ClosePseudoConsole is blocking indefinitely (definitely more than 3 seconds).
Next week I should be able (hopefully will be easy) to compile conpty.dll from this repo from the main branch and reproduce the test wit βcatβ and closing the term.
Psuedocon typo i can report it back to wezterm devs, doing a pr with the rename might be received as inappropriate.
Regarding your reafile suggestion sounds something like I could try to make a patch for but apart from referring tour comment i am not sure i will be able to skillfully respond any questions reviewers might ask.
PS: how did you catch de deadlock stack? With procdump, could you give some details on that?
Next week I should be able (hopefully will be easy) to compile conpty.dll from this repo from the main branch [...]
If you hit any problems and got questions, please let me know! I'll try to get the ConPTY API updated by then.
[...] but apart from referring tour comment i am not sure i will be able to skillfully respond any questions reviewers might ask.
If the reviewers have any questions specific to ConPTY and you aren't sure about, please feel free to ping me. π (I'm familiar with Rust so that shouldn't be a problem either. π )
PS: how did you catch de deadlock stack?
That's Process Hacker, now renamed System Informer (probably to avoid getting banned by non-IT admins). It's similar to Microsoft's own Process Explorer, but in my opinion it's better because I find it visually more pleasing, it has more features, and it's open source on GitHub.
Stacks can be observed by double-clicking any application, clicking on "Threads", sorting by TID (so that their order is stable and stops changing), and click each individual thread. System Informer will automatically download PDBs and resolve symbols for you.
Of course, this only works for basic triage, but in my experience that's often enough to get an initial idea of the problem.
That aside, if you open its Options dialog, you'll find a ton of features that you can optionally enable. I wouldn't recommend enabling them outright, but I think it's worth toying around with them. Something I use for basic testing of our text renderer for instance, is under "ExtendedTools". There, you can check "Enable FPS monitoring" and you'll be able to see DWM present statistics for each individual application.
All of these are also super useful:
These are great:
You can alternatively also click this button:
These are completely nuts:
None of this requires a kernel driver, and it really only scratches the surface. System Informer is one of the greatest Windows apps I know.
Leonard thanks for the crash course on Process Informer, in the passt I installed but felt overwehlmed the the amount of options and I used to use only find handles or dlls
, you convinced me to install it again and invest some time with it.
I am trying to give you some more info to make sure what we see it's the same issue you described, today it hanged again and decided to collect infos the way you showed me.
The main problem was identifying which OpenConsole
to inspect, how do I know?. As you can see I started wezterm from windows term to see logs and had several sessions open.
Since I did not now which OpenConsole
to inspect I decided to create a dump for each of them and wezterm
too, in this case with procdump and a bit of shell ps | find opencons | each { procdump -ma $in.pid}
and inspected these files with windbg.
Out of those dumps I could see wezterm-gui
Openconsole (most of them having a similar stack)
Except one, which was showing something with ticket_lock
, I suspect this was the one hanging?π₯
broot
or nvim
(not sure right now)1.19.240130002
, couldn't see where the current version is defined in winconptyReadFile
into wezterm and see if that helpsPS: I am happy to learn better ways of doing things, please feel free to suggest me at any time how you would do it if you know a nicer way.
When you check out these older OpenConsole versions (prior to #17510), you need to check the stack of the RenderThread::s_ThreadProc
thread. That's the thread which calls WriteFile
on the output pipe that wezterm is supposed to read from. While it's writing into the pipe it's holding the ticket lock and if no one reads from the pipe it holds the lock forever.
If I'm correct, then that's why the ConsoleIoThread
(= handles incoming API calls) is stuck waiting to acquire the lock.
I will need some help building the latest conpty.dll, I read building.md
and ran Invoke-OpenConsoleBuild
I got some errors
"C:\s\oss\terminal\OpenConsole.sln" (default target) (1) ->
"C:\s\oss\terminal\src\tools\TerminalStress\TerminalStress.csproj" (default target) (114) ->
C:\s\oss\terminal\src\tools\TerminalStress\TerminalStress.csproj : error NU1101: Unable to find package Microsoft.Wi
ndowsDesktop.App.Ref. No packages exist with this id in source(s): C:\Program Files\dotnet\library-packs, TerminalDepe
ndencies
C:\s\oss\terminal\src\tools\TerminalStress\TerminalStress.csproj : error NU1101: Unable to find package Microsoft.NE
TCore.App.Ref. No packages exist with this id in source(s): C:\Program Files\dotnet\library-packs, TerminalDependencie
s
C:\s\oss\terminal\src\tools\TerminalStress\TerminalStress.csproj : error NU1101: Unable to find package Microsoft.As
pNetCore.App.Ref. No packages exist with this id in source(s): C:\Program Files\dotnet\library-packs, TerminalDependen
cies
Despite the errors I got a conpty.dll built.
β― : fd -HI conpty.dll
bin\x64\Debug\conpty.dll
obj\x64\Debug\winconpty.DLL\
obj\x64\Debug\winconpty.DLL\conpty.dll.recipe
obj\x64\Debug\winconpty.DLL\winconpty.DLL.tlog\
obj\x64\Debug\winconpty.DLL\winconpty.DLL.tlog\winconpty.DLL.lastbuildstate
terminal on ξ main [?] via .NET v8.0.303
β― : sigcheck -nobanner -q -n bin\x64\Debug\conpty.dll
n/a
Sadly I can't see it's version, I guess because I am building directly from master..
I will put that dll into wezterm and and modify wezterm to call ReadFile
before closing, let's see how that one works out.
I tried to cat and close a pane with a 1MB file big, yes the term blocked but eventually it got closed as soon as cat was done doing it's thing.
I will need some help building the latest conpty.dll, I read
building.md
and ranInvoke-OpenConsoleBuild
Despite the name, that command builds the entire solution including Windows Terminal and so it takes a long time to complete. It'll also build in Debug mode by default. Here's a command to compile it in Release mode and to only compile the two projects that are needed:
Invoke-OpenConsoleBuild '-p:Configuration=Release' '-t:Conhost\Host_EXE,Conhost\winconpty_DLL'
You can also compile it from within Visual Studio if you open the solution:
The output will be in bin\x64\Release
.
Sadly I can't see it's version, I guess because I am building directly from master..
Yes, we add the version in our build pipelines as a separate step.
I will put that dll into wezterm and and modify wezterm to call
ReadFile
before closing, let's see how that one works out.
If I'm correct with my previous assumption, you don't need this new version to fix this issue in wezterm. The issue is merely that it needs to continue reading from the output pipe until after ClosePseudoConsole
has returned. The new version is simply faster and more robust, but it's not required to fix this.
Also, yesterday I realized that wezterm can simply close/drop the output pipe handle before calling ClosePseudoConsole
. That would also fix the issue.
Other Software
wezterm 20240807-131622-ee063330
I am opening an issue here because as I understood conpty code lives here in this repo
Steps to reproduce
Sadly we couldn't manage to get consistent way to reproduce this.
We see wezterm in windows sometimes hanging when closing panes, e.g. https://github.com/wez/wezterm/issues/5882
I am suspecting this might be related to https://github.com/microsoft/terminal/issues/1810
this comment points out to a race condition, when ConPTY is being started with
CreatePseudoConsole()
with the flagPSEUDOCONSOLE_INHERIT_CURSOR
(which wezterm does) this kind of fit with the fact we can't consistenly reproduce this issue.Here the start call in wezterm.
Actual Behavior
ClosePseudoConsole
hangs