Closed mqudsi closed 5 years ago
@mqudsi - You will need Windbg/kd for that. Once the system hangs, you will have to break into the debugger and see if there is a deadlock between lxcore processes (try the !stacks debugger extension).
@sunilmut What are the "lxcore processes"? Right now on build 17074 I can see the following: the launcher process (wsl.exe), optionally a distribution launcher (e.g. ubuntu.exe), the WSL host (wslhost.exe), and the LxssManager service running inside one of the svchost.exe instances. And then of course the WSL processes themselves, but you have blocked those from being opened with anything but PROCESS_QUERY_LIMITED_INFORMATION :( (what's the point in that anyways?)
@poizan42 - He means ELF processes (/bin/bash, etc).
@benhillis, but you just get access denied if you try to open those in WinDbg. I actually tried using Process Hacker to launch WinDbg with System integrity and all 31 privileges activated, and it is still blocked, so seemingly they can't be opened from user mode at all for debugging.
@poizan42 - If there's a deadlock it's going to be in kernel mode, not user mode so attaching a user mode debugger isn't going to be useful. The easiest way for us to debug subsystem hangs is by looking at a memory dump. It's likely this is #2849 for which we have a fix inbound.
FWIW, it's unlikely this was the same issue, as it was the shutdown sequence for neovim that caused the hang in this particular case, which shouldn't have referenced anything outside the WSL environment.
@mqudsi - In that case if you could collect a memory dump and forward it along to secure@microsoft.com it would be greatly appreciated.
@benhillis That makes sense, but since you need to have kernel mode debugging enabled already, it won't help much if you are randomly encountering a hang, unless you can reproduce it or happen to be running with kernel mode debugging enabled, which didn't sound like the situation for @mqudsi.
Actually Process Hacker can use its kernel driver to show the kernel mode stack which might be the best thing available in this case.
@mqudsi Did you find a fix for your freezing problem? I've started getting it today on build 17074.
@poizan42 - Yes, that's mostly correct. If you are encountering a hang in launching bash and it feels like a deadlock, then there are two options:
@sunilmut Do you guys at MS run with live kernel debuggers, or do you generally generate and then debug crashes?
If you do run with live kernel debugging machines attached, I'm wondering if you guys literally frankenstein together two PCs or if you have special hardware.
@fpqc it used to be so hard, but these days a bidirectional usb 3.0 a-a cable is all it takes.
@mqudsi Neat! Is local kdb suitable for debugging something like WSL? If not, what about offline debugging with the LiveKD tool? It looks like these features were added recently: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/performing-local-kernel-debugging
They're actually old features, but the article was recently updated. So long as your PC isn't crashed (BSoD/GSoD) or totally hung, local kdb is fine.
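For anyone who wants to try local kernel debugging, here is a rough sketch of the setup steps as I understand them from the linked article. The two bcdedit commands are the ones that article describes; the Python wrapper, the function name and the dry_run parameter are just my own illustration. The commands need an elevated prompt and a reboot before they take effect, after which the debugger is started locally with `kd -kl` or `windbg -kl`.

```python
import subprocess

# Commands from the local kernel debugging article; run elevated, then reboot.
LOCAL_KD_SETUP = [
    ["bcdedit", "/debug", "on"],          # enable kernel debugging at boot
    ["bcdedit", "/dbgsettings", "local"], # use the local debugging transport
]


def enable_local_kernel_debugging(dry_run=True):
    """Return the setup commands as strings; run them only if dry_run is False.

    Hypothetical helper for illustration. With dry_run=True nothing on the
    machine is changed, so the commands can be reviewed first.
    """
    lines = []
    for cmd in LOCAL_KD_SETUP:
        lines.append(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if bcdedit reports failure.
            subprocess.run(cmd, check=True)
    return lines


if __name__ == "__main__":
    for line in enable_local_kernel_debugging():
        print(line)
```

By default this only prints the commands, which seems like the safer choice for something that changes boot configuration.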
A typical setup is a dev box and a test machine or test VM with a kernel debugger attached. Personally I use one physical machine and a couple VMs with different memory and virtual processor counts.
I've experienced this a number of times, but pretty sporadically, and I'm not sure how to go about getting information that would help debug this.
Sometimes after performing an action in the WSL environment, I end up with a completely deadlocked lxss where all existing (launched) WSL processes immediately become unresponsive and any new commands (even just bash) hang indefinitely. I just experienced this on 16299.
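Since the hang is sporadic, one way to notice it as soon as it happens is a small probe that launches a trivial command with a timeout. The sketch below is only an illustration: the function name `probe`, the 10-second default and the `bash.exe -c true` command are my own assumptions, not something the WSL team provides. It tells you that a new launch hung, not why; the kernel stacks or a memory dump are still needed for that.

```python
import subprocess
import sys


def probe(command, timeout_s=10.0):
    """Run a command and classify the outcome as 'ok', 'failed' or 'hung'."""
    try:
        proc = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The executable could not be started at all (e.g. not found).
        return "failed"
    try:
        code = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best-effort cleanup. We deliberately do not wait afterwards,
        # because a process stuck in the kernel may never exit.
        proc.kill()
        return "hung"
    return "ok" if code == 0 else "failed"


if __name__ == "__main__":
    # Assumed probe command: start bash through the Windows launcher.
    status = probe(["bash.exe", "-c", "true"])
    print(status)
    sys.exit(0 if status == "ok" else 1)
```

Running this from Task Scheduler every minute or so would at least give a timestamp for when the subsystem stopped responding.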