Closed johrstrom closed 2 months ago
Note that #3511 isn't causing this directly - but could have uncovered it. If the dashboard hangs, it's not likely to have any more open files. At that point, the old implementation, which used `lsof` to check for apps, would have indicated there are no running apps, and the PUN could be restarted.
But since we started to use `ps`, `ps` still sees this app as running and therefore won't stop the PUN.
I'm able to replicate this in dev when the project NFS drives are behind a firewall (i.e., any attempt to access them hangs forever).
However, if I do this work in another thread - I cannot stop/kill that thread. (When doing the work in the main thread, a ctrl+c does not stop the main thread either.) So I'm trying to work out how I can in fact kill a thread when it's in this state.
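For illustration, here is a minimal sketch of the "do the work in another thread" approach described above. It is not the dashboard's actual code: `with_thread_timeout` is a hypothetical helper name, and the timeout values are arbitrary. It relies only on `Thread#join(limit)` returning `nil` when the thread has not finished in time, so the caller can at least stop waiting even if the worker itself stays stuck.

```ruby
# Hypothetical helper (not part of the dashboard): run a block in a worker
# thread and stop waiting for it after `timeout` seconds.
def with_thread_timeout(timeout)
  worker = Thread.new do
    begin
      yield
    rescue SystemCallError => e
      # Return ordinary filesystem errors (e.g. Errno::ENOENT) as values
      # instead of letting them re-raise from join/value.
      e
    end
  end

  # Thread#join(limit) returns nil if the thread is still running after
  # `limit` seconds, and the thread itself otherwise.
  if worker.join(timeout)
    worker.value
  else
    # This is only a request. A thread blocked inside a system call on a
    # hung NFS mount may not act on it until the call returns, which
    # matches the "cannot stop/kill that thread" behavior described above.
    worker.kill
    :timeout
  end
end

# Usage: a healthy path returns a File::Stat; a call that never returns in
# time yields :timeout (simulated here with sleep, which is an assumption
# standing in for a hung File.lstat).
stat = with_thread_timeout(2) { File.lstat(Dir.pwd) }
hung = with_thread_timeout(0.1) { sleep 5 }
```

This only keeps the caller from blocking; it does not solve the underlying problem of reclaiming the stuck worker, which is the open question here.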
This is a duplicate of #240
We're having issues at OSC with PUNs getting into bad states. Looking at a PUN that's been running for some time (over a week at this point), running `kill -3` on a process gave this stack trace, where the dashboard is waiting on `File.lstat` to return.