Explicit killing - Githubissues

AndrewRadev commented 12 years ago

During my work on a plugin, I noticed two problems with vimrunner:

With a single Vim instance used to run all the tests, at some point it seems to just hang and do nothing. This only occurred with a terminal Vim, GUI Vim always seems to run fine.
With a Vim instance per test, after the tests are over, a lot of leftover Vim instances are still alive.

The first problem is still a mystery to me, but it seems like the second problem is caused by the Vim processes simply not being killed properly.

Now, according to the PTY documentation, PTY#spawn detaches the process when it's done with it and dedicates a separate thread whose purpose is to kill it. If I monitor what's happening with htop, it does seem like each vimrunner gets a ruby thread next to it once it's done. I'm only guessing, but I think that, afterwards, after the master process is over, the threads are just not waited for, so the processes never get the chance to get reaped. This would explain why a single process doesn't seem to have this problem -- it's just fast enough to kill it at the end, so everything seems fine. But, again, this is only my guess. I played around a bit with Thread#wait and Process.wait, but I couldn't figure out something that works.

Explicitly sending a Process.kill seems to clean things up nicely, but the signal needs to be 9. The usual 15 doesn't do a thing for some reason, at least for me. Since the kill call is meant to just get rid of the process, this seems okay. Just in case, I put a noswapfile nobackup in the vimrc to avoid accidental leftovers, though I'm not entirely sure if something like this is necessary. I also wrote some specs, but even without the kill, they only fail occasionally. They do seem to fail more often than not, but it's still kind of iffy.

@mudge, if you have some time one of these days, could you check this out and see how it works on a Mac? I'd also really appreciate your thoughts on the matter and whether this is a problem on OSX as well. You can test the leftover Vim instances on your own, but you could also check out my switch.vim project -- I spawn a Vim instance per test there, so you should be able to see the issue manifesting.

mudge commented 12 years ago

Apologies for the delay in looking at this, I'll test it out on switch.vim and let you know my findings.

mudge commented 12 years ago

Using this branch, the full switch.vim test suite passes on my Mac but without it, you're right: loads of Vim instances are left running (seemingly one for each failed spec).

When I investigated with ps: no associated ruby processes were running: it was as if detaching did not also kill the processes. Your new version explicitly states what we want to happen but I agree that this behaviour is ill-understood: why does detaching a headless process seem to kill it fine but not if it has a GUI?

I experimented with blocking until Process#detach finished but this seemed to hang indefinitely:

thread = Process.detach(@pid)
puts "Waiting..."
thread.join
puts "Done."

Would never reach "Done" and simply keep the Vim instance running until I manually killed it.

AndrewRadev commented 12 years ago

why does detaching a headless process seem to kill it fine but not if it has a GUI?

It doesn't seem like this problem is limited to the GUI. If you run this on a linux box with terminal Vim, you'll notice the same issue, it just won't be this visible.

When I investigated with ps: no associated ruby processes were running: it was as if detaching did not also kill the processes.

I noticed associated ruby threads (at least I think they're threads) in htop, while the specs were running, but not afterwards, no.

I experimented with blocking until Process#detach finished but this seemed to hang indefinitely:

Same here. I think this may have to do with the fact that Vim is an interactive process. Even if you wait for it to finish, it's never going to finish on its own. I don't know, honestly, this whole "spawn a separate thread to kill a detached process" seems like a weird decision to me, so I can't really speculate much without knowing what this decision is based on. I'm following a ruby mailing list for Unix systems programming in ruby (http://www.ruby-forum.com/topic/2535619), so I'm going to ask there for some more information one of these days.

For now, I'm going to merge this into master, since I really want to avoid this problem with cleanup. I'll write here if I learn anything new on the matter.

mudge commented 12 years ago

We can probably drop the call to detach if we are explicitly killing the process as that now takes its place. The one thing to watch out for is rescue Errno::ESRCH being too defensive in this regard.

AndrewRadev commented 12 years ago

You're probably right about removing the call to detach. I've pushed a commit to remove it.

As for the Errno:ESRCH, if the process dies by itself, the Process.kill would fail due to not finding a process with that pid. I think there was a ChildExited exception, but I think it gets raised when trying to read the child's output, which we don't do, so we probably couldn't rely on that.

Incidentally, I started writing an email to "usp.ruby" regarding Process.detach and in the process, I figured out the problem :). Reading the documentation again, I noticed the description of the spawned thread to be:

whose sole job is to reap the status of the process pid when it terminates.

The important part is "when it terminates". Basically, the detach doesn't take care of terminating the child, only for collecting the status if it does terminate. That's why issuing a thread.join hangs forever, the Vim instance just sits there doing nothing. A Process.kill really is required if you want to stop the process.

AndrewRadev / vimrunner

Explicit killing #10