Closed JoshCheek closed 8 years ago
Update: This program, when run will hang indefinitely. When interrupted from the shell (both fish and bash do this), it kills the process, its child, and its grandchild. When interrupted with SIGINT, from another process, it kills the parent and the child, but not the grandchild. The documentation assumes a deeper knowledge of Unix than I have. Trying to read explanations of processes / process groups / sessions / etc has not improved clarity. It may be possible to prevent it from hanging, though, which would be something, at least.
require 'timeout'
require 'rbconfig'
# helper methods
def annotate(description, *allowed_exceptions, &block)
success = false
puts "\e[33mPRE #{description}\e[39m"
block.call
success = true
rescue *allowed_exceptions
success = true
ensure
puts "\e[3#{success ? 2 : 1}mPOST #{description}\e[39m"
end
def close_all(*streams)
streams.each { |io| io && !io.closed? && io.close }
end
# setup streams
parentout, childout = IO.pipe
parenterr, childerr = IO.pipe
childin, parentin = IO.pipe
# clean up when exiting the program
at_exit do
annotate("ensure streams closed") { close_all parentout, childout, parenterr, childerr, childin, parentin }
end
# the child program has its own child, both of which will hang waiting on the pipe to be closed
parentin.write <<-RUBY
puts "hello, world"
read, write = IO.pipe
if fork
write.puts "parent"
else
write.puts "child"
end
read.read
RUBY
# evaluate the code in a child process
child_pid = Kernel.spawn(RbConfig.ruby, '-', in: childin, out: childout, err: childerr)
# close streams talking to the child so that it could exit
[parentin, childout, childerr, childin].each(&:close)
# print anything we see the child print
consumer_thread = Thread.new do
parentout.each_line { |line| p stdout: line }
parenterr.each_line { |line| p stderr: line }
end
# wait for child to complete, timing out if it does not
begin
Timeout.timeout 2 do
annotate("wait for the child we started") { Process.wait child_pid }
annotate("wait for all children", Errno::ECHILD) { Process.wait }
annotate("joining consumer thread") { consumer_thread.join }
end
rescue Timeout::Error
annotate("killing child") { Process.kill "KILL", child_pid }
annotate("joining consumer thread") { consumer_thread.join }
end
While debugging, I changed the way the child is invoked. IDK if this is relevant, but it took a bit to figure out, so here's the diff (print statements were because I was trying to understand why it didn't resolve this issue)
diff --git a/lib/seeing_is_believing/evaluate_by_moving_files.rb b/lib/seeing_is_believing/evaluate_by_moving_files.rb
index a48e69e..fa07d29 100644
--- a/lib/seeing_is_believing/evaluate_by_moving_files.rb
+++ b/lib/seeing_is_believing/evaluate_by_moving_files.rb
@@ -108,7 +108,7 @@ class SeeingIsBelieving
out: child_stdout,
err: child_stderr,
child_eventstream => child_eventstream }
- child = Process.detach Kernel.spawn(env, *popen_args, opts)
+ child = Kernel.spawn(env, *popen_args, opts)
# close child streams b/c they won't emit EOF
# until both child and parent references are closed
@@ -130,15 +130,26 @@ class SeeingIsBelieving
# wait for completion
Timeout.timeout timeout_seconds do
- exitstatus = child.value.exitstatus
- consumer.process_exitstatus exitstatus
+ puts "pre wait"
+ Process.wait child # for the child we started
+ status = $?.exitstatus
+ puts "status: #{status.inspect}"
+ begin Process.wait # for any children it may have started
+ rescue Errno::ECHILD
+ end
+ puts "post wait"
+ consumer.process_exitstatus status
consumer_thread.join
end
rescue Timeout::Error
- Process.kill "KILL", child.pid
+ puts "rescue timeout"
+ Process.kill "SIGKILL", child
consumer.process_timeout timeout_seconds
+ puts "pre join"
consumer_thread.join # finish consuming events
+ puts "post join"
ensure
+ puts "ensure streams closed"
[stdin, stdout, stderr, eventstream].each { |io| io.close unless io.closed? }
end
Ugh, I think the solution I have kills parents, too -.- Here's a diff of where I'm at:
diff --git a/lib/seeing_is_believing/evaluate_by_moving_files.rb b/lib/seeing_is_believing/evaluate_by_moving_files.rb
index 9f8a5eb..6d12abe 100644
--- a/lib/seeing_is_believing/evaluate_by_moving_files.rb
+++ b/lib/seeing_is_believing/evaluate_by_moving_files.rb
@@ -104,11 +104,14 @@ class SeeingIsBelieving
)].pack('m0')
# evaluate the code in a child process
- opts = { in: child_stdin,
+ opts = {
+ child_eventstream => child_eventstream,
+ in: child_stdin,
out: child_stdout,
err: child_stderr,
- child_eventstream => child_eventstream }
- child = Process.detach Kernel.spawn(env, *popen_args, opts)
+ pgroup: true, # run in its own process group so we can kill any children it spawned
+ }
+ child_pid = Kernel.spawn(env, *popen_args, opts)
# close child streams b/c they won't emit EOF
# until both child and parent references are closed
@@ -130,17 +133,22 @@ class SeeingIsBelieving
# wait for completion
Timeout.timeout timeout_seconds do
- exitstatus = child.value.exitstatus
- consumer.process_exitstatus exitstatus
+ Process.wait child_pid
+ consumer.process_exitstatus $?.exitstatus
+ # begin Process.wait # for any grandchildren
+ # rescue Errno::ECHILD
+ # end
consumer_thread.join
end
rescue Timeout::Error
- pgid = Process.getpgid(child.pid)
- handler = trap("INT") { trap("INT", handler) } # noop
- Process.kill "-INT", pgid
consumer.process_timeout timeout_seconds
- consumer_thread.join # finish consuming events
ensure
+ begin
+ child_pgid = Process.getpgid(child_pid)
+ Process.kill "-INT", child_pgid
+ rescue Errno::ESRCH
+ end
+ consumer_thread.join
[stdin, stdout, stderr, eventstream].each { |io| io.close unless io.closed? }
end
In TextMate, when I run this code, it hangs indefinitely, even though there is a timeout set. Probably something in the parent process waits (probably either the consumer or EvaluateByMovingFiles). Or maybe the fork prevents the timeout from working correctly. Regardless, users shouldn't have to go grepping and killing processes on their own, SiB should respect the timeout.
There is a surprisingly large number of processes spawned to accomplish SiB on my machine. I've analyzed them and know where they all come from, except for why there is three of the bottommost invocation: