thoughtbot / cocaine

A small library for doing (command) lines.
https://robots.thoughtbot.com
Other
785 stars 55 forks source link

Deadlock when spawned process writes to stderr #96

Closed kilink closed 6 years ago

kilink commented 7 years ago

When a process spawned by Cocaine writes sufficient data to stderr, both the processes will hang indefinitely. This is because Cocaine is blocked reading from stdin, while the process eventually blocks writing to stderr when the internal buffer for the pipe becomes full.

The issue can be reproduced with the following trivial script:

#!/usr/bin/env ruby

1000.times do
  STDERR.puts "*" * 1000
end

Called via Cocaine, it blocks:

Cocaine::CommandLine.new("./hang.rb").run # hangs forever

Using strace, you'll be able to see that the invoked process is blocked in a write syscall on the STDERR file descriptor. The other giveaway is that this does not block:

Cocaine::CommandLine.new("./hang.rb", "", swallow_stderr: true).run

Obviously that's not ideal, since we may want to capture both stdout and stderr.

This Zendesk blog post touches on the issue briefly in the section entitled "Bonus round: avoiding deadlocks." The solution is to avoid making blocking calls like read, and using IO::select in a loop to figure out which file descriptors are ready to be read from. Here is how it is implemented in the subprocess gem.

I see potentially fixing this in Multipipe#read, or by adding a new runner that uses the subprocess gem. Thoughts?

mmangino commented 7 years ago

Did you ever create this runner? If not, I'm going to give it a shot. We're having the same issue.

joshuapinter commented 7 years ago

Did this ever get fixed? Running into the same problem transcoding medium sized files (26mbs+).

Thanks!

joshuapinter commented 7 years ago

FYI, we tried @mmangino's branch and it works great in production for larger files. No more deadlocks. You can see more of it here: https://github.com/thoughtbot/cocaine/pull/97

jyurek commented 6 years ago

The PR that fixes this has been moved to https://github.com/thoughtbot/terrapin/pull/5. I'll close this and any necessary discussion on this point can be found there. Thanks!