Add a capture_output_for_err_msg option to check_call and check_output

russelldavis commented 8 years ago

Motivation When trying to debug why a NonZeroExit occurred, the exit code by itself is usually not much help. We end up reinventing this logic in a bunch of different places (not always as robustly as this, and without tests), and not using it in other places where should (which we can now do easily by adding a single arg).

r? @metcalf

metcalf commented 8 years ago

@zenazn just spent awhile helping me understand some edge cases this introduces. Essentially this becomes an invalid invocation, even though it looks reasonable:

Subprocess.check_call(capture_output_for_err_msg: true, stdin: Subprocess::PIPE) do |p|
    p.communicate(File.open('some_file'))
end

I agree with him that supporting this in check_call would add a sharp edge to this library. On the other hand, we could support it only in check_output with pretty minimal changes. You'd just add a capture_output_for_err_msg parameter to check_output and add a stdout and stderr optional arg to NonZeroExit.

All of check_call_with_reading would go away in favor of some relatively light changes to NonZeroExit#to_s or something like that. I'd also drop the support for using stdout in the exception when stderr doesn't exist. Depending on the command this could result in totally insane output in the exception message. Simply exposing a stdout attr reader on NonZeroExit would let people access it only if their process is badly behaved and writes errors to stdout.

This basically means you'd be encouraged to always call check_output instead of check_call, but I don't think that's a problem in practice.

russelldavis commented 8 years ago

Essentially this becomes an invalid invocation, even though it looks reasonable:

The equivalent code, replacing check_call with check_output is already equally broken, and we seem to be ok with that. (Not to mention that check_call is already a sharp edge -- we allow you to do Subprocess.check_call(stdout: Subprocess::PIPE), even though that will cause the child to block forever once the pipe fills up.) If you're adding an argument that says "capture output" then it shouldn't be too surprising that things fail if you try to separately capture the output yourself.

I'd also drop the support for using stdout in the exception when stderr doesn't exist. Depending on the command this could result in totally insane output in the exception message.

A lot of the value here is that is provides useful output for unexpected edge cases, and doesn't require special code at each call site. I guarantee most callers aren't going to anticipate that edge case. The output is clearly labeled, so the only problem I can imagine is too much chunder -- I can add truncation to deal with that (I meant to do that earlier).

russelldavis commented 8 years ago

I added truncation and moved the error message logic to NonZeroExit.

r? @metcalf

metcalf commented 8 years ago

I think we should probably IRL about this on Monday and possibly include @zenazn. I'm pretty sure he considers adding this sharp edge to check_call to be a non-starter. I think it's reasonable that we should remove some of those other sharp edges (throw exceptions when you use bad combinations of options).

russelldavis commented 8 years ago

@zenazn mind responding to my comments?

zenazn commented 8 years ago

we allow you to do Subprocess.check_call(stdout: Subprocess::PIPE), even though that will cause the child to block forever once the pipe fills up

If you use check_call, it's your responsibility to use e.g., communicate (or to otherwise solve this problem).

If you're adding an argument that says "capture output" then it shouldn't be too surprising that things fail if you try to separately capture the output yourself.

One thing that's different here is that the argument says "capture output", but you're trying to "write input". It's not obvious why you can't do both.

zenazn commented 8 years ago

provides useful output for unexpected edge cases

This is great! But it introduces other sharp edges elsewhere in the library. I'd prefer this library was minimally surprising, as opposed to maximally useful in 80% of cases and very alarming / requiring workarounds in the remaining 20%.

russelldavis commented 8 years ago

@zenazn per my previous comment, please explain how this is any different than the existing sharp edge that happens if you do this:

Subprocess.check_output(capture_output_for_err_msg: true, stdin: Subprocess::PIPE) do |p|
    p.communicate(File.open('some_file'))
end

If you use check_call, it's your responsibility to use e.g., communicate (or to otherwise solve this problem).

I can equally say that if you use capture_output_for_err_msg it is your responsibility to not call p.communicate (just like it already is if you use check_output per my example above)

russelldavis commented 8 years ago

but you're trying to "write input"

ITYM you're trying to "communicate", which involves both writing input and reading output.

zenazn commented 8 years ago

I'm specifically concerned about check_call, not check_output.

But yes, there currently exists a sharp edge in check_output, where you cannot call communicate in the passed block. If I were writing this API again, I would omit the &blk parameter to check_output.

ITYM you're trying to "communicate"

No, I specifically mean trying to write input. Under the hood, it turns out the only safe way to do this is to communicate, but all the application is trying to do is write input, and it's the spooky-action-at-a-distance output-reading that the library is doing that causes the sharp edge.

russelldavis commented 8 years ago

I'm specifically concerned about check_call, not check_output

I'm concerned that this is a double standard.

it's the spooky-action-at-a-distance output-reading that the library is doing that causes the sharp edge

Really? How is it spooky that a param that says "capture output" actually reads the output? Is it also spooky that check_output does the same?

No, I specifically mean trying to write input

The point is that the method is called communicate, so it shouldn't be a total surprise that it reads output which could conflict with the option that says "capture output" (again, just like check_output).

russelldavis commented 8 years ago

@zenazn would you be ok with this if I limit the option to check_output? I do think there's a place for it in check_call, but it's a minor issue in the grand scheme -- happy to drop it.

zenazn commented 8 years ago

(omg i keep accidentally typing ctrl-enter)

I'm concerned that this is a double standard.

Sort of, yeah! But one of them is a mistake that has been part of Subprocess's public API for four years, and the other is a new proposal. I of course regret the error.

would you be ok with this if I limit the option to check_output?

Yeah, that's totally fine. One change you could very easily make (~5 LOC) that I'd be very happy with (I think I told Andrew this, but it probably didn't make it to you) is for check_output to additionally PIPE and capture stderr, and to add stdout and stderr fields to the thrown NonZeroExit instead of discarding them completely. In the check_call case, stderr and stdout would be nil (although an abstraction wrapping Subprocess could conceivably set those fields)

russelldavis commented 8 years ago

Yeah, Andrew already mentioned that above, but it leaves out the entire purpose of this, which is to add the output to the error message. Anyway, I don't have time to keep going back and forth on this, so I'm going to close this and land it in our other repo.

zenazn commented 8 years ago

Err, how do you mean? NonZeroExit could print its stdout or stderr if they're non-nil, if that's what you are looking for?

stripe / subprocess

Add a capture_output_for_err_msg option to check_call and check_output #17