babashka / babashka.curl

A This library is mostly replaced by https://github.com/babashka/http-client
Eclipse Public License 1.0
121 stars 9 forks source link

Capture standard error from curl shell command #16

Closed jaydeesimon closed 4 years ago

jaydeesimon commented 4 years ago

I'm interested in capturing the standard error and doing something with it. For example, when setting a timeout {:raw-args ["--max-time" "30"]}. Full example:

user=> (curl/get "https://nytimes.com" {:raw-args ["--max-time" "0.01"]})
curl: (7) Failed to connect to nytimes.com port 443: Operation timed out
{:status nil, :headers {}, :body "", :process #object[java.lang.UNIXProcess 0x3ed14362 "java.lang.UNIXProcess@107e169f8"]}
user=> (-> *1 :process (.getErrorStream) slurp)
""

It looks like the main reason it's empty is because the process's standard error destination is being set to ProcessBuilder$Redirect/INHERIT here and therefore not present in the error input stream.

The lowest hang fruit thing I can think of is to remove that line so that the default destination is ProcessBuilder$Redirect/PIPE but maybe there's a reason it's set to ProcessBuilder$Redirect/INHERIT?

Beyond that, there's probably better ways to get at the fact that something went wrong and curl returned an error besides manually pulling it from the error input stream (for example, adding it to the result map with a curl error key or throw an exception with details in an ExceptionInfo object).

Let me know what you think. I'd be happy to work on a PR.

-Jeff

borkdude commented 4 years ago

Hi Jeff,

I think it makes sense to do something similar to clj-http and throw when the status code is somewhere in the 400-500 status code area. We could capture the error stream as a string and send the error string with the exception message. When we throw an ex-data we can also send along some additional key/values in a map. People could choose to opt out of throwing exceptions and then the error stream could be captured in a key :error when the error stream was non-empty?

Right now the :process is returned, which is undocumented, but I thought this could be useful to obtain e.g. the curl exit status code:

user=> (def p (:process (c/get "https://oopstypo.org")))
curl: (6) Could not resolve host: oopstypo.org
#'user/p
user=> (.waitFor p)
6

A PR on this would be welcome!

jaydeesimon commented 4 years ago

Sounds good. I will read up on clj-http’s behavior with exceptions. For example, a timeout will always throw an exception but a 400-500 status code may or may not depending on {:throw-exceptions? false}.

For my immediate purposes, thanks for bringing up the exit status code. I can use that to capture whether a timeout occurred and then do something about it.

jaydeesimon commented 4 years ago

Hi @borkdude. Here's a first-pass at a PR that captures the standard error and adds a key to the result map if there is an error from the curl shell command (non-zero exit status). I wasn't sure about how it should behave with respect to exceptions. For example, clj-http allows you to opt out of throwing exceptions {:throw-exceptions? false} but this is specific to http status codes.

An exception could be thrown by default if there is a non-zero exit code but that feels like a different concern than throwing an exception for an exceptional http status code. Maybe a different option makes sense like {:throw-curl-exceptions? false}?

borkdude commented 4 years ago
jaydeesimon commented 4 years ago

The above makes sense. I can work on a PR with the above when I get another round of free time.

If it throws an exception when the http status code is not "successful" by default and/or throws an exception when the error code is non-zero, that would make the behavior backwards incompatible. Is that ok?

borkdude commented 4 years ago

@jaydeesimon Since this library is fairly new and there is a warning about potential breaking changes in the README, I think that is ok.

A note on style: I'd like to avoid keywords with question marks in them. So I think :throw would be a good keyword, instead of :throw?. This also allows use to extend this setting later on to something like: {:throw {exclude [404]}} or {:throw {404 false}} or something. See https://github.com/bbatsov/clojure-style-guide/issues/182#issuecomment-516969043

borkdude commented 4 years ago

Turns out we could throw when streaming, because the status code is known before the stream returns. I've made that change as well.

jaydeesimon commented 4 years ago

Great. Thanks @borkdude !