Closed bhb closed 2 years ago
Thanks, I'll have a look soon. In the newest bb we now also support the JDK 11 Http Client. And otherwise you could shell out to curl directly as a workaround.
@bhb I can't repro this. I added the options :silent false
and :err :inherit
for debugging. With those I get:
user=> (require '[babashka.curl :as curl])
nil
user=> (curl/get "https://httpstat.us/500" {:err :inherit :silent false :raw-args ["--retry" "3"]})
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 25 100 25 0 0 53 0 --:--:-- --:--:-- --:--:-- 53
Warning: Transient problem: HTTP error Will retry in 1 seconds. 3 retries
Warning: left.
100 25 100 25 0 0 110 0 --:--:-- --:--:-- --:--:-- 110
Warning: Transient problem: HTTP error Will retry in 2 seconds. 2 retries
Warning: left.
100 25 100 25 0 0 66 0 --:--:-- --:--:-- --:--:-- 66
Warning: Transient problem: HTTP error Will retry in 4 seconds. 1 retries
Warning: left.
100 25 100 25 0 0 150 0 --:--:-- --:--:-- --:--:-- 150
Execution error (ExceptionInfo) at babashka.curl/request (curl.clj:237).
babashka.curl: status 500
@bhb Could you try the above too to see if the same happens for you, with the recent commit on master?
@borkdude What do you see if you add :as :stream
to the args above?
Could you try the above too to see if the same happens for you, with the recent commit on master?
If I adapt your code above to add :as :stream
, I get the following (apologies if I'm not doing the right thing to use the latest version, I think the following should work)
First, I tried the version packaged in Babashka:
;; bb --version
;; babashka v0.6.1
(require '[babashka.curl :as curl])
(curl/get "https://httpstat.us/500" {:as :stream :err :inherit :silent false :raw-args ["--retry" "3"]})
Then I tried to include the latest babashka.curl
(require '[babashka.deps :as deps])
(deps/add-deps '{:deps {io.github.babashka/babashka.curl {:git/sha "75389c5ae0a084b33529190d654b71e021239fa4"}}})
(require '[babashka.curl :as curl])
(curl/get "https://httpstat.us/500" {:as :stream :raw-args ["--retry" "3"] :debug true})
In both cases, I get a 500 immediately and I don't see the output you posted above.
@bhb I see. I think the problem was that bb.curl throws as soon as it discovers that the status code is 500-ish. Can you try your request with :throw false
? I suspect this will fix your issue.
@borkdude In my tests, using :throw false
will return the map with the status code, but it still happens immediately i.e. without retrying.
To summarize, here are my results
(require '[babashka.deps :as deps])
(deps/add-deps '{:deps {io.github.babashka/babashka.curl {:git/sha "75389c5ae0a084b33529190d654b71e021239fa4"}}})
(require '[babashka.curl :as curl])
;; With no ':as :stream', this will retry 3 times, then throw (as expected)
(curl/get "https://httpstat.us/500" {:err :inherit
:silent false
:raw-args ["--retry" "3"]})
;; With no ':as :stream' and ':throw false' this will retry 3 times, then return map that contains status code (as expected)
(curl/get "https://httpstat.us/500" {:throw false
:err :inherit
:silent false
:raw-args ["--retry" "3"]})
;;; Unexpected behavior below
;; With ':as :stream', this immediately throws a 500 without retry
(curl/get "https://httpstat.us/500" {:as :stream
:err :inherit
:silent false
:raw-args ["--retry" "3"]})
;; With ':as :stream' and ':throw false', this immmediately returns a map with a status code (without retrying)
(curl/get "https://httpstat.us/500" {:as :stream
:throw false
:err :inherit
:silent false
:raw-args ["--retry" "3"]})
@bhb
I would say the but-last expression is unexpected but an edge case: the :status
key is filled in from the first http status code it finds in the headers and when this is 500-ish, it will throw by default. You can suppress this with :throw false
. Let's assume that you'll have to use this for your use case.
Then with the last expression. It returns immediately, because this is what :stream
is for: it won't wait until the process is finished, e.g. for consuming a stream of events.
If you def
your last expression like (def response ...)
and then do (slurp (:body resp))
you'll see this:
"500 Internal Server Error500 Internal Server Error500 Internal Server Error500 Internal Server Error"
which I think indicates that there were three retries.
But perhaps this is not your use case and you just want a blocking request and want the bytes as a result? Could you clarify more context around your use case?
Perhaps an easier way for you to work with this is call curl manually using babashka.process/process
:
(def stream (:out (babashka.process/process "curl https://avatars.githubusercontent.com/u/64927540?s=400&u=644e8997a671220cb729260cfbf678d9cfddaa1b&v=4 --retry 3" {:err :inherit})))
(io/copy stream (io/file "icon.png"))
But let me know if you want a blocking bytes call because this is pretty easy to support with :as :bytes
.
But perhaps this is not your use case and you just want a blocking request and want the bytes as a result? Could you clarify more context around your use case?
Great question - I probably should have started with this 😄 .
I am using babashka.curl
to download a file that has been stored in the cloud. If possible, it would be ideal if I could just read this file directly into memory and process it without needing to write to a separate file and then re-read it.
Unfortunately sometimes I get transient errors like 500s or 503s when I try to download the file.
The two possible outcomes for my script are:
So yes, ideally there would be a blocking call that gets the entire file but will also allow me to pass --retry
to curl so that the call doesn't fail if we get back e.g. a single 503.
I've worked around this by writing my own retry logic on top of babashka.curl
, but it seems like it would be cleaner to just let curl
take care of this.
Perhaps an easier way for you to work with this is call curl manually
That's a good idea, I can definitely do that!
But let me know if you want a blocking bytes call because this is pretty easy to support with
:as :bytes
.
Hm, maybe that's exactly what I need. The file I'm downloading is gzipped json, so I can't do anything meaningful with it until it's completely downloaded. I must have missed this option previously. My mistake!
@bhb That option isn't yet there, it was a proposal. Will respond on the rest soon.
@bhb OK, after reading your feedback, I think we can just support :bytes
which is a blocking call to get the bytes from your downloaded file with support for retries.
@borkdude That sounds great. Thanks so much!
I implemented the :as :bytes
option now.