dakrone / clj-http

An idiomatic clojure http client wrapping the apache client. Officially supported version.
http://clojars.org/clj-http
MIT License
1.77k stars 408 forks source link

Calling `.close` on response http stream will fully read stream contents rather than terminating. #627

Open phronmophobic opened 1 year ago

phronmophobic commented 1 year ago

Update: See next comment.


I created a minimal repro at https://github.com/phronmophobic/close-hang-repro.

The reproduction is fairly straightforward:

(require '[clj-http.client :as http])
(defn -main [& args]
  (let [url "https://github.com/uscensusbureau/citysdk/archive/bc93425d47508741c8fa8131ac09a53372e1e088.zip"
        response (http/request {:url url
                                :method :get
                                :as :stream})]
    (prn (:body response))
    ;; will hang indefinitely 
    (.close (:body response))))

It doesn't happen for every url, but this url will consistently reproduce the hang.

phronmophobic commented 1 year ago

I dug into this a bit more. It seems that some requests will have their input stream wrapped with a ContentLengthInputStream.

Per their docs:

Note that this class NEVER closes the underlying stream, even when close gets called. Instead, it will read until the "end" of its limit on close, which allows for the seamless execution of subsequent HTTP 1.1 requests, while not requiring the client to remember to read the entire contents of the response.

What was happening is that calling .close on the inputstream would cause the ContentLengthInputStream to read the rest of the input (aka. downloading the rest of the input). The problem is that the example url is very large and takes quite a while to download. Testing with smaller downloads has shown that the .close will eventually complete after reading the full result.

It seems closing the body inputstream should terminate the connection rather than fully consuming the stream. Is this desired behavior?

A workaround is to call close on the http client before closing the stream, (.close (:http-client response)).

phronmophobic commented 1 year ago

Here's the workaround I ended up with so that closing the streams closes the connection.

(let [body
      (proxy [FilterInputStream]
          [^InputStream (:body response)]
          (close []
            (.close (:http-client response))))]
  (partially-consume body)
  (.close body))