babashka / babashka.curl

A This library is mostly replaced by https://github.com/babashka/http-client
Eclipse Public License 1.0
121 stars 9 forks source link

Resource leak #17

Closed lukaszkorecki closed 4 years ago

lukaszkorecki commented 4 years ago

I wrote a small daemon which monitor's ECS metadata endpoint and reports basic Docker stats via Statsd protocol. So far soo good! It runs on a 10s interval - so every 10s it will use babashka.curl to poll the endpoint. I've noticed this morning that at some point we stopped receiving metrics from our services. I've looked into the logs and this line popped up:

:message Cannot run program "curl": error=24, Too many open files"

I looked at the source and it looks like despite calling .delete on the temp file, file handles are not freed. I wrote a simple babashka script which polls a local web server in a very tight loop (1s). Then checked the file handles opened by the process:


# - ubuntu: lsof -p 4160 
# ...removed garbage....
bb      4160 ubuntu    5r   REG    8,1      477   4657 /var/tmp/babashka.curl2791103685985735359.headers (deleted)
bb      4160 ubuntu    6r   REG    8,1      477   6180 /var/tmp/babashka.curl4321920526964170231.headers (deleted)
bb      4160 ubuntu    7r   REG    8,1      477  10101 /var/tmp/babashka.curl4676676831291833005.headers (deleted)
bb      4160 ubuntu    8r   REG    8,1      477  12710 /var/tmp/babashka.curl9129177778621420130.headers (deleted)
bb      4160 ubuntu    9r   REG    8,1      477  12713 /var/tmp/babashka.curl1755931921803218887.headers (deleted)
bb      4160 ubuntu   10r   REG    8,1      477  12715 /var/tmp/babashka.curl2374590227721063813.headers (deleted)
bb      4160 ubuntu   11r   REG    8,1      477  12716 /var/tmp/babashka.curl7555690216220885691.headers (deleted)
bb      4160 ubuntu   12r   REG    8,1      477  12717 /var/tmp/babashka.curl5783864275205825599.headers (deleted)
bb      4160 ubuntu   13r   REG    8,1      477  12718 /var/tmp/babashka.curl3391626115413234840.headers (deleted)
bb      4160 ubuntu   14r   REG    8,1      477  12719 /var/tmp/babashka.curl3294222859950290307.headers (deleted)
bb      4160 ubuntu   15r   REG    8,1      477  12720 /var/tmp/babashka.curl1858309968829563681.headers (deleted)
bb      4160 ubuntu   16r   REG    8,1      477  12721 /var/tmp/babashka.curl2708258064880582016.headers (deleted)
bb      4160 ubuntu   17r   REG    8,1      477  12722 /var/tmp/babashka.curl4878472749755100471.headers (deleted)
bb      4160 ubuntu   18r   REG    8,1      477  12723 /var/tmp/babashka.curl3797198481882795669.headers (deleted)
bb      4160 ubuntu   19r   REG    8,1      477  12724 /var/tmp/babashka.curl5798938650515700686.headers (deleted)
bb      4160 ubuntu   20r   REG    8,1      477  12725 /var/tmp/babashka.curl8926284960367711330.headers (deleted)
bb      4160 ubuntu   21r   REG    8,1      477  12726 /var/tmp/babashka.curl2296622041669157621.headers (deleted)
bb      4160 ubuntu   22r   REG    8,1      477  12727 /var/tmp/babashka.curl3712698982776823011.headers (deleted)
bb      4160 ubuntu   23r   REG    8,1      477  12728 /var/tmp/babashka.curl2307481706378348242.headers (deleted)
bb      4160 ubuntu   24r   REG    8,1      477  13300 /var/tmp/babashka.curl1480999808202088278.headers (deleted)
bb      4160 ubuntu   25r   REG    8,1      477  15667 /var/tmp/babashka.curl6417597933549367042.headers (deleted)
bb      4160 ubuntu   26r   REG    8,1      477  15668 /var/tmp/babashka.curl961934724018028790.headers (deleted)
bb      4160 ubuntu   27r   REG    8,1      477  15670 /var/tmp/babashka.curl809665755915233747.headers (deleted)
bb      4160 ubuntu   28r   REG    8,1      477  15671 /var/tmp/babashka.curl241764706011866924.headers (deleted)
bb      4160 ubuntu   29r   REG    8,1      477  15672 /var/tmp/babashka.curl3198063764130376943.headers (deleted)
bb      4160 ubuntu   30r   REG    8,1      477  15673 /var/tmp/babashka.curl2395939605882077064.headers (deleted)
bb      4160 ubuntu   31r   REG    8,1      477  15674 /var/tmp/babashka.curl526193298524954950.headers (deleted)

#....

#   - ubuntu : lsof -p 4160 2>/dev/null | grep headers | wc -l
510

Of course on exit all file handles are freed - not sure what the fix is, as the Java doc for File doesn't indicate any other way of freeing the handles.

borkdude commented 4 years ago

@lukaszkorecki Maybe parsing headers via a temporary file isn't the best idea ever. This was done because when parsing the headers from the same stream as the response with the --include option, there is no reliable way to detect when the headers end and the response body starts. Could it be that we have to call .destroy on the curl process to free up these resources?

borkdude commented 4 years ago

I can reproduce the problem locally with this script:

(require '[babashka.curl :as curl])

(while true
  (Thread/sleep 1000)
  (let [resp (curl/get "https://www.clojure.org")]
    (:body resp)
    (.destroy (:process resp))) ;; this is not helping
)

Note that this script doesn't exhibit the problem:

(while true
  (Thread/sleep 1000)
  (let [f (java.io.File/createTempFile "babashka.curl" ".headers")]
    (spit f "foo")
    (.delete f)))
lukaszkorecki commented 4 years ago

Yeah, the process included in the response map was my first hint - and I did try the .destroy method on it and it didn't help. I wonder if it needs to be called before deleting the tempfile?

borkdude commented 4 years ago

@lukaszkorecki I'll do some testing...

Meanwhile you can poll your endpoint using something like this:

(clojure.java.shell/sh "curl" "https://www.clojure.org")))

or use HttpURLConnection as follows:

https://github.com/borkdude/deps.clj/blob/64589845e014191b33d97cb1983dca504ba1dff2/src/borkdude/deps.clj#L188.

The library https://github.com/borkdude/clj-http-lite is based around HttpURLConnection and this library can also be used with babashka.

borkdude commented 4 years ago

So far I've found that running with the JVM (instead of GraalVM) has the same behavior so at least it's not a GraalVM problem.

borkdude commented 4 years ago

@lukaszkorecki Found the culprit!

(defn- read-headers [^File header-file]
  (line-seq (io/reader header-file)))

This leaves the file handle open. Will fix ASAP.

lukaszkorecki commented 4 years ago

@borkdude oh, nice! Thank you 🎉

borkdude commented 4 years ago

@lukaszkorecki Will publish a new version of bb. Thanks for finding this.

lukaszkorecki commented 4 years ago

@borkdude Thank you for creating BB - finally it feels like we have a viable Clojure for small tasks :-)

borkdude commented 4 years ago

Thank you. Released: https://github.com/borkdude/babashka/releases/tag/v0.0.86