cognitect-labs / aws-api

AWS, data driven
Apache License 2.0

Large objects can cause OutOfMemory errors in memory constrained applications #257

Open lread opened 2 months ago

lread commented 2 months ago

Thank you for Cognitect's aws-api! I'm just dipping my toe in, so my apologies if I'm making a newbie mistake or if this issue is already well-known.

Dependencies

deps.edn

{:deps {com.cognitect.aws/api       {:mvn/version "0.8.692"}
        com.cognitect.aws/endpoints {:mvn/version "1.1.12.772"}
        com.cognitect.aws/s3        {:mvn/version "869.2.1687.0"}}}

Description

We are running cljdoc on the cheap. As such, it is memory-constrained. I've just added s3 backup/restore functionality for our SQLite db using aws-api and noticed an OutOfMemory error when using :GetObject and :PutObject.

Reproduction

To make this easy for anyone to run, I'll reproduce with MinIO Object Store via docker. (We are using Exoscale, but that is not relevant.)

To launch a local MinIO server:

docker run -p 9000:9000 -p 9001:9001 --name minio \
  -e "MINIO_ROOT_USER=foouser" \
  -e "MINIO_ROOT_PASSWORD=foosecret" \
  minio/minio server /data --console-address ":9001"

I whipped up a little script to demonstrate the issue (to be used with the deps.edn above), objheap.clj:

(ns objheap
  (:require [clojure.java.io :as io]
            [cognitect.aws.client.api :as aws]
            [cognitect.aws.credentials :as awscreds])
  (:import (java.io RandomAccessFile)))

(defn create-test-file [file-path size-in-mb]
  (let [size-in-bytes (* size-in-mb 1024 1024)]
    (with-open [f (RandomAccessFile. file-path "rw")]
      (.setLength f size-in-bytes))))

(defn -main [& args]
  (println (format "max heap %dmb" (/ (.maxMemory (Runtime/getRuntime)) 1024 1024)))
  (let [opts (apply hash-map args)
        file-mb (parse-long (get opts "file-mb" "512"))
        bucket "foobucket"
        op (get opts "op" "put") ;; put or get
        s3  (aws/client {:api :s3
                         ;; need a valid aws region (even though we are not using aws) to overcome bug
                         ;; https://github.com/cognitect-labs/aws-api/issues/150
                         :region "us-east-2"
                         :credentials-provider (awscreds/basic-credentials-provider
                                                 {:access-key-id "foouser"
                                                  :secret-access-key "foosecret"})
                         :endpoint-override {:protocol :http
                                             :hostname "127.0.0.1"
                                             :port 9000}})]
    (aws/invoke s3
                {:op :CreateBucket
                 :request {:Bucket bucket}})

    (case op
      "put" (do (println (format "put %dmb file" file-mb))
                (create-test-file "bigfile" file-mb)
                (with-open [input-stream (io/input-stream "bigfile")]
                  (aws/invoke s3
                              {:op :PutObject
                               :request {:Bucket bucket
                                         :Key "bigfile"
                                         :Body input-stream}})))
      "get" (do
              (println "get file")
              (let [dest-file (io/file "bigfile.down")]
                (.delete dest-file)
                (-> (aws/invoke s3 {:op :GetObject
                                    :request {:Bucket bucket
                                              :Key "bigfile"}})
                    :Body
                    (io/copy dest-file))
                (println (format "Downloaded file: %.2fmb" (/ (.length dest-file) 1024 1024.0))))))))

(apply -main *command-line-args*)

Sanity runs

(Your max heap will differ)

Let's put a 1mb file:

$ clojure -M objheap.clj op put file-mb 1
max heap 8012mb
put 1mb file

And fetch it:

$ clojure -M objheap.clj op get
max heap 8012mb
get file
Downloaded file: 1.00mb

Ok now let's try to put a 1gb file:

$ clojure -M objheap.clj op put file-mb 1024
max heap 8012mb
put 1024mb file

And fetch it:

$ clojure -M objheap.clj op get
max heap 8012mb
get file
Downloaded file: 1024.00mb

All sane, all good.

Failing runs

Let's start by putting that 1gb file unconstrained (just in case you didn't execute the sanity runs):

$ clojure -M objheap.clj op put file-mb 1024
max heap 8012mb
put 1024mb file

And now let's try fetching the 1gb object constrained to 800mb:

$ clojure -J-Xmx800m -M --report stderr objheap.clj op get
max heap 800mb
get file
2024-09-19 11:26:30.696:INFO:oejc.ResponseNotifier:qtp1060161999-30: Exception while notifying listener org.eclipse.jetty.client.HttpRequest$10@46c28d6e
java.lang.OutOfMemoryError: Java heap space
    at clojure.lang.Numbers.byte_array(Numbers.java:1425)
    at cognitect.http_client$empty_bbuf.invokeStatic(http_client.clj:49)
    at cognitect.http_client$empty_bbuf.invoke(http_client.clj:46)
    at cognitect.http_client$on_headers.invokeStatic(http_client.clj:145)
    at cognitect.http_client$on_headers.invoke(http_client.clj:131)
    at clojure.lang.Atom.swap(Atom.java:51)
    at clojure.core$swap_BANG_.invokeStatic(core.clj:2370)
    at clojure.core$swap_BANG_.invoke(core.clj:2362)
    at cognitect.http_client.Client$fn$reify__12664.onHeaders(http_client.clj:254)
    at org.eclipse.jetty.client.HttpRequest$10.onHeaders(HttpRequest.java:530)
    at org.eclipse.jetty.client.ResponseNotifier.notifyHeaders(ResponseNotifier.java:100)
    at org.eclipse.jetty.client.ResponseNotifier.notifyHeaders(ResponseNotifier.java:92)
    at org.eclipse.jetty.client.HttpReceiver.responseHeaders(HttpReceiver.java:296)
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.headerComplete(HttpReceiverOverHTTP.java:319)
    at org.eclipse.jetty.http.HttpParser.parseFields(HttpParser.java:1247)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1529)
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.parse(HttpReceiverOverHTTP.java:208)
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.process(HttpReceiverOverHTTP.java:148)
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.receive(HttpReceiverOverHTTP.java:80)
    at org.eclipse.jetty.client.http.HttpChannelOverHTTP.receive(HttpChannelOverHTTP.java:131)
    at org.eclipse.jetty.client.http.HttpConnectionOverHTTP.onFillable(HttpConnectionOverHTTP.java:172)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:137)
    at org.eclipse.jetty.io.ManagedSelector$$Lambda/0x000077f4f7a4f448.run(Unknown Source)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
    at java.base/java.lang.Thread.runWith(Thread.java:1588)
{:clojure.main/message
 "Execution error (IllegalArgumentException) at objheap/-main (objheap.clj:49).\nNo method in multimethod 'do-copy' for dispatch value: [nil java.io.File]\n",
 :clojure.main/triage
 {:clojure.error/class java.lang.IllegalArgumentException,
  :clojure.error/line 49,
  :clojure.error/cause
  "No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]",
  :clojure.error/symbol objheap/-main,
  :clojure.error/source "objheap.clj",
  :clojure.error/phase :execution},
 :clojure.main/trace
 {:via
  [{:type clojure.lang.Compiler$CompilerException,
    :message
    "Syntax error macroexpanding at (/home/lee/proj/oss/-verify/aws-api-objects-on-heap/objheap.clj:52:1).",
    :data
    {:clojure.error/phase :execution,
     :clojure.error/line 52,
     :clojure.error/column 1,
     :clojure.error/source
     "/home/lee/proj/oss/-verify/aws-api-objects-on-heap/objheap.clj"},
    :at [clojure.lang.Compiler load "Compiler.java" 8177]}
   {:type java.lang.IllegalArgumentException,
    :message
    "No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]",
    :at [clojure.lang.MultiFn getFn "MultiFn.java" 156]}],
  :trace
  [[clojure.lang.MultiFn getFn "MultiFn.java" 156]
   [clojure.lang.MultiFn invoke "MultiFn.java" 238]
   [clojure.java.io$copy invokeStatic "io.clj" 409]
   [clojure.java.io$copy doInvoke "io.clj" 394]
   [clojure.lang.RestFn invoke "RestFn.java" 428]
   [objheap$_main invokeStatic "objheap.clj" 49]
   [objheap$_main doInvoke "objheap.clj" 12]
   [clojure.lang.RestFn applyTo "RestFn.java" 140]
   [clojure.core$apply invokeStatic "core.clj" 667]
   [clojure.core$apply invoke "core.clj" 662]
   [objheap$eval12575 invokeStatic "objheap.clj" 52]
   [objheap$eval12575 invoke "objheap.clj" 52]
   [clojure.lang.Compiler eval "Compiler.java" 7700]
   [clojure.lang.Compiler load "Compiler.java" 8165]
   [clojure.lang.Compiler loadFile "Compiler.java" 8103]
   [clojure.main$load_script invokeStatic "main.clj" 476]
   [clojure.main$script_opt invokeStatic "main.clj" 536]
   [clojure.main$script_opt invoke "main.clj" 531]
   [clojure.main$main invokeStatic "main.clj" 665]
   [clojure.main$main doInvoke "main.clj" 617]
   [clojure.lang.RestFn applyTo "RestFn.java" 140]
   [clojure.lang.Var applyTo "Var.java" 707]
   [clojure.main main "main.java" 40]],
  :cause
  "No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]",
  :phase :execution}}

Execution error (IllegalArgumentException) at objheap/-main (objheap.clj:49).
No method in multimethod 'do-copy' for dispatch value: [nil java.io.File]

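The 2nd error is caused by the first (OutOfMemory) error: the OutOfMemoryError is thrown on a Jetty client thread while the response is being received, so the result of aws/invoke has no :Body, and io/copy is then called with nil as its input (hence the dispatch value [nil java.io.File]).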
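
To make that chain of failures easier to spot, here is a minimal sketch of a check that could run before the copy. It assumes aws-api's convention of returning an anomaly map keyed by :cognitect.anomalies/category on failure. `body-or-throw` is a hypothetical helper name, not part of the library, and it also covers the case where the response simply has no :Body.

```clojure
;; Hypothetical helper: fail loudly instead of passing nil to io/copy.
(defn body-or-throw
  "Returns the :Body of an aws-api response map. Throws ex-info when the
  response looks like an anomaly map or carries no :Body."
  [response]
  (cond
    ;; Assumption: failed invocations carry this key.
    (contains? response :cognitect.anomalies/category)
    (throw (ex-info "aws-api returned an anomaly" {:response response}))

    (nil? (:Body response))
    (throw (ex-info "response has no :Body" {:response response}))

    :else
    (:Body response)))
```

In the script above, this would sit between `aws/invoke` and `io/copy` in the "get" branch. It does not prevent the OutOfMemoryError, it only makes the second error report the real problem.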

And now let's try putting a 1gb object constrained to 800mb:

$ clojure -J-Xmx800m -M --report stderr objheap.clj op put file-mb 1024
max heap 800mb
put 1024mb file
{:clojure.main/message
 "Execution error (OutOfMemoryError) at java.util.Arrays/copyOf (Arrays.java:3540).\nJava heap space\n",
 :clojure.main/triage
 {:clojure.error/class java.lang.OutOfMemoryError,
  :clojure.error/line 3540,
  :clojure.error/cause "Java heap space",
  :clojure.error/symbol java.util.Arrays/copyOf,
  :clojure.error/source "Arrays.java",
  :clojure.error/phase :execution},
 :clojure.main/trace
 {:via
  [{:type clojure.lang.Compiler$CompilerException,
    :message
    "Syntax error macroexpanding at (/home/lee/proj/oss/-verify/aws-api-objects-on-heap/objheap.clj:52:1).",
    :data
    {:clojure.error/phase :execution,
     :clojure.error/line 52,
     :clojure.error/column 1,
     :clojure.error/source
     "/home/lee/proj/oss/-verify/aws-api-objects-on-heap/objheap.clj"},
    :at [clojure.lang.Compiler load "Compiler.java" 8177]}
   {:type java.lang.OutOfMemoryError,
    :message "Java heap space",
    :at [java.util.Arrays copyOf "Arrays.java" 3540]}],
  :trace
  [[java.util.Arrays copyOf "Arrays.java" 3540]
   [java.io.ByteArrayOutputStream
    ensureCapacity
    "ByteArrayOutputStream.java"
    100]
   [java.io.ByteArrayOutputStream
    write
    "ByteArrayOutputStream.java"
    132]
   [clojure.java.io$fn__11689 invokeStatic "io.clj" 310]
   [clojure.java.io$fn__11689 invoke "io.clj" 305]
   [clojure.lang.MultiFn invoke "MultiFn.java" 239]
   [clojure.java.io$copy invokeStatic "io.clj" 409]
   [clojure.java.io$copy doInvoke "io.clj" 394]
   [clojure.lang.RestFn invoke "RestFn.java" 428]
   [cognitect.aws.util$input_stream__GT_byte_array
    invokeStatic
    "util.clj"
    123]
   [cognitect.aws.util$input_stream__GT_byte_array
    invoke
    "util.clj"
    121]
   [cognitect.aws.util$eval11391$fn__11392 invoke "util.clj" 156]
   [cognitect.aws.util$eval11366$fn__11367$G__11357__11372
    invoke
    "util.clj"
    145]
   [clojure.core$update invokeStatic "core.clj" 6259]
   [clojure.core$update invoke "core.clj" 6251]
   [cognitect.aws.client.impl.Client _invoke_async "impl.clj" 140]
   [cognitect.aws.client.impl.Client _invoke "impl.clj" 123]
   [cognitect.aws.client.api$invoke invokeStatic "api.clj" 130]
   [cognitect.aws.client.api$invoke invoke "api.clj" 111]
   [objheap$_main invokeStatic "objheap.clj" 36]
   [objheap$_main doInvoke "objheap.clj" 12]
   [clojure.lang.RestFn applyTo "RestFn.java" 140]
   [clojure.core$apply invokeStatic "core.clj" 667]
   [clojure.core$apply invoke "core.clj" 662]
   [objheap$eval12575 invokeStatic "objheap.clj" 52]
   [objheap$eval12575 invoke "objheap.clj" 52]
   [clojure.lang.Compiler eval "Compiler.java" 7700]
   [clojure.lang.Compiler load "Compiler.java" 8165]
   [clojure.lang.Compiler loadFile "Compiler.java" 8103]
   [clojure.main$load_script invokeStatic "main.clj" 476]
   [clojure.main$script_opt invokeStatic "main.clj" 536]
   [clojure.main$script_opt invoke "main.clj" 531]],
  :cause "Java heap space",
  :phase :execution}}

Execution error (OutOfMemoryError) at java.util.Arrays/copyOf (Arrays.java:3540).
Java heap space

Observation

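The http client seems to be loading the entire object into memory for GetObject and PutObject. The stack traces above point the same way: for GetObject, cognitect.http-client allocates a byte array in on-headers (via empty-bbuf) when the response headers arrive, and for PutObject, aws-api reads the whole input stream into a byte array (input-stream->byte-array) before sending. When the object is big and memory is limited, this can cause OutOfMemory errors.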
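
For contrast, here is a minimal sketch of what a copy with bounded memory looks like. It is not aws-api's code and cannot be plugged into `aws/invoke`. It only illustrates the difference between collecting a stream into one growing byte array (as the PutObject trace shows with ByteArrayOutputStream) and moving it through a fixed-size buffer. `copy-in-chunks` is a hypothetical name.

```clojure
(import '(java.io InputStream OutputStream))

(defn copy-in-chunks
  "Copies in to out through one reusable buffer of buf-size bytes and returns
  the number of bytes copied. Heap use stays at roughly buf-size no matter
  how large the stream is."
  [^InputStream in ^OutputStream out buf-size]
  (let [buf (byte-array buf-size)]
    (loop [total 0]
      (let [n (.read in buf)]
        (if (neg? n)
          total                          ; end of stream
          (do (.write out buf 0 n)       ; write only the bytes just read
              (recur (+ total n))))))))
```

Copying a 1gb file to another file this way needs only the buffer on the heap, whereas reading it into a byte array needs at least 1gb of contiguous heap, which is why the 800mb runs fail.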

Is there some way I can work around this?
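
One possible direction, sketched here under assumptions and not confirmed anywhere in this thread, is to fetch the object in byte ranges so that only one range is held in memory at a time. The helpers below are hypothetical names. `fetch-range` is supplied by the caller, so the sketch runs without any AWS dependency.

```clojure
(require '[clojure.java.io :as io])

(defn byte-ranges
  "Splits an object of total-size bytes into inclusive [start end] pairs of
  at most chunk-size bytes each."
  [total-size chunk-size]
  (for [start (range 0 total-size chunk-size)]
    [start (dec (min total-size (+ start chunk-size)))]))

(defn range-header
  "Formats an inclusive [start end] pair as an HTTP Range header value."
  [[start end]]
  (str "bytes=" start "-" end))

(defn download-in-ranges
  "Calls (fetch-range header) for each range, in order, and copies each
  returned InputStream to out. fetch-range is a caller-supplied function."
  [fetch-range total-size chunk-size ^java.io.OutputStream out]
  (doseq [r (byte-ranges total-size chunk-size)]
    (with-open [^java.io.InputStream in (fetch-range (range-header r))]
      (io/copy in out))))

;; Assumption, untested here: with aws-api, fetch-range might be
;;   (fn [hdr]
;;     (:Body (aws/invoke s3 {:op :GetObject
;;                            :request {:Bucket bucket :Key "bigfile" :Range hdr}})))
;; and total-size might come from :ContentLength of a :HeadObject response.
```

For example, `(byte-ranges 10 4)` yields the pairs `[0 3]`, `[4 7]` and `[8 9]`. Each range would still be buffered in full by the client, so the chunk size has to fit comfortably within the available heap.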

lowecg commented 1 month ago

I have large S3 object downloads working with the technique described here:

https://github.com/cognitect-labs/aws-api/issues/209#issuecomment-1476571310

lread commented 1 month ago

Thanks @lowecg! I ended up using the AWS SDK v2 for now.