taoensso / nippy

The fastest serialization library for Clojure
https://www.taoensso.com/nippy
Eclipse Public License 1.0

Question: GC overhead limit exceeded handling long list of unique maps #117

Closed · jdf-id-au closed this issue 3 years ago

jdf-id-au commented 5 years ago

Hi, thanks for your amazing open source contributions.

I'm trying to use:

(fn [x]
  ;; file: a java.io.File or path string
  (with-open [out (FileOutputStream. file)]
    (nippy/freeze-to-out! (DataOutputStream. out) x)))

I'm encountering an OutOfMemoryError ("GC overhead limit exceeded") in my REPL when x is a lazy seq of about eight million maps with ~20-100 simple entries each. The maps are built by reading disk files (a few gigabytes), and I'm able to evaluate (last (make-maps from-path)) without trouble, which makes me think the problem is coming from nippy rather than from my reading code.

Can nippy serialise data which is too big to fit in memory all at once? Could it be something to do with the value caching? I note your https://github.com/ptaoussanis/nippy/issues/81#issuecomment-204273891 and reviewed https://github.com/ptaoussanis/nippy/issues/52 and https://github.com/ptaoussanis/nippy/issues/105 .

ptaoussanis commented 5 years ago

Hi Jeremy,

Shouldn't be a caching problem (there's no automatic caching). Didn't look at this closely, but suspect the trouble you're running into is here, via here.

Nippy's probably trying to realize your entire lazy seq and convert it to a single byte array, which is why it blows the heap even though lazily walking the seq is fine.

Some alternatives might include:

Hope that helps a little, sorry for the long delay replying!
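One way around realizing the whole seq at once is to freeze each element to the stream individually, so only one map needs to be in memory at a time. A minimal sketch, assuming nippy's freeze-to-out!/thaw-from-in! API; the function names, the end-of-stream handling, and the assumption that thaw-from-in! throws when the stream is exhausted are all illustrative, not from this thread:

```clojure
(require '[taoensso.nippy :as nippy])
(import '(java.io DataOutputStream DataInputStream
                  FileOutputStream FileInputStream))

(defn freeze-seq-to-file!
  "Freezes each element of xs to file in turn, so the lazy seq is
  consumed incrementally instead of being realized as one value."
  [file xs]
  (with-open [out (DataOutputStream. (FileOutputStream. file))]
    (doseq [x xs]
      (nippy/freeze-to-out! out x))))

(defn thaw-seq-from-file
  "Reads elements back until the stream is exhausted. Assumes
  thaw-from-in! throws on end-of-stream; a production version would
  want a more precise termination signal, e.g. writing the element
  count first."
  [file]
  (with-open [in (DataInputStream. (FileInputStream. file))]
    (loop [acc []]
      (let [x (try (nippy/thaw-from-in! in)
                   (catch Exception _ ::eof))]
        (if (= x ::eof)
          acc
          (recur (conj acc x)))))))
```

Note that thaw-seq-from-file as sketched accumulates everything into a vector; for data that doesn't fit in memory, the read side would similarly need to process elements one at a time.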

jdf-id-au commented 5 years ago

Thank you.

ptaoussanis commented 3 years ago

Closing due to inactivity