taoensso / nippy

The fastest serialization library for Clojure
https://www.taoensso.com/nippy
Eclipse Public License 1.0
1.04k stars, 60 forks

Cannot `freeze-to-file` and `thaw-from-file` objects that are larger than 2GB #172

Open RokLenarcic opened 2 months ago

RokLenarcic commented 2 months ago

Hello. Because these functions are merely thin wrappers around `freeze` and `thaw`, they interact with a file via a byte array. That limits the size of these operations to the maximum size of a byte array, which is about 2GB. I'd love to have the option to freeze and thaw to DataInput/DataOutput objects or ByteBuffer objects.

Incidentally, ByteBuffer offers an API very similar to that of DataInput and DataOutput, and the benefit there is that you can use a MappedByteBuffer, which is what you get when you memory-map a file.

Obviously, such usage cannot rely on the current encryptor and compressor system, as those need a byte array. But I am sure there's some kind of streaming compression and encryption that could be used instead.
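
For reference, a minimal Java sketch of the memory-mapping idea mentioned above, using only the JDK. The class and method names (`MappedFileSketch`, `roundTrip`) are made up for illustration and are not part of Nippy. Note that a single `FileChannel.map` call is itself capped at `Integer.MAX_VALUE` bytes, so a payload over 2GB would presumably need several mapped regions rather than one.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only; not part of Nippy.
class MappedFileSketch {
    // Writes one long into a memory-mapped region of the file and reads it back.
    static long roundTrip(Path file, long value) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Map the first 8 bytes; in READ_WRITE mode the file is grown if it is shorter.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            buf.putLong(0, value);  // absolute put, same shape as DataOutput.writeLong
            return buf.getLong(0);  // absolute get, same shape as DataInput.readLong
        }
    }
}
```

The `putLong`/`getLong` pair illustrates how closely the ByteBuffer API mirrors `DataOutput.writeLong` and `DataInput.readLong`.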

ptaoussanis commented 2 months ago

@RokLenarcic Hi Rok,

> I'd love to have the option to freeze and thaw to DataInput/DataOutput objects or ByteBuffer objects.

Just checking if you noticed `freeze-to-out!` and `thaw-from-in!`?

> Incidentally, ByteBuffer offers an API very similar to that of DataInput and DataOutput, and the benefit there is that you can use a MappedByteBuffer, which is what you get when you memory-map a file.

I'd be open to a sketch/proof-of-concept PR of some sort if you or someone else felt like proposing a concrete way to support ByteBuffers 👍

RokLenarcic commented 2 months ago

Yeah, I saw those functions, but there's a rather large amount of logic in `freeze` (e.g. header and options parsing/use) that I would have to replicate in my own code to have `freeze-to-out!` work approximately the same.

In general, the direction needed for large objects is a protocol with stream-like functions: write bytes and close/finalize. Encryption, compression, and the writing implementation itself could each implement this protocol. Then you could stack them like we do with streams: encryption wrapping compression wrapping the data/file-writing instance of this protocol.
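
As a rough illustration of that stacking idea, here is a minimal JDK-only sketch in Java. It layers a `DataOutputStream` over gzip compression over a buffered file stream, so no intermediate byte array of the whole payload is ever built. `StackedStreamsSketch` and its methods are hypothetical names, not anything Nippy provides; an encryption layer (e.g. `javax.crypto.CipherOutputStream`) could presumably be slotted into the same chain but is left out to keep the example short.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, for illustration only; not part of Nippy.
class StackedStreamsSketch {
    // Data layer -> compression layer -> file layer; closing the outermost
    // stream finalizes the gzip trailer and flushes everything to disk.
    static void write(Path file, long[] values) throws IOException {
        try (DataOutputStream out = new DataOutputStream(
                new GZIPOutputStream(
                        new BufferedOutputStream(Files.newOutputStream(file))))) {
            out.writeInt(values.length);
            for (long v : values) {
                out.writeLong(v);
            }
        }
    }

    // Mirror image of write: file layer -> decompression layer -> data layer.
    static long[] read(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(
                        new BufferedInputStream(Files.newInputStream(file))))) {
            long[] values = new long[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readLong();
            }
            return values;
        }
    }
}
```

The outermost `DataOutputStream` is a `java.io.DataOutput` and the outermost `DataInputStream` is a `java.io.DataInput`, the interfaces discussed in this thread. The header and options handling that `freeze` performs is not reproduced here.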

ptaoussanis commented 2 months ago

A sketch/proof-of-concept PR is welcome if you or anyone else feels like taking a look at this. Otherwise I can look at it myself for a future version of Nippy.

Please 👍 here if there's interest in adding better streaming support to Nippy.