coolbutuseless / zstdlite

Fast, configurable in-memory compression of R objects with zstd

Use streaming compression #4

Closed: coolbutuseless closed this issue 7 months ago

coolbutuseless commented 7 months ago

To avoid creating a full serialized buffer and then compressing it, switch to streaming compression using ZSTD_compressStream2().

This would change the compression flow to:

This may offer some speedup over the current waterfall process of (1) full serialization, then (2) compressing everything.
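The difference between the two flows can be sketched roughly as below. This is only an illustration, not the zstdlite implementation: it uses Python's `pickle` and `zlib` as stand-ins for R serialization and zstd, so `ZSTD_compressStream2()` itself is not called. The names `CompressingWriter`, `compress_waterfall` and `compress_streaming` are hypothetical.

```python
import io
import pickle
import zlib


class CompressingWriter:
    """Hypothetical file-like sink: compresses each chunk as it is written."""

    def __init__(self, level=6):
        # zlib's streaming compressor stands in for a zstd compression context.
        self._compressor = zlib.compressobj(level)
        self._out = io.BytesIO()

    def write(self, chunk):
        # Feed the serialized chunk straight into the compressor, so the
        # full serialized buffer is never held in memory at once.
        self._out.write(self._compressor.compress(chunk))
        return len(chunk)

    def finish(self):
        # Flush whatever the compressor is still buffering and end the stream.
        self._out.write(self._compressor.flush())
        return self._out.getvalue()


def compress_waterfall(obj, level=6):
    """(1) Serialize everything, then (2) compress the whole buffer."""
    serialized = pickle.dumps(obj)
    return zlib.compress(serialized, level)


def compress_streaming(obj, level=6):
    """Serialize directly into the compressor, piece by piece."""
    sink = CompressingWriter(level)
    pickle.dump(obj, sink)  # the pickler calls sink.write() as it goes
    return sink.finish()
```

Both functions produce a stream that round-trips through `pickle.loads(zlib.decompress(...))`. The streaming version just avoids materializing the intermediate serialized buffer, which is the saving this issue is after.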

coolbutuseless commented 7 months ago

See the stream branch. All the scaffolding is in place, and streaming compression works.

Initial timing showed that it may be fractionally faster than non-streaming mode, but only when serializing R objects larger than 20 MB. For smaller objects it seems marginally slower.

Given the small speedup, I don't think this is worth the extra complexity in the implementation.

Need to re-evaluate with multithreading enabled.

coolbutuseless commented 7 months ago

With multithreading enabled (using 4 threads), I see a ~2x speedup for moderately sized serializations.

Included in https://github.com/coolbutuseless/zstdlite/commit/295af4841c3ac8eaab2801737f537f64bee6215d