Stable / reproducible output with automatic thread scaling

tasket commented 3 years ago

I'm adding zstd support to Wyng backup using your module. I would prefer not to specify the number of threads when calling compress(), so that compression can scale with the number of cores (and so I can avoid wrapping compress in a lambda). However, that leaves open the question of whether python-zstd will then sometimes use the zstandard single-threaded mode on single-core CPUs.

My technical requirement is to compress data chunks in a reproducible way to enable deduplication during the backup process. This means always using the zstandard multi-threaded mode, even on single-core CPUs. Automatic switching between the single-threaded and multi-threaded code is what I need to avoid.

I read through the Readme and issue #48 looking for indications of the module's behavior under these conditions, but didn't find any. What I'm looking for is guidance on exactly when python-zstd uses single-threaded mode, if at all, so I can avoid it.

sergey-dryabzhinsky commented 3 years ago

compress() uses all cores if the number of threads is not specified.
python-zstd always compiles and uses libzstd in multithreaded mode.
Any switching to single-threaded mode is up to libzstd.
You can set any number of threads, even on a single-core CPU.

tasket commented 3 years ago

> you can set any number of threads even on single-core CPU.

This is an important point. Thank you!
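
For anyone landing here with the same requirement, the advice above can be sketched roughly as follows. This is a minimal sketch, not something taken from Wyng or from python-zstd itself: the helper name fixed_thread_compressor, the default level of 3 and the default of 2 threads are all hypothetical choices made for illustration. It assumes compress() takes its arguments positionally as (data, level, threads) and that a thread count of 0 means "auto-detect", which is my reading of the README. Whether the resulting bytes are identical across machines and libzstd versions is still up to libzstd, as noted above.

```python
def fixed_thread_compressor(compress_fn, level=3, threads=2):
    """Return a one-argument compressor with level and thread count pinned.

    compress_fn is expected to accept (data, level, threads) positionally,
    which is assumed here to match python-zstd's zstd.compress.
    The name and defaults of this helper are hypothetical.
    """
    if threads < 1:
        # 0 is assumed to mean "auto-detect cores", which is exactly the
        # automatic scaling this wrapper is meant to rule out.
        raise ValueError("threads must be >= 1 to pin the thread count")

    def compress_chunk(data):
        # Always pass the same explicit thread count, whatever the CPU is.
        return compress_fn(data, level, threads)

    return compress_chunk


def demo_with_python_zstd():
    """Usage example; requires the python-zstd package to be installed."""
    import zstd

    compress_chunk = fixed_thread_compressor(zstd.compress, level=3, threads=2)
    chunk = b"example backup chunk " * 1000
    compressed = compress_chunk(chunk)
    # Round-trip check: the pinned compressor still produces valid zstd data.
    assert zstd.decompress(compressed) == chunk
    return compressed
```

A named wrapper like this also avoids passing a lambda around: the backup code can hold compress_chunk and call it with just the data.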

sergey-dryabzhinsky / python-zstd, issue #68