Rogdham / pyzstd

Python bindings to the Zstandard (zstd) compression library; the API style is similar to Python's bz2/lzma/zlib modules.
https://pyzstd.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

weird multithreading performance #15

Open ThomasWaldmann opened 1 week ago

ThomasWaldmann commented 1 week ago

I ran a practical experiment there and got weird results:

https://github.com/borgbackup/borg/issues/8217#issuecomment-2170637689

Rogdham commented 1 week ago

Hello, I took a look at your script, and the following caught my eye:

  1. The data you generate is random not only in content, but also in size. I think the results would be more reliable if you generated the data only once, stored it in a global variable, and used the very same data in each case.
  2. The data to be compressed is very small: between 0.5 and 4 MiB.
  3. The jobSize is very small as well: 512 KiB.

Have you tried running the same test with bigger data sizes?
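
For illustration, here is a minimal sketch of what such a test could look like; it is not the script from the linked comment. It generates one fixed payload up front and reuses it for every worker count. The helper names (`make_payload`, `best_time`), the 64 MiB payload size, the compression level 3 and the list of worker counts are all assumptions picked for the example; `jobSize` is left at its default.

```python
import random
import time


def make_payload(size, seed=0):
    """Return `size` deterministic pseudo-random bytes (needs Python 3.9+)."""
    return random.Random(seed).randbytes(size)


def best_time(compress, data, repeats=3):
    """Return the fastest wall-clock time, in seconds, of compress(data)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        compress(data)
        timings.append(time.perf_counter() - start)
    return min(timings)


def main():
    try:
        import pyzstd
    except ImportError:
        print("pyzstd is not installed; skipping the benchmark")
        return

    # Generated once and reused, so every run compresses identical input.
    # 64 MiB is an assumed size, chosen only to be well above 4 MiB.
    data = make_payload(64 * 1024 * 1024)

    # nbWorkers=0 is single-threaded mode; values >= 1 use worker threads.
    for workers in (0, 1, 2, 4):
        option = {
            pyzstd.CParameter.compressionLevel: 3,  # assumed level
            pyzstd.CParameter.nbWorkers: workers,
        }
        seconds = best_time(lambda d: pyzstd.compress(d, option), data)
        rate = len(data) / seconds / 2**20
        print(f"nbWorkers={workers}: {seconds:.3f} s ({rate:.0f} MiB/s)")


if __name__ == "__main__":
    main()
```

Taking the best of several repeats is one way to reduce noise from other processes; the random payload is essentially incompressible, so the figures may differ from what real backup data would give.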

Another thing that may be worth investigating is running the zstd command directly, to see whether the figures you get are inherent to Zstandard or specific to the pyzstd library.