Closed Phantop closed 1 year ago
Can you quantify the values you use for -s and -S?
The way threading works is left for libzstd to decide. -T specifies a maximum number of threads, and libzstd will spawn multiple workers only if the block size is big enough that it makes sense to do so.
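As a rough illustration of the behaviour described above, the sketch below models -T as an upper bound and assumes each block is cut into fixed-size jobs, with at most one worker busy per job. The function name `estimate_workers`, the `job_size` parameter, and the 32 MiB figure are assumptions made for this example only; libzstd chooses its real job size internally, and it may depend on the compression level and library version.

```python
import math

MIB = 1024 * 1024


def estimate_workers(block_size: int, job_size: int, max_threads: int) -> int:
    """Rough upper bound on the number of busy workers for one block.

    Simplified model, not libzstd's actual scheduling code: the block is
    assumed to be split into jobs of `job_size` bytes, and each job keeps
    at most one worker busy. `max_threads` plays the role of -T.
    """
    if block_size <= 0 or job_size <= 0 or max_threads <= 0:
        raise ValueError("all arguments must be positive")
    jobs = math.ceil(block_size / job_size)
    # -T only caps the count; a small block cannot fill all threads.
    return min(max_threads, jobs)


if __name__ == "__main__":
    # 32 MiB is a hypothetical job size chosen for illustration.
    for block_mib in (16, 64, 1024):
        n = estimate_workers(block_mib * MIB, job_size=32 * MIB, max_threads=8)
        print(f"{block_mib} MiB block -> at most {n} worker(s)")
```

Under this model, raising -T has no effect once the block yields fewer jobs than threads, which would be consistent with small blocks compressing on a single thread.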
The highest I have used for both is 64 MB at level 22, and the lowest has been 16 MB. I often make use of dwarfs and there's a very significant difference in compression time between both due to the unthreaded compression.
After some more experimentation, I have found that t2sz uses two threads if the block size is set to 1GB. Is there any way to force it to use the number of worker threads specified?
That's strange, I get roughly 1 thread every 100MiB. I don't think I can force libzstd to use a specific number of threads but I wonder why there is such a difference.
What happens if you use the official zstd with -T on a file 1GB in size? How many workers are spawned?
I see I have libzstd 1.5.0. I will upgrade tomorrow and test again; maybe they changed something.
What happens if you use the official zstd with -T on a file 1GB in size? How many workers are spawned?
An 800MB file shows two threads being used in htop. A 1.2GB file shows three. This is without setting a block size. Setting a block size results in 4 threads for both.
I often set the thread count much higher than my core count when using the official zstd command-line utility to work around that.
I checked again with 1.5.2 and it behaves the same as zstd (tested on Manjaro). As this is the intended behaviour, I'm closing this, but I'm willing to discuss possible improvements on this side.
I do know of mcmilk/zstdmt as another multithreading implementation that seems to be better than upstream for that purpose overall, being faster with smaller output and memory usage. Looking at its code I'm not sure if it would expose everything needed to create the proper seekable files, but it may provide a path to improvement.
I didn't know about this project. I will take a look into it, thanks!
Using the latest git commit, the -T option simply doesn't appear to work. t2sz will use only a single thread according to htop no matter how high I set it. This happens with -s and -S set to both high and low values, as well as without them set. I am on Solus using zstd 1.5.2.