Open SeanReece opened 3 months ago
I know people love bigger version numbers, but that's not really going to buy us much, if I had to guess.
It's not nothing, but the multithreading change doesn't matter because it's not applicable here.
I know people love bigger version numbers
I have to admit I'm guilty of this far too often. I wouldn't normally have brought it up but they're touting this as a pretty substantial update, so I thought it may be of some benefit to ZFS systems using LZ4.
Like I said, I'm pretty new here so not exactly sure why multi-threading wouldn't be applicable here. My assumption was that ZFS is usually run on systems with many threads so we'd see a good performance uplift here. Is ZFS compression/decompression single threaded? I'm also not exactly sure why LZ4 (and other compression algorithms like ZSTD, etc) are brought into the openzfs code directly instead of as a dependency or using the LZ4 package directly. I haven't done any C development in years so my ignorance knows no bounds here.
Would be interesting to run see some benchmarks though.
Well, let's see.
Multi-threading only matters for multiple distinct decompression units of work, and ZFS treats it all as single-unit compression operations, so it wouldn't help us here.
ZFS also dispatches each as a separate kernel thread, so it's not like we were serializing all decompression calls before.
Userland threading and kernel threading don't necessarily have anything to do with each other in terms of complexity.
And ZFS really really wants Compress(Decompress(Compress(X)))
== Compress(X)
, which is not guaranteed to be true across versions, so it bundles its own copies of them so that that doesn't suddenly change underneath it if the Linux or FreeBSD system has a different version. (That's also why the PR you linked only updates the decompressor code.)
logbias=latency
) on write ZFS will compress data asynchronously on TXG commit, so sync write is not depended on compression speed (but it is on logbias=throughput
)Multithreading is less critical for decompression, as modern nvme drives can still be saturated with a single decompression thread. Nonetheless, the new version enhances performance by overlapping I/O operations with decompression processes.
So, on compression it may not have any gain to use LZ4 multithreading, on decompression we may get some latency speedup, but we'll eat much more CPU and memory bandwidth in this case.
Don't see any low hanging fruits for us here, IMHO.
Thanks for all the details here. I really appreciate it.
Ran a quick test lzbench with 1.9.4 vs r129 (~9 years old). r129 is when they introduced the "new" api so I couldn't quickly swap out an older version in lzbench. I believe OpenZFS is using r85 for compression so I'm not sure what performance changes (if any) were made before r129. I also tested 1.10.0, but got very similar results vs 1.9.4/1.9.3.
Tests run an a Mac M1 Pro. File used is a ~100MB text file.
128K block size: I saw about 18% increase in compression speed and 20% increase in decompression speed. 16K block size: Saw less improvement here. 5% faster compression/ 12% faster decompression
> ./lzbench -b128 -elz4 ./file.json
lzbench 1.8 (64-bit MacOS) (null)
Compressor name Compress. Decompress. Compr. size Ratio Filename
memcpy 47174 MB/s 47179 MB/s 18231014 100.00 ./file.json
lz4 r129 759 MB/s 5250 MB/s 5483206 30.08 ./file.json
done... (cIters=1 dIters=1 cTime=1.0 dTime=2.0 chunkSize=128KB cSpeed=0MB)
./lzbench -b128 -elz4 ./file.json
lzbench 1.8 (64-bit MacOS) (null)
Assembled by P.Skibinski
Compressor name Compress. Decompress. Compr. size Ratio Filename
memcpy 44223 MB/s 46081 MB/s 18231014 100.00 ./file.json
lz4 1.9.3 642 MB/s 4348 MB/s 5483130 30.08 ./file.json
done... (cIters=1 dIters=1 cTime=1.0 dTime=2.0 chunkSize=128KB cSpeed=0MB)
So these are not earth shattering numbers here but I do think there's some performance to be gained by upgrading the compressor from r85. I'm not sure if Compress(Decompress(Compress(X)))
== Compress(X)
across these versions though.
It's not, no. Not even close.
You even knew that from the output of your own tests.
Describe the feature would like to see added to OpenZFS
Just stumbled across this substantial update to LZ4 https://github.com/lz4/lz4/releases/tag/v1.10.0
I'm fairly new here so not sure about the viability of bringing in this update but I see that LZ4 was updated to 1.9.3 in this PR: https://github.com/openzfs/zfs/pull/12805
How will this feature improve OpenZFS?
LZ4 1.10.0: https://github.com/lz4/lz4/releases/tag/v1.10.0
LZ4 1.9.4: https://github.com/lz4/lz4/releases/tag/v1.9.4