Open iBug opened 2 months ago
Before discussing a change of default, the first step would be to make it optional, so its characteristics can be measured against the others. And any algorithm added to the tree has to stay there forever, so it must really be as good as advertised.
Sorry, I misread the man page. I thought xxhash was already an option. Let me change this FR to adding it in the first place.
I think fletcher4 is a bit faster than the current OpenZFS xxhash variants, so adding it as a new hash doesn't make sense. Which version of xxHash do you have in mind?
Here is a table of hashes and their speeds: https://rurban.github.io/smhasher/doc/table.html
Hash: Speed in MiB/s
Fletcher 4: 15556.93
xxHash64: 12108.87 (included in OpenZFS - zstd)
xxHash32: 5865.17 (included in OpenZFS - zstd)
@mcmilk Your table indicates xxh64 would be a good option. I'd like to reiterate that:
xxHash is sufficiently fast but much less collision-prone than fletcher4
With modern CPUs so powerful, it makes sense to me to trade a bit of performance for much better collision resistance by replacing fletcher4 with xxh64.
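To make the collision argument concrete, here is a minimal Python sketch of the Fletcher-4 arithmetic: four running sums over 32-bit words, wrapping at 64 bits. It is an illustration only, not the OpenZFS implementation; the function name `fletcher4` and the little-endian word order are assumptions made for this example. It shows one structural property of the arithmetic: all-zero inputs of any length produce the same all-zero checksum.

```python
import struct

MASK64 = (1 << 64) - 1  # accumulators wrap at 64 bits


def fletcher4(data: bytes) -> tuple:
    """Fletcher-4 style checksum over little-endian 32-bit words (sketch)."""
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    a = b = c = d = 0
    for (word,) in struct.iter_unpack("<I", data):
        # Each accumulator is a running sum of the previous one.
        a = (a + word) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return a, b, c, d


# All-zero inputs of different lengths yield the same checksum.
zero_short = fletcher4(bytes(512))
zero_long = fletcher4(bytes(4096))

# Two words: 1 followed by 0.
one_word = fletcher4(struct.pack("<I", 1) + bytes(4))

print(zero_short == zero_long)
print(one_word)
```

This only demonstrates how simple the mixing is. Whether it matters in practice depends on what else the filesystem records alongside the checksum, such as the block size.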
Why not something like rapidhash, which has double the speed (23789 MiB/s) in that table¹ and no common problems?
Also, with sse and avx, fletcher-4 is a lot faster on my local notebook:
$ cat /proc/spl/kstat/zfs/fletcher_4_bench
implementation native byteswap
scalar 9112861804 8831049465
superscalar 11681942207 11744320536
superscalar4 13586418453 11444139964
sse2 21310896019 10706136906
ssse3 21171146266 19126012775
avx2 38987296119 35445754442
¹https://rurban.github.io/smhasher/doc/table.html
Edit: I find xxh3 a nice fit:
JFYI, fletcher4 on an AMD Ryzen 7840U with avx512:
0 0 0x01 -1 0 6423074939 87451661526534
implementation native byteswap
scalar 10659121732 8706704548
superscalar 14087467630 11536293324
superscalar4 15814463114 12305581118
sse2 22675386320 10542805362
ssse3 22375429000 20235123389
avx2 39958006169 37408283214
avx512f 42448290424 17524854325
avx512bw 42461612087 37391201332
fastest avx512bw avx2
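The kstat numbers above are raw integers, while the smhasher table is in MiB/s, so the two are hard to compare by eye. The sketch below converts a `fletcher_4_bench` dump to MiB/s. It assumes the kstat values are bytes per second; the helper name `parse_fletcher4_bench` is hypothetical, and the sample holds only two rows copied from the first dump above.

```python
MIB = 1024 * 1024


def parse_fletcher4_bench(text: str) -> dict:
    """Map implementation name -> (native MiB/s, byteswap MiB/s)."""
    rates = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep only "<name> <native> <byteswap>" rows with numeric rates.
        # This skips the kstat header, the column titles and the
        # "fastest" summary row.
        if len(fields) == 3 and fields[1].isdigit() and fields[2].isdigit():
            name, native, byteswap = fields
            rates[name] = (int(native) / MIB, int(byteswap) / MIB)
    return rates


sample = """\
implementation native byteswap
scalar 9112861804 8831049465
avx2 38987296119 35445754442
fastest avx512bw avx2
"""

rates = parse_fletcher4_bench(sample)
for name, (native, byteswap) in rates.items():
    print(f"{name:12s} {native:10.2f} {byteswap:10.2f}")
```

Even after conversion, the figures come from different machines and different benchmark harnesses, so they give only a rough indication against the smhasher table.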
Describe the feature would like to see added to OpenZFS
Add xxHash as an option for checksum, and both xxhash and xxhash,verify for dedup.
How will this feature improve OpenZFS?
xxHash is sufficiently fast but much less collision-prone than fletcher4. This will improve ZFS resilience against silent data corruption as a competitive alternative to fletcher4.
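For the dedup part of the request, the difference between a plain checksum and a checksum with verify can be sketched with a toy table. This is not the OpenZFS dedup table: `DedupTable` and `weak_sum` are hypothetical names, and the deliberately weak checksum is a stand-in chosen so that a collision is easy to trigger.

```python
def weak_sum(block: bytes) -> int:
    """Deliberately weak stand-in checksum: byte sum modulo 256."""
    return sum(block) & 0xFF


class DedupTable:
    """Toy dedup table keyed by checksum (illustration only)."""

    def __init__(self, checksum, verify: bool):
        self.checksum = checksum
        self.verify = verify
        self.blocks = {}  # checksum value -> first block stored under it

    def write(self, block: bytes) -> str:
        key = self.checksum(block)
        existing = self.blocks.get(key)
        if existing is None:
            self.blocks[key] = block
            return "stored"
        if self.verify and existing != block:
            # Byte comparison caught a checksum collision. A real
            # implementation would store the block separately here.
            return "collision"
        # Without verify, a matching checksum is trusted blindly.
        return "deduplicated"


trusting = DedupTable(weak_sum, verify=False)
checking = DedupTable(weak_sum, verify=True)

# b"ab" and b"ba" have the same byte sum, so they collide.
results_trusting = [trusting.write(b"ab"), trusting.write(b"ba")]
results_checking = [checking.write(b"ab"), checking.write(b"ba")]

print(results_trusting)
print(results_checking)
```

Without verify, the second block is treated as a duplicate of the first and its contents are lost. This is why a fast non-cryptographic hash would normally be paired with verify for dedup.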
Additional context
Performance as advertised by xxHash on its wiki: https://github.com/Cyan4973/xxHash/wiki/Performance-comparison (Note: fletcher4 is not included on this page)
Collision ratio on xxHash wiki: https://github.com/Cyan4973/xxHash/wiki/Collision-ratio-comparison