mrhooray / crc-rs

Rust implementation of CRC(16, 32, 64) with support of various standards
Apache License 2.0

Add crc128 slice-by-16 and no-table implementations #84

Closed KillingSpark closed 1 year ago

KillingSpark commented 1 year ago

This one is interesting: the speedup is much smaller because the CRC is 128 bits (16 bytes) wide. There is still a speedup, but it's debatable whether slice-by-16 or the bytewise algorithm should be the default. With u128 entries, the lookup table for the slice-by-16 algorithm is 64 KiB.
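The table size follows from simple arithmetic: a slice-by-N scheme keeps N sub-tables of 256 entries each, and each entry is as wide as the CRC register. A minimal sketch of that calculation, where `table_bytes` is a hypothetical helper written for this illustration and not part of the crate's API:

```rust
use std::mem::size_of;

/// Number of entries in one byte-indexed lookup table.
const ENTRIES_PER_TABLE: usize = 256;

/// Total size in bytes of `lanes` lookup tables whose entries have type `T`.
/// (Hypothetical helper, for illustration only.)
fn table_bytes<T>(lanes: usize) -> usize {
    lanes * ENTRIES_PER_TABLE * size_of::<T>()
}

fn main() {
    // Bytewise: a single 256-entry table of u128 values.
    println!("bytewise u128:    {} bytes", table_bytes::<u128>(1));
    // Slice-by-16: sixteen 256-entry tables of u128 values.
    println!("slice-by-16 u128: {} bytes", table_bytes::<u128>(16));
    // For comparison, slice-by-16 with u64 entries.
    println!("slice-by-16 u64:  {} bytes", table_bytes::<u64>(16));
}
```

Under these assumptions the bytewise u128 table is 4 KiB, the slice-by-16 u128 table is 64 KiB, and a slice-by-16 table with u64 entries would be 32 KiB.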

Benches:

crc82/default           time:   [35.621 µs 35.642 µs 35.667 µs]
                        thrpt:  [438.08 MiB/s 438.38 MiB/s 438.64 MiB/s]

crc82/nolookup          time:   [171.31 µs 171.51 µs 171.75 µs]
                        thrpt:  [90.974 MiB/s 91.102 MiB/s 91.206 MiB/s]

crc82/bytewise          time:   [35.318 µs 35.323 µs 35.329 µs]
                        thrpt:  [442.28 MiB/s 442.34 MiB/s 442.41 MiB/s]

crc82/slice16           time:   [25.078 µs 25.084 µs 25.090 µs]
                        thrpt:  [622.75 MiB/s 622.90 MiB/s 623.05 MiB/s]
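
From the figures above, slice16 reaches roughly 1.4x the throughput of bytewise, and bytewise roughly 4.9x that of nolookup. For readers unfamiliar with the names, the following is a minimal sketch of what a no-table and a bytewise variant could look like for a reflected CRC held in a `u128`. It is not the crate's code: the function names are hypothetical, init and xorout are assumed to be zero, and the polynomial is intended to match the CRC-82/DARC entry in common CRC catalogues but is unverified here.

```rust
/// CRC width in bits (assumption: an 82-bit CRC stored in a u128).
const WIDTH: u32 = 82;
/// Polynomial in normal form. Assumed value; verify against a CRC catalogue.
const POLY: u128 = 0x0308c0111011401440411;
/// Bit-reflected polynomial, as used by LSB-first (reflected) algorithms.
const POLY_REFLECTED: u128 = POLY.reverse_bits() >> (128 - WIDTH);

/// No-table variant: processes the input one bit at a time.
fn crc_nolookup(data: &[u8]) -> u128 {
    let mut crc: u128 = 0;
    for &byte in data {
        crc ^= byte as u128;
        for _ in 0..8 {
            let lsb_set = crc & 1 != 0;
            crc >>= 1;
            if lsb_set {
                crc ^= POLY_REFLECTED;
            }
        }
    }
    crc
}

/// Builds the 256-entry table used by the bytewise variant (4 KiB for u128).
fn make_table() -> [u128; 256] {
    let mut table = [0u128; 256];
    for (i, entry) in table.iter_mut().enumerate() {
        let mut crc = i as u128;
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ POLY_REFLECTED
            } else {
                crc >> 1
            };
        }
        *entry = crc;
    }
    table
}

/// Bytewise variant: one table lookup per input byte.
fn crc_bytewise(table: &[u128; 256], data: &[u8]) -> u128 {
    let mut crc: u128 = 0;
    for &byte in data {
        let index = ((crc ^ (byte as u128)) & 0xff) as usize;
        crc = (crc >> 8) ^ table[index];
    }
    crc
}

fn main() {
    let table = make_table();
    let data = b"123456789";
    // Both variants compute the same function, so they must agree.
    assert_eq!(crc_nolookup(data), crc_bytewise(&table, data));
    println!("{:#x}", crc_bytewise(&table, data));
}
```

Slice-by-16 extends the bytewise idea to sixteen such tables so that sixteen input bytes are folded in per iteration, which is where the 64 KiB figure comes from.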

akhilles commented 1 year ago

I think we should keep bytewise as the default for u128. I really don't want applications to spill over into L2 cache in a critical path.

akhilles commented 1 year ago

Thanks!