cloudflare / circl

CIRCL: Cloudflare Interoperable Reusable Cryptographic Library
http://blog.cloudflare.com/introducing-circl

Why own sha-3 implementation? #505

Open UladzimirTrehubenka opened 1 month ago

UladzimirTrehubenka commented 1 month ago

The project uses internal/sha3, which is derived from golang.org/x/crypto/sha3. As I understand it, the main reason for the fork is some fundamental issues upstream (plus some optimizations/improvements). If so, would it be possible to contribute the improved sha3 code to the Go standard library and use that instead of maintaining your own implementation?

bwesterb commented 1 month ago

First, what is your concern with the fork?

One big reason we keep the fork is that we expose the underlying struct, so that we can use SHA3 without having to move the inputs to the heap. (Google "golang heap escapes".) That reduces allocation significantly, which would otherwise be a big bottleneck in cryptographic primitives such as Kyber. This application is kind of niche and I doubt upstream would take that.

Also there are parallel implementations of SHA3, which again are very useful for cryptographic primitives such as Kyber.
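
The heap-escape point above can be illustrated with a minimal sketch. Everything in it is hypothetical: `toyState`, `hasher`, and `newHasher` are made-up names, and the "hash" is a toy XOR fold, not SHA-3 and not the code in internal/sha3. The constructor is marked `//go:noinline` to model an API the compiler cannot see through; real escape behaviour depends on inlining and on the Go version.

```go
package main

import (
	"fmt"
	"testing"
)

// toyState is a hypothetical stand-in for a sponge state. It is NOT SHA-3.
type toyState struct {
	buf [200]byte // 200 bytes, the size of a Keccak-f[1600] state (illustration only)
	n   int
}

// Write XORs p into the buffer (a toy absorb step with no permutation).
func (s *toyState) Write(p []byte) (int, error) {
	for _, b := range p {
		s.buf[s.n%len(s.buf)] ^= b
		s.n++
	}
	return len(p), nil
}

// Sum64 folds the buffer into a 64-bit value.
func (s *toyState) Sum64() uint64 {
	var h uint64
	for i, b := range s.buf {
		h ^= uint64(b) << (8 * (i % 8))
	}
	return h
}

// hasher mimics an interface-only API where callers never see the struct.
type hasher interface {
	Write(p []byte) (int, error)
	Sum64() uint64
}

// newHasher hands the state out behind an interface. The returned pointer
// outlives this call, so the state has to be allocated on the heap.
//
//go:noinline
func newHasher() hasher { return &toyState{} }

// sumViaInterface uses the interface-only API: one heap-allocated state per call.
func sumViaInterface(data []byte) uint64 {
	h := newHasher()
	h.Write(data)
	return h.Sum64()
}

// sumViaStruct uses the exposed struct: the state can stay on the stack.
func sumViaStruct(data []byte) uint64 {
	var s toyState
	s.Write(data)
	return s.Sum64()
}

// sink keeps the compiler from discarding the results.
var sink uint64

func main() {
	data := []byte("example input")
	a := testing.AllocsPerRun(1000, func() { sink = sumViaInterface(data) })
	b := testing.AllocsPerRun(1000, func() { sink = sumViaStruct(data) })
	fmt.Printf("interface: %.0f allocs/op, struct: %.0f allocs/op\n", a, b)
}
```

Both paths compute the same value; the difference is only where the state lives. In a primitive that hashes many short inputs per operation, one allocation per hash call adds up, which is the effect described above.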

UladzimirTrehubenka commented 1 month ago

It is not a concern as such. It is just that, as I understand it, the standard Go sha3 is currently not as good or as fast as the sha3 in this project, yet nobody can use the project's sha3 because it is unexported (internal).

bwesterb commented 1 month ago

For implementing PQC you'd want to use the low-level SIMD version, which is public. For usual applications (like hashing a file), the difference between our implementation and the upstream API should be minimal.

UladzimirTrehubenka commented 1 month ago

BTW, for now I don't see a significant difference in performance. Commit d26845f:

go test -bench="BenchmarkSha3|BenchmarkShake"
goos: linux
goarch: amd64
pkg: github.com/cloudflare/circl/internal/sha3
cpu: AMD Ryzen 9 3900X 12-Core Processor            
BenchmarkSha3_512_MTU-24          173608          6850 ns/op     197.09 MB/s
BenchmarkSha3_384_MTU-24          242770          4928 ns/op     273.92 MB/s
BenchmarkSha3_256_MTU-24          307231          3907 ns/op     345.56 MB/s
BenchmarkSha3_224_MTU-24          323571          3710 ns/op     363.84 MB/s
BenchmarkShake128_MTU-24          380232          3153 ns/op     428.10 MB/s
BenchmarkShake256_MTU-24          347889          3446 ns/op     391.73 MB/s
BenchmarkShake256_16x-24           24442         51151 ns/op     320.30 MB/s
BenchmarkShake256_1MiB-24            444       2676574 ns/op     391.76 MB/s
BenchmarkSha3_512_1MiB-24            241       4969454 ns/op     211.00 MB/s
PASS
ok      github.com/cloudflare/circl/internal/sha3   12.387s

vs. commit 3375612

go test -bench="BenchmarkSha3|BenchmarkShake"
goos: linux
goarch: amd64
pkg: golang.org/x/crypto/sha3
cpu: AMD Ryzen 9 3900X 12-Core Processor            
BenchmarkSha3_512_MTU-24          175977          6810 ns/op     198.23 MB/s
BenchmarkSha3_384_MTU-24          245632          4888 ns/op     276.17 MB/s
BenchmarkSha3_256_MTU-24          308929          3876 ns/op     348.32 MB/s
BenchmarkSha3_224_MTU-24          324548          3696 ns/op     365.22 MB/s
BenchmarkShake128_MTU-24          365518          3279 ns/op     411.68 MB/s
BenchmarkShake256_MTU-24          346371          3467 ns/op     389.40 MB/s
BenchmarkShake256_16x-24           24466         48822 ns/op     335.59 MB/s
BenchmarkShake256_1MiB-24            450       2661981 ns/op     393.91 MB/s
BenchmarkSha3_512_1MiB-24            240       4976664 ns/op     210.70 MB/s
PASS
ok      golang.org/x/crypto/sha3    12.352s

bwesterb commented 1 month ago

You don't notice the heap escape issue in this microbenchmark, but you do notice it when you're using one or the other inside, say, Kyber.

UladzimirTrehubenka commented 1 month ago

BTW, golang.org/x/crypto already has an improvement, "sha3: make APIs usable with zero allocations" (May 8, 2024), similar to "sha3: prevent state from escaping to heap" (Sep 17, 2021).