Open UladzimirTrehubenka opened 1 month ago
First, what is your concern for the fork?
One big reason we keep the fork is that we expose the underlying struct, so that we can use SHA3 without having to move the inputs to the heap. (Google "golang heap escapes".) That reduces allocation significantly, which would otherwise be a big bottleneck in cryptographic primitives such as Kyber. This application is kind of niche and I doubt upstream would take that.
Also there are parallel implementations of SHA3, which again are very useful for cryptographic primitives such as Kyber.
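To make the heap-escape point concrete, here is a minimal sketch. It is not the circl or x/crypto code: `state`, `hasher`, `newHasher`, `viaInterface` and `viaStruct` are hypothetical names, and `write` is a toy XOR absorb rather than Keccak. It only compares a state handed out behind an interface with a state declared as a plain struct value, and counts allocations with `testing.AllocsPerRun`.

```go
package main

import (
	"fmt"
	"testing"
)

// state is a toy stand-in for a hash state (hypothetical, not a real SHA3 type).
type state struct {
	a [25]uint64
}

// write XORs the input bytes into the state. Placeholder only, not Keccak.
func (s *state) write(p []byte) {
	for i, b := range p {
		s.a[i%25] ^= uint64(b) << (8 * uint(i%8))
	}
}

// sum folds the state into a single word so the two paths can be compared.
func (s *state) sum() uint64 {
	var x uint64
	for _, v := range s.a {
		x ^= v
	}
	return x
}

// hasher hides the concrete type, in the way hash.Hash does.
type hasher interface {
	write(p []byte)
	sum() uint64
}

// newHasher returns the state behind an interface. Because the pointer is
// returned, the state has to live on the heap. Inlining is disabled so the
// compiler cannot optimise the escape away in this small demo.
//
//go:noinline
func newHasher() hasher { return &state{} }

// viaInterface uses the constructor, as a caller of an interface-only API would.
func viaInterface(p []byte) uint64 {
	h := newHasher()
	h.write(p)
	return h.sum()
}

// viaStruct uses the exposed struct directly, so the state can stay on the stack.
func viaStruct(p []byte) uint64 {
	var s state
	s.write(p)
	return s.sum()
}

func main() {
	msg := []byte("hello")
	fmt.Println("interface allocs/op:", testing.AllocsPerRun(1000, func() { viaInterface(msg) }))
	fmt.Println("struct allocs/op:   ", testing.AllocsPerRun(1000, func() { viaStruct(msg) }))
}
```

The exact counts depend on the Go version and its escape analysis. Building with `go build -gcflags=-m` prints the compiler's escape decisions, which is the more reliable way to check a real call site.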
It is not a concern. It is just that, as I understand it, the standard Go sha3 is currently not as good or as fast as the sha3 from this project. But nobody outside the project can use the project's sha3, because it is unexported (internal).
For implementing PQC you'd want to use the low-level SIMD version, which is public. For usual applications (like hashing a file), the difference between our implementation and the upstream API should be minimal.
BTW for now I don't see a significant difference in performance: commit d26845f
go test -bench="BenchmarkSha3|BenchmarkShake"
goos: linux
goarch: amd64
pkg: github.com/cloudflare/circl/internal/sha3
cpu: AMD Ryzen 9 3900X 12-Core Processor
BenchmarkSha3_512_MTU-24 173608 6850 ns/op 197.09 MB/s
BenchmarkSha3_384_MTU-24 242770 4928 ns/op 273.92 MB/s
BenchmarkSha3_256_MTU-24 307231 3907 ns/op 345.56 MB/s
BenchmarkSha3_224_MTU-24 323571 3710 ns/op 363.84 MB/s
BenchmarkShake128_MTU-24 380232 3153 ns/op 428.10 MB/s
BenchmarkShake256_MTU-24 347889 3446 ns/op 391.73 MB/s
BenchmarkShake256_16x-24 24442 51151 ns/op 320.30 MB/s
BenchmarkShake256_1MiB-24 444 2676574 ns/op 391.76 MB/s
BenchmarkSha3_512_1MiB-24 241 4969454 ns/op 211.00 MB/s
PASS
ok github.com/cloudflare/circl/internal/sha3 12.387s
vs. commit 3375612
go test -bench="BenchmarkSha3|BenchmarkShake"
goos: linux
goarch: amd64
pkg: golang.org/x/crypto/sha3
cpu: AMD Ryzen 9 3900X 12-Core Processor
BenchmarkSha3_512_MTU-24 175977 6810 ns/op 198.23 MB/s
BenchmarkSha3_384_MTU-24 245632 4888 ns/op 276.17 MB/s
BenchmarkSha3_256_MTU-24 308929 3876 ns/op 348.32 MB/s
BenchmarkSha3_224_MTU-24 324548 3696 ns/op 365.22 MB/s
BenchmarkShake128_MTU-24 365518 3279 ns/op 411.68 MB/s
BenchmarkShake256_MTU-24 346371 3467 ns/op 389.40 MB/s
BenchmarkShake256_16x-24 24466 48822 ns/op 335.59 MB/s
BenchmarkShake256_1MiB-24 450 2661981 ns/op 393.91 MB/s
BenchmarkSha3_512_1MiB-24 240 4976664 ns/op 210.70 MB/s
PASS
ok golang.org/x/crypto/sha3 12.352s
You don't notice the heap escape issue in this microbenchmark, but you do notice it when you're using one or the other inside, say, Kyber.
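One way to make allocations visible in numbers like the ones above is to run the benchmarks with `-benchmem`, or to call `b.ReportAllocs()` inside them. The sketch below does the latter from a plain program via `testing.Benchmark`. It uses `crypto/sha256` from the standard library purely as a stand-in, so that it needs no external module. The helper names and the 1350-byte input size are assumptions, and this is not how the circl benchmarks are written.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"testing"
)

// sumViaInterface hashes through hash.Hash, the interface returned by sha256.New.
func sumViaInterface(msg []byte) [sha256.Size]byte {
	h := sha256.New()
	h.Write(msg)
	var out [sha256.Size]byte
	h.Sum(out[:0]) // appends the digest into out's backing array
	return out
}

// sumOneShot uses the one-shot function, which returns a plain array.
func sumOneShot(msg []byte) [sha256.Size]byte {
	return sha256.Sum256(msg)
}

func main() {
	msg := make([]byte, 1350) // roughly MTU-sized input; the size is an assumption

	viaInterface := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			_ = sumViaInterface(msg)
		}
	})
	oneShot := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			_ = sumOneShot(msg)
		}
	})

	fmt.Println("hash.Hash:", viaInterface.AllocsPerOp(), "allocs/op,", viaInterface.AllocedBytesPerOp(), "B/op")
	fmt.Println("Sum256:   ", oneShot.AllocsPerOp(), "allocs/op,", oneShot.AllocedBytesPerOp(), "B/op")
}
```

Whether the interface path allocates here depends on the Go version, since newer compilers can inline and devirtualize more. Even when it does, a per-call allocation can be small next to the cost of hashing a large input, so it matters most where a primitive makes many short hash calls.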
BTW golang.org/x/crypto already has improvements along these lines: "sha3: make APIs usable with zero allocations" (May 8, 2024), similar to "sha3: prevent state from escaping to heap" (Sep 17, 2021).
The project uses internal/sha3, which is derived from golang.org/x/crypto/sha3. As I understand it, the main reason is some fundamental issues (plus some optimizations and improvements). If so, would it be possible to contribute the improved sha3 code to the Go standard library and use that instead of the project's own implementation?