Open nazar-pc opened 5 months ago
The issue with Rayon or any dependencies is that it brings a lot of new dependencies that need to be audited or at least vetted. This is very undesirable in a cryptographic product securing billions of assets. It is especially critical when supply chain attacks are increasing in appeal.
Furthermore, a simple threadpool written with cryptographic/security application in mind and fully control by the dev is significantly easier to maintain and update if there is a security issue.
The issue with Rayon or any dependencies is that it brings a lot of new dependencies that need to be audited or at least vetted. This is very undesirable in a cryptographic product securing billions of assets. It is especially critical when supply chain attacks are increasing in appeal.
rayon
is a well known established library with short list of mandatory dependencies and it used by Rustc itself, so it is not like you're adding a lot of new stuff here.
Furthermore, a simple threadpool written with cryptographic/security application in mind and fully control by the dev is significantly easier to maintain and update if there is a security issue.
I generally agree with you on this one. However, as I have shown in the description already, there were some questionable decisions from soundness point of view around thread pool as it was used and performance was lower than it could have been.
You 100% can rewrite original code and make it faster without rayon, achieving rayon's features in a safe misuse-resistant way is non-trivial and I'd rather use something that it maintained by Rust experts for this purpose (check top contributors to rayon project, they are literally working on Rust itself) than seeing every library invent their own a bit flawed version of the same features that doesn't integrated as well with the rest of Rust ecosystem.
I would estimate risk of relying on rayon
as very low, but decision is up to blst maintainers of course.
Just confirmed that performance improvements here translate to performance improvements downstream in our usage usage of https://github.com/sifraitech/rust-kzg (uses blst) in https://github.com/subspace/subspace, both of which also take advantage of Rayon as well.
Just confirmed that performance improvements here translate to performance improvements downstream in our usage usage of https://github.com/sifraitech/rust-kzg (uses blst) in https://github.com/subspace/subspace, both of which also take advantage of Rayon as well.
I don't know whether commitment, proof generation or verification is the bottleneck for your use-case but @sauliusgrigaitis might be able to point you on how to best use rust-kzg, i.e. either through native BLST or reimplemented+parallelized BLST (i.e. using their own multithreading strategy).
Also they've been working on a new set of benchmarks. I have my own in https://github.com/mratsim/constantine/pull/304#issuecomment-1844795359, but I don't cover parallel BLST.
@dot-asm is there something I can do to interest you reviewing this?
This os honestly frustrating. There are merge conflicts again and not review for clear performance and API improvements in half a year :confused:
@dot-asm any chance you can look into this any time soon? Any feedback at all would be nice.
This PR introduces a bunch of changes that target at better performance and higher code quality.
Performance
Before:
After:
The reason is most likely due to higher parallelism and not using channels in those operations.
Rayon
Rayon doesn't always result in better performance than manual thread management (see https://github.com/rayon-rs/rayon/issues/1124), but changes in this PR do result in higher performance.
There are many great benefits to rayon though in the way it is used:
safe API, previous thread pool usage was either unsound or sound by accident (comments were claiming things like:
Which while worked by accident, was a lie (nothing was actually joined in most cases, but worked in practice due to use of channels to send work back to the parent thread), fundamentally unsound API (parent thread panic would result in worker threads accessing undefined memory location) and generally a major foot gun for anyone who would try to change the code in any way.
blst
will now enjoy it automaticallyVarious cleanups
I changed all but one
channel
usage with a cleaner code usingMutex
,Arc
usage was removed as unnecessary and some other bookkeeping was cleaned up after understanding what code does and rewriting intention in more idiomatic Rust code (seefn from
for example). The only place wherechannel
still remains isfn mult
, but it is quite difficult code and I didn't have the will to decipher what it tries to do to get rid of channel, but hopefully examples in this PR will prompt someone to clean it up at some later point.Some
unsafe
blocks withvec.set_len()
are moved after memory is actually initialized and some questionable in terms of soundness places were refactored such that they are quite obviously sound now. Some remaining places can also be fixed, but require pointer arithmetic that might require newer versions of Rust, so I left TODOs there.I also took liberty to refactor things like
&q[0]
toq.as_ptr()
since while both result in the same pointer, the former is deceptive because it suggests to the reader that only the first element is used, while in practice the wholeq
is used by the function that is being called later (documentation is almost non-existent, so it wasn't obvious at first and I hope this improves readability).no-threads
no-threads
was a hack and non-idiomatic because it violates "additivity" assumptions of Rust features. Thankfully it is no longer necessary and can be safely removed, but since this is a breaking change (the only one in this PR to the best of my knowledge), I bumped crate version from0.3.11
to0.4.0
.How to review this PR
This PR is split into self-contained logically structured commits and it is recommended to review commits one by one instead of the complete diff, this way it is much easier to understand what was changed and why.
Let me know if there are any questions, happy to explain each line change in detail.