paritytech / substrate

Substrate: The platform for blockchain innovators
Apache License 2.0
8.39k stars 2.65k forks source link

upgrade schnorrkel to 0.10 #10048

Open gilescope opened 2 years ago

gilescope commented 2 years ago

we are a few versions behind and this gets us to rand 0.8.

linked to upgrade substrate-bip39 crate.

gilescope commented 2 years ago

Previous upgrade notes: https://github.com/paritytech/substrate/issues/3243

nazar-pc commented 2 years ago

I saw #10025, is there a good reason schnorrkel wasn't updated yet?

burdges commented 2 years ago

It's likely some computational dependency that caused the regression, so curve2551-dalek or merlin or rand, since not much else changed. It could be tested vs schnorrkel 10.1 to detect when the regression happened.

A priori, we do not use rand enough to have much impact, except I switched from ThreadRng or OsRng in https://github.com/w3f/schnorrkel/commit/c333370c164746c6e4c2e74fd9b7a53c72dd10f9 and https://github.com/w3f/schnorrkel/commit/f155c2a8000bd79e2800cd4c5b7cd07e13d08f70 which likely slows down signing. Any idea if the regression comes from verification or signing? We could test with this change reverted easily I guess.

I always prefer transcript-like hash functions like merlin when implementing Fiat-Shamir protocols, but merlin is based upon Keccak STROBE and slower than blake2b. In substrate, we typically pass big messages through blake2b before signing, but perhaps somewhere we pass some large-ish message into merlin directly, so then a merlin regression could trigger a noticeable substrate regression. I donno..

It's more likely some curve25519-dalek regression caused this regression somehow, because elliptic curve arithmetic runs way slower than hashing. I've merged everything from @gilescope into 10.2 except https://github.com/w3f/schnorrkel/commit/5334cb68beb7397278f095f2b3d23371fd7a056f but since then we've decided to drop the -ng versions, since not even zcash uses them, so we should see if this commit improves or worsens things relative to 9.2.

We'll definitely observe some speedup from non -ng versions because others besides HdV and Isis contributes patches like https://github.com/dalek-cryptography/curve25519-dalek/pull/379 although that specific one never got merged. We therefore risk some mixed bag here regardless.

We should maybe recheck that no dependency uses SigningContext::hash512 due to https://github.com/w3f/schnorrkel/commit/55270595ae72412aaf82fe780ca0de59326fe2d9 but anything like that should not be in the substrate ecosystem

nazar-pc commented 2 years ago

I see there is a verification bench in schnorrkel, did you use that to compare 0.9 to 0.10? Maybe other benches like signing should be added as well?

burdges commented 2 years ago

Yes likely so, but verification is by far the most important benchmark.

I donno how big the substrate regression was either.