HashToCurve: What is extra data for?

celo-org / celo-bls-snark-rs

Implements SNARK-friendly BLS signatures

https://celo.org

Apache License 2.0

83 stars 24 forks source link

HashToCurve: What is extra data for? #201

Open ValarDragon opened 3 years ago

ValarDragon commented 3 years ago

The HashToCurve trait's hash function takes as input three strings domain, message, extra data.

Its unclear to me what the purpose of extra data is. At the moment, it acts the same as a message in the try and increment hashes. (here: https://github.com/celo-org/celo-bls-snark-rs/blob/master/crates/bls-crypto/src/hash_to_curve/try_and_increment.rs#L97 and here: https://github.com/celo-org/celo-bls-snark-rs/blob/master/crates/bls-crypto/src/hash_to_curve/try_and_increment_cip22.rs#L92)

Can this be removed from the API from it?

kobigurk commented 3 years ago

The use of it after CIP22 is that the extra data is only part of the outer hash while the message is in the inner hash. This is useful when the inner hash is very expensive, and the extra data can contain data that is relevant only for some clients in the outer hash. For example, we use extra data to include the "epoch number" and the "round number" from consensus, so that we can slash validators on double-signing on-chain, without having to perform the inner hash, it's provided as an opaque blob.

Before CIP22 it's not very useful but we keep it to maintain backwards compatibility at this point. I believe we'll remove it at some point there.

ValarDragon commented 3 years ago

I see (I missed the inner hash in the cip22 logic), totally makes sense for the double hashing paradigm! Is the inner hash relatively expensive natively as compared to signature verification? (Just trying to understand the decision!)

I wonder if it would make most sense if there were two traits, HashToCurve and DoubleHashToCurve, with DoubleHashToCurve capturing that the HashToCurve's dependence on the message is XOF(H(msg)). Then extra data would only be on DoubleHashToCurve, and be renamed to outer_hash_additional_data. Also DoubleHashToCurve could auto-implement HashToCurve.

Is this a change you'd be interested in having? If so I can try to add it. (I wanted to re-use the hash to curve logic implemented here for another project I'm working on)

kobigurk commented 3 years ago

Yep, that inner hashing is very expensive in one of Celo's cases :) The inner hash is a Pedersen hash on 14k bytes (150 validator public keys).

I'd definitely like that contribution, it'll make things nicer!

To be exact, the current hash to curve doesn't add an extra hash compared to Blake2Xs, it's a variant of it. Blake2Xs already has an inner hash, it's just always Blake2s and doesn't support "extra data". So maybe DoubleHashToCurve is not the best name, but I don't have a better idea yet.

ValarDragon commented 3 years ago

Ah, that definitely makes sense for why you're doing it! Thanks for the expl.

Perhaps the name CrhAndXofHashToCurve is more accurate?