supranational / blst

Multilingual BLS12-381 signature library
Apache License 2.0
454 stars 171 forks source link

[Rust-binding] Proposal to implement `std::hash::Hash` for publicly exposed structures #197

Closed ok-ul-ch closed 8 months ago

ok-ul-ch commented 8 months ago

At the moment it is not possible to use PublicKey or Signature as keys of HashMap out of the box. I can serialize these structs and use their binary repr as keys, but it adds some code and runtime overhead.

The proposal is to implement std::hash::Hash trait for these structs:

Please let me know if it makes sense for you, I can add them as it doesn't sound like a big deal

dot-asm commented 8 months ago

Being an opponent of doing things for "why-not" reasons [in security code] I have a hard time imagining how it would be useful. I mean hashmap-ing establishes a mapping between pairs of objects, so what kind of lookup would you want to do using a public key as a lookup key? Wouldn't you rather go the other way around, i.e. having the public key on the other side of the pair? Could you elaborate, please? I have to understand :-)

However, it would probably be a hard "no" on the secret key. It can't be a legitimate way to treat secret keys... At least definitely not something we'd encourage by providing an interface...

Another question. Do you expect hash values to be platform independent? Thing is that if you do, then serializing would be the right thing to do. BTW, the question is not really that it incurs an overhead, but rather how it compares to the hashing procedure. If it doesn't have to be platform independent, what would you have in mind? The least significant word of the X coordinate should be adequate hash for EC points... Or do you have something else in mind? If so, could you give an example?

dot-asm commented 8 months ago

Another question. Do you expect hash values to be platform independent?

Depending on the context this might be a nonsense question. Please disregard if it doesn't make sense to you :-)

ok-ul-ch commented 8 months ago

Hi, @dot-asm, my use cases are mainly related to general optimizations, to make sure that I'm not verifying/serializing the same data multiple times.

[...] so what kind of lookup would you want to do using a public key as a lookup key?

Exactly, one of my use cases is maintaining an in-mem cache of public keys with verified proof of possession.

Wouldn't you rather go the other way around, i.e. having the public key on the other side of the pair? Could you elaborate, please? I have to understand :-)

I don't really have other identifiers, so I'm not sure how I can put a public key into the map's value. Anyway, another case is caching of already verified messages, where I'd like to put the whole message into some sort of a HashSet.

However, it would probably be a hard "no" on the secret key. It can't be a legitimate way to treat secret keys...

I definitely agree, didn't think from that perspective when proposing it.

Do you expect hash values to be platform independent?

Not really. I need it in the context of a simple map, and thought about the simple compiler built-in derivation of std::hash::Hash.

The least significant word of the X coordinate should be adequate hash for EC points...

This is exactly why I've opened this issue before making a PR: to clarify if it makes sense to implement Rust's Hash trait with respect to internal BLS math, as I'm not an expert in this area =)

If support of std::hash::Hash doesn't make much sense performance-wise to you, as its overhead may be higher than just using a hash of byte representation - just close this issue)

dot-asm commented 8 months ago

to make sure that I'm not verifying/serializing the same data multiple times.

This sounds like one would use the wire-formatted data as a lookup key, not the blst inner structure. I mean you get a serialized data, look it up in the hashmap, and if there is a match, you pull the associated blst structure and assume it's safe to use. And if not, you uncompress/deserialize the data, vet it to the application requirements, and put the structure into the hashmap as an associated value for the wire-formatted data...

another case is caching of already verified messages, where I'd like to put the whole message into some sort of a HashSet.

Even here, you might be better off hashing things other than blst inner structures. Indeed, if the question is "was this message verified," then you'd have to hash the message. And if the question "was this signature verified," then it would probably be more appropriate to, again, hash the wire-formatted data. I mean what is "this signature" in the context? You probably get it over the network and wonder "do I have to do it all over again," right, in which case using the raw wire-formatted data as a lookup key would spare you uncompression/deserialization (and group-check).

If support of std::hash::Hash doesn't make much sense performance-wise

From the application perspective performance is rather a question of avoiding expensive operations. Uncompress is expensive, because it involves square root operation. Group-check is expensive, and if performed during deserialization, it's a big win if results are cached... But these are solved by hashmap-ing the wire data, not the blst inner structures...

All this is not to say that it makes no sense to provide a hash trait for the inner structures(*), it's just that I don't see the need for one yet, so let's keep thinking... Just in case, we're not a "Rust shop," so it's not like we can offer an authoritative opinion on Rust matters. Nor do we pretend that we think the "Rust way." But we offer our perspective in hope that harmonizing our ways of thinking would be beneficial :-)

(*) Except for the secret key, as already discussed and agreed upon.

ok-ul-ch commented 8 months ago

@dot-asm thanks for your suggestions, I got the general direction where to look at. Unfortunately, I cannot access raw wire-formatted data that easy now (that's why I got here, actually =) ), but will try to solve it this way.

All this is not to say that it makes no sense to provide a hash trait for the inner structures(*), it's just that I don't see the need for one yet, so let's keep thinking...

I think this issue can be closed until someone else faces a similar one)

dot-asm commented 7 months ago

until someone else faces a similar [problem]

This should include even you :-) Well, in your case till you can formulate it clearly, right? As already mentioned/implied, if you want to experiment, least significant word of x coordinate, yes, as is in inner representation, should work as a hash key, since the collisions are unlikely...