verify_prehashed seems unnecessarily strict

rpcesar commented 2 years ago

The function verify_prehashed (ed25519ph impl) requires the prehashed_message to be a full blown Digest implementation, however its only usage within the function is prehashed_message.finalize().as_slice(). In my use case this is highly inconvenient as I ALREADY have a digest in the form [u8; 64] (which I can convert to the GenericArray equiv). As far as I can see there is no way to initialize a Digest with already existing bytes, really hindering the usage for performance scenarios.

Unless I am missing something, I would propose replacing or adding an implementation that takes either a slice, an array, or a GenericArray (which you would get from finalize) as the prehashed_message argument to the function. Alternatively, if there is a way to construct a Digest from raw bytes this could circumvent the issue, but I have not found a way to coerce this behavior.

tarcieri commented 2 years ago

The reasons are likely the same as the ones given in the "Notes" section of the signature crate for its similarly-shaped DigestVerifier trait:

https://docs.rs/signature/latest/signature/trait.DigestVerifier.html#notes

Operating on a raw digest output, rather than an instance of a Digest, permits potential misuses, namely providing a 64-byte value which has not been computed by a cryptographic hash function.

An attacker who is able to pick that value arbitrarily can compute a valid "signature" with the public key alone by solving a system of linear equations.

rpcesar commented 2 years ago

While I can see the benefit of encapsulating the digest to minimize programmer error, I am not sure it follows that carrying around an entire state machine for every digest once computed is very smart from a memory standpoint. Note that this is not behavior enforced by the Digest trait, but by the Sha512 implementation. The digest has an Engine512, which has a long long length, a 64 byte state, and then a block buffer with another usize and its own buffer, all owned by said "digest". Aka its the verb digest not the noun digest. All this means it also serializes poorly which is a pain for edge situations like wasm where multithreading involves marshaling data between webworkers through serialization.

I think a solid compromise would be to have an opaque middle digest struct of the "noun" variety, whos construction is limited to a From/Copy/Clone implementation, immutable on the outside yet tender on the inside, which can be consumed by the api where pre-hashed values are needed. This way you can be reasonably sure that it came from the hash function, yet your not passing around all of mcdonalds when you just need the hamburger.

I forked the repo and make said trivial changes to support my use case (basically unsafe copies of the ph functions which take a slice), but it would be REALLY nice if at least some "unsafe" functions were exposed which would prevent the need to deviate.

tarcieri commented 2 years ago

You can always work around it by defining your own newtype around [u8; 64] which impls Digest

rpcesar commented 2 years ago

Touche! I could make it fit that way, In fact I am already using a newtype around that already, just not implementing the interface. However one downside with that approach is that it would necessitate doing a lot of unimplemented branches, which sucks for coverage AND puts me at the mercy that dalek does not suddenly decide to take the contract for my faux implementation at face. That scares me.

rozbb commented 1 year ago

I see why it's a pain, but I think supporting anything else would be too prone to user error. We couldn't support just fixed-size arrays, because different digests have different sizes, so it'd have to accept arbitrary &[u8]. This already doesn't look great.

I think the above workaround is probably sufficient for motivated end-users with high-performance requirements. Regarding the use of unimplemented!(), it is extremely unlikely that we ever use anything but Digest::finalize() (or whatever that method gets renamed to in future digest versions).

tarcieri commented 1 year ago

We couldn't support just fixed-size arrays, because different digests have different sizes, so it'd have to accept arbitrary &[u8].

FWIW, that's what we ended up going with in the signature crate for similar reasons (namely ECDSA has an entire protocol for zero padding or truncating input digests to fit within a field element when the sizes are mismatched):

https://docs.rs/signature/latest/signature/hazmat/trait.PrehashSigner.html

That said, I haven't proposed or tried implementing the signature crate's digest/prehash traits for ed25519-dalek because they don't have a context parameter. It could be done, though.

rozbb commented 1 year ago

Hmm I didn't realize the trait was already there. Due to the nature of the prehashed signing, it'd actually be possible to have a method like prehash_context(ctx: &[u8]) which does all the partial hashing necessary to expose a PrehashedSigner. No new trait needed.

Your call if we want something like that tho.

tarcieri commented 1 year ago

Sounds like a great idea

isislovecruft commented 1 year ago

I like the PrehashedSigner taking a context initialisation plan. :+1:

dalek-cryptography / ed25519-dalek

verify_prehashed seems unnecessarily strict #190