wasm-signatures / wasmsign2

Implementation of the WebAssembly Modules Signatures.

Ed25519ph as supported signature algorithm #2

Open hslatman opened 8 months ago

hslatman commented 8 months ago

Hey @jedisct1,

I'm toying around using wasmsign and thought it could be nice if Ed25519ph were supported (officially). I'm trying to use it with a system in which there are some cases for which Ed25519 doesn't work, because there's no access to the original signature content bytes in specific code paths, primarily related to signing/verification happening locally vs. remotely.

Based on what's described at https://github.com/WebAssembly/tool-conventions/blob/main/Signatures.md#signature-algorithms-and-key-serialization, additional signature schemes can be supported, but they would (at the least) need a different identifier. Do you think Ed25519ph is worth adding to the proposal/documentation?

The current proposal and implementation seem to be doing something similar to prehashing already, although it uses SHA-256, and the "domain specific" context bytes here. Not saying it's exactly the same; just noting some rough similarities.

For the Ed25519ph case, I think it could become something like Ed25519ph(k, sha512(current-hashes-incl-wasmsign-context), context(some-wasmsign-context)). Keeping the current-hashes-incl-wasmsign-context keeps most parts of the verification the same as for Ed25519 with only a small cost in number of bytes. The new context(some-wasmsign-context) could contain the same or other bytes (or just be empty, I guess?).

jedisct1 commented 8 months ago

Hi Herman,

I must confess that I don't follow what Ed25519ph would bring over Ed25519 here, since the message to be signed/verified is guaranteed to be very small. It can fit in memory, and can be seamlessly sent to an HSM. A hash of that message wouldn't be significantly smaller.

There's also an API issue. Implementations of signature schemes often require the message to be signed, not a hash of it, which is done internally.

This is also the case for Ed25519ph, which in addition to that, is not widely implemented. Ed25519ph is also not supported by HSMs, and not part of the WASI crypto extension spec either (the latter can be addressed, but the HSM part would remain an issue).

jedisct1 commented 8 months ago

There's a major difference between what's being done here and Ed25519ph. What is being signed is not a hash of a message. It's a message made of the concatenation of multiple hashes, each representing one or more sections of the WebAssembly module.
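To illustrate the point, here is a minimal Python sketch of a message built as a concatenation of hashes, following the shape of the spec example quoted later in this thread (SHA-256 over sections 1..k for several values of k). The function name, the `split_points` parameter, and the stand-in section contents are assumptions for illustration; the real implementation also mixes in wasmsign-specific context bytes that are not reproduced here.

```python
import hashlib

def signed_message(sections, split_points):
    """Concatenate SHA-256 digests of cumulative section prefixes.

    `sections` is a list of raw section byte strings; each entry of
    `split_points` says how many leading sections one digest covers.
    Both parameters are hypothetical, chosen for this sketch.
    """
    digests = []
    for end in split_points:
        h = hashlib.sha256()
        for section in sections[:end]:
            h.update(section)
        digests.append(h.digest())
    return b"".join(digests)

# Stand-in section contents, not real Wasm sections.
sections = [bytes([i]) * 4 for i in range(6)]
message = signed_message(sections, [2, 4, 6])
print(len(message))  # three 32-byte digests
```

The value passed to the signature function is `message` itself, so a verifier that only has part of the module can still check the digest covering the sections it has.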

hslatman commented 8 months ago

Yes, I agree with your points. Adding Ed25519ph wouldn't bring much in terms of reducing the message size.

The reason I'm looking into supporting Ed25519ph is because I wanted to see if I can make it work with Sigstore; specifically its keyless mode and with the sign-blob method (currently). I've started implementing a Go version of wasmsign, and it currently supports the simple, single hash verification (and creation) flow. It also includes a basic Ed25519 X509 certificate chain validation mechanism to establish which Ed25519 public keys to trust (which needs some mechanism for distributing the certificate, hence Sigstore, but other ways might work too). I've also made some changes to cosign to create a temporary Ed25519 key (instead of the default P-256 key) and it can obtain a certificate for that key from Fulcio successfully already. The part that doesn't work yet is submitting the hash to Rekor, because that blocks Ed25519 public keys, because the original content isn't part of what's submitted and can thus not be verified on submit. There's an open issue to support Ed25519ph, though: https://github.com/sigstore/rekor/issues/1325. The approach seems to work with --tlog-upload=false already, though.

Cosign can already sign arbitrary blobs, and there's also some documentation for using it to sign Wasm here. But those are not directly compatible with the WebAssembly Signature proposal and operate on the full binary blob instead.

So my idea is roughly to be able to do cosign sign-wasm in keyless mode, and for that to be compatible with WebAssembly Signatures. I'm aware it might be a bit of a special case. Just interested in trying out the approach, and totally fine if this doesn't make sense for the bigger picture 🙂

There's a major difference between what's being done here and Ed25519ph. What is being signed is not a hash of a message. It's a message made of the concatenation of multiple hashes, each representing one or more sections of the WebAssembly module.

That's why for Ed25519ph I think Ed25519ph(k, sha512(current-hashes-incl-wasmsign-context), context(some-wasmsign-context)) makes sense. The validation remains the same, operating on the same data that wasmsign currently uses.
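As a rough sketch of the prehash step in that construction: Ed25519ph (RFC 8032) uses SHA-512 as its prehash function, so the value handed to a remote signer would be the 64-byte SHA-512 digest of the existing hashes message. The helper name and the stand-in `hashes` value below are assumptions; the signing call itself is left out because the Python standard library has no Ed25519 implementation, and the context bytes would be passed to the signer separately.

```python
import hashlib

def ed25519ph_prehash(hashes: bytes) -> bytes:
    """Return the SHA-512 digest that Ed25519ph would sign in place of `hashes`."""
    return hashlib.sha512(hashes).digest()

# Stand-in for the current wasmsign message (three concatenated SHA-256 digests).
hashes = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(3))
digest = ed25519ph_prehash(hashes)
print(len(hashes), len(digest))  # 96 64
```

With three hashes, the message shrinks from 96 to 64 bytes, which matches the earlier observation that prehashing saves very little here; the benefit would only be that a party holding just the digest can take part in signing or verification.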

hslatman commented 8 months ago

Not entirely related, but is it possible the reference implementation is not fully in line with the example at https://github.com/WebAssembly/tool-conventions/blob/main/Signatures.md#signature-algorithms-and-key-serialization? I was looking at the description and noticed that the signature seems to include the type of key (0x01) for Ed25519 for a total of 65 bytes?

Content of the signature section, for a single signature:

    0x01 (spec_version)
    0x01 (hash_fn)
    1 (signed_hashes_count)
    signed_hashes:
        3 (hashes_count)
        <96 bytes> (hashes=SHA-256(sections 1..12) ‖ SHA-256(sections 1..20) ‖ SHA-256(sections 1..end))
        1 (signatures_count)
        signature:
            0 (key_id_len - no key ID)
            65 (signature_len)
            <65 bytes> (0x01 ‖ Ed25519(k, hashes))

The signature length in a detached signature without key ID I generated using wasmsign says it's length 64, so the 0x01 doesn't seem to be included there: CyberChef recipe. Technically maybe not the most correct way, but I guess this position could be used to denote alternative (non-standardized) signing schemes? Is it a bug in the implementation, or intentionally omitted with the example being out of date?
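For anyone who wants to check the length field without CyberChef, the layout in the example can be walked with a small parser. This is only a sketch of the fields shown in the example above; it assumes counts and lengths are unsigned LEB128 integers (as is usual in Wasm binaries) and that every hash is 32 bytes. All function names are made up for illustration.

```python
import io

def read_varuint(buf):
    """Read an unsigned LEB128 integer (assumed encoding for counts and lengths)."""
    result = 0
    shift = 0
    while True:
        byte = buf.read(1)
        if not byte:
            raise ValueError("truncated input")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return result
        shift += 7

def read_exact(buf, n):
    """Read exactly n bytes or fail."""
    data = buf.read(n)
    if len(data) != n:
        raise ValueError("truncated input")
    return data

def parse_signature_section(data, hash_len=32):
    """Parse the fields shown in the example: version, hash_fn, hashes, signatures."""
    buf = io.BytesIO(data)
    spec_version = read_exact(buf, 1)[0]
    hash_fn = read_exact(buf, 1)[0]
    entries = []
    for _ in range(read_varuint(buf)):            # signed_hashes_count
        hashes_count = read_varuint(buf)
        hashes = [read_exact(buf, hash_len) for _ in range(hashes_count)]
        signatures = []
        for _ in range(read_varuint(buf)):        # signatures_count
            key_id = read_exact(buf, read_varuint(buf))
            signature = read_exact(buf, read_varuint(buf))
            signatures.append((key_id, signature))
        entries.append((hashes, signatures))
    return spec_version, hash_fn, entries

# Sample matching the example: 3 hashes, no key ID, one 65-byte signature.
sample = bytes([0x01, 0x01, 1, 3]) + bytes(96) + bytes([1, 0, 65, 0x01]) + bytes(64)
version, hash_fn, entries = parse_signature_section(sample)
hashes, signatures = entries[0]
print(len(hashes), len(signatures[0][1]))  # 3 65
```

Run against a detached signature, the second printed number is the `signature_len` in question: 65 if the algorithm byte is included in the signature, 64 if it is not.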

jedisct1 commented 8 months ago

Good catch. Signatures should include the algorithm prefix (matching the one from the public key).

That was a last minute addition to the spec, and wasmsign2 was indeed not updated to include it. The spec should also document this better than in an example. I will take care of this.

jedisct1 commented 8 months ago

Regarding prehashing, adding support for ECDSA to the spec (or replacing Ed25519 with it) may be better than Ed25519ph.

It is more widely available, and has a higher probability of having APIs that allow signing a hash. By the way, signing or verifying a hash with plain EdDSA is possible (the message just needs to be prefixed with r followed by the public key), but crypto libraries don't implement it.

Still, relying on the fact that signature systems prehash the data in a specific way doesn't sound ideal. Besides API issues, there are no guarantees that signature schemes will support it. If I recall correctly, post-quantum signature schemes, including Falcon and Dilithium are incompatible with that (there were discussions on the CFRG list about it, but I don't think they went anywhere).

I'm not familiar enough with Sigstore, but if it requires signing a message but verifying a hash of that message, that looks like a design issue. It would be better for Sigstore to hash the input itself (regardless of what the signature system does internally) and feed that as the message to the signature function.

hslatman commented 8 months ago

Regarding prehashing, adding support for ECDSA to the spec (or replacing Ed25519 with it) may be better than Ed25519ph.

Yes, after my previous messages I started thinking about going this route instead with cosign. Their defaults (for keyless) are ECDSA with P-256. So I guess it could be an option to either 1) add one of those to the standard algorithms (and maybe mark them as not required?), or 2) just give them an identifier in some specific range for custom implementations (or something like that). I guess for standardization, an assigned identifier makes more sense. And the identifier would then have to indicate the key type as well as the curve, I presume?

It is more widely available, and has a higher probability of having APIs that allow signing a hash. By the way, signing or verifying a hash with plain EdDSA is possible (the message just needs to be prefixed with r followed by the public key), but crypto libraries don't implement it.

Still, relying on the fact that signature systems prehash the data in a specific way doesn't sound ideal. Besides API issues, there are no guarantees that signature schemes will support it. If I recall correctly, post-quantum signature schemes, including Falcon and Dilithium are incompatible with that (there were discussions on the CFRG list about it, but I don't think they went anywhere).

I'm not familiar enough with Sigstore, but if it requires signing a message but verifying a hash of that message, that looks like a design issue. It would be better for Sigstore to hash the input itself (regardless of what the signature system does internally) and feed that as the message to the signature function.

I haven't looked into all bits of Sigstore in detail, but if I understand correctly, at verification time you usually have access to the original binary data (you're verifying a binary or Docker image just before using it, for example; similar to what you'd do for Wasm modules). The part that doesn't support the Ed25519 signature, is Rekor, which only gets the hashed value, and verifies that before adding the signature to the ledger. I'd say Rekor is playing a support role here; it can be omitted from the picture and you'll still be able to verify, but you don't get the benefits of Rekor in that case.

jedisct1 commented 8 months ago

The wasmsign2 implementation was updated to match the specification regarding the signatures, thanks a lot for reporting this!

jedisct1 commented 8 months ago

I have to look at cosign and rekor more closely in order to understand what happens where.

So, Rekor gets you the hash and the signature. In cosign, instead of signing the message, why not sign a hash of the message, then? That would be easy to verify with any Ed25519 implementation.

Rekor supports Minisign, that also uses Ed25519. What's so different here?

hslatman commented 8 months ago

The big difference is making this compatible with cosign keyless mode. And keyless for signing blobs.

Keyless mode uses a temporary key pair (Ed25519 in my hack; but by default it's EC P-256) and requests a certificate from Fulcio. Fulcio authenticates certificate requests using OIDC, and encodes some identifying information into the certificate. The private key is then used to create a signature, and that is recorded to Rekor (incl. the cert, and a timestamp), after which the key is deleted (so, not keyless-keyless, but the nice thing is that the key doesn't need to be kept around). To verify, the verifier obtains the required data from Rekor (haven't tested this part with Ed25519 yet).

EDIT (after thinking about this some more): for verification, the cert + signature are sufficient (e.g. --bundle cosign.bundle), but that doesn't provide the timestamp nor the Rekor inclusion proof (I think?). So to make this work fully, the cert + signature still have to be made available somewhere. The signature is already part of the Wasm signature section; the certificate is not. So to make this scheme work, I guess the cert would have to be added too. Or maybe it can be retrieved from Rekor too? Makes this a slightly bigger thing than maybe needs to be standardized, but who knows 😅.

Note that the Rekor part isn't strictly required, and can be skipped using --tlog-upload=false. With that flag specified, my fork of cosign already works, and spits out an Ed25519 public key cert and signature, and the values seem to check out.

You're right that there are many parts in the Sigstore stack that already support Ed25519 keys, and that it's interoperable with quite a number of key formats and encryption tools, incl. minisign. Of special note, I think, is this PR. But not all these key formats and tools are (currently) part of the keyless flow that I'm trying to use, and for which https://github.com/sigstore/rekor/issues/1325 needs to be addressed.

Ed25519ph looked close enough to the current signing scheme in wasmsign to be a viable route, so that's why I tried hacking that together first. But I'm not opposed to using EC P-256. I think it would still be nice if some identifier were registered for the EC key type to promote interoperability, but I guess it can also work without one (with some more implementation/testing/deployment hurdles).

The wasmsign2 implementation was updated to match the specification regarding the signatures, thanks a lot for reporting this!

Great; thank you! I'll update my Go version 😄

hslatman commented 8 months ago

The wasmsign2 implementation was updated to match the specification regarding the signatures, thanks a lot for reporting this!

The additional byte now comes right after the key ID, instead of within the X number of bytes.

    0 (key_id_len - no key ID)
    0x01 (Ed25519)
    64 (signature_len)
    <64 bytes> (Ed25519(k, hashes))

As opposed to

    0 (key_id_len - no key ID)
    65 (signature_len)
    <65 bytes> (0x01 ‖ Ed25519(k, hashes))

Both options work, and I don't see a strong reason to pick one over the other, as I think both will get the job done. It does need a slight update to the proposal text/schematic, though.
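To make the difference concrete, here is a small Python sketch that encodes one signature entry both ways. It assumes single-byte length fields (fine for values below 128) and uses 0x01 as the Ed25519 identifier as in the example; the function names and the all-zero stand-in signature are made up for illustration.

```python
ED25519_ALG_ID = 0x01  # identifier used for Ed25519 in the example above

def encode_alg_before_length(key_id: bytes, raw_sig: bytes) -> bytes:
    """key_id_len, key_id, algorithm ID, signature_len, raw signature."""
    return (bytes([len(key_id)]) + key_id
            + bytes([ED25519_ALG_ID, len(raw_sig)]) + raw_sig)

def encode_alg_inside_signature(key_id: bytes, raw_sig: bytes) -> bytes:
    """key_id_len, key_id, signature_len, (algorithm ID followed by raw signature)."""
    prefixed = bytes([ED25519_ALG_ID]) + raw_sig
    return bytes([len(key_id)]) + key_id + bytes([len(prefixed)]) + prefixed

raw_sig = bytes(64)  # stand-in for a real 64-byte Ed25519 signature
a = encode_alg_before_length(b"", raw_sig)
b = encode_alg_inside_signature(b"", raw_sig)
print(a[:3].hex(), b[:3].hex(), len(a), len(b))  # 000140 004101 67 67
```

Both encodings take the same 67 bytes for an entry without a key ID; only the order of the algorithm byte and the length byte differs, so a parser written for one layout will misread the other.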

I hacked together a flow, incl. Rekor using ECDSA P-256. I'll use the algorithm identifier to denote a cosign signature scheme. It'll have a certain length, depending on the subtype and the elements included in the signature. That'll include at least the signature and the certificate. I might add the tlog entry ID too, as well as a signed timestamp.

jedisct1 commented 8 months ago

Awwww crap.

That being said, it shows that having the algorithm ID before the signature is more intuitive from an implementation perspective. And it actually makes more sense. The signature is the original signature, not something prefixed.

So, maybe we should update the spec.

hslatman commented 8 months ago

Awwww crap.

That being said, it shows that having the algorithm ID before the signature is more intuitive from an implementation perspective. And it actually makes more sense. The signature is the original signature, not something prefixed.

So, maybe we should update the spec.

Yes, the type before the signature itself totally makes sense 🙂