AntonKueltz / fastecdsa

Python library for fast elliptic curve crypto
https://pypi.python.org/pypi/fastecdsa
The Unlicense
263 stars 76 forks source link

How to replicate signatures from other implementations / verify correctness? #46

Closed thejohnfreeman closed 4 years ago

thejohnfreeman commented 4 years ago

I'm looking for a Python library for ECDSA. I need to match the output of projects in other languages using different libraries for ECDSA. I took some known (key, message, signature) "test vectors" from this ECDSA library in Go and was able to replicate them (or at least the first and fourth) with elliptic in JavaScript, to verify that they work the same. However, I can't seem to do the same with fastecdsa.

How would you write a function that, given two 32-byte bytes values for both the key and the message digest, returns a DER-encoded bytes signature? I tried this (paraphrased):

from fastecdsa import curve, ecdsa

def sign(private_key_bytes: bytes, message_digest_bytes: bytes) -> bytes:
    private_key = int.from_bytes(private_key_bytes, byteorder='big')
    r, s = ecdsa.sign(
        message_digest_bytes.hex(), private_key, curve=curve.secp256k1, prehashed=True
    )
    r_bytes = r.to_bytes(32, byteorder='big')
    s_bytes = s.to_bytes(32, byteorder='big')
    # Hacky hard-coded DER encoding that works for the two cases I'm testing.
    return b'\x30\x44' + b'\x02\x20' + r_bytes + b'\x02\x20' + s_bytes

You can see it failing on the two test vectors I copied from the Go project.

AntonKueltz commented 4 years ago

Have you tried using the fastecdsa.encoding.der.DEREncoder class, specifically the encode_signature(r, s) method? The resulting byte encoding should conform with RFC2459.

I'm also not sure message_hash_bytes.hex() is correct, I guess it depends on if the go test vector you're using is a hex encoding of bytes (this would be my guess) or if it's a utf-8 encoded string that happens to only be characters that are also valid hex values. If it's the former you could do binascii.unhexlify(hexstring) to get the raw bytes.

thejohnfreeman commented 4 years ago

Good point on the encoder. I missed that, but added it now. Now my sign function looks like this:

def sign(private_key_bytes: bytes, message_hash_bytes: bytes) -> bytes:
    private_key = int.from_bytes(private_key_bytes, byteorder='big')
    r, s = ecdsa.sign(
        message_hash_bytes.hex(),
        private_key,
        curve=curve.secp256k1,
        prehashed=True,
    )
    return DEREncoder.encode_signature(r, s)

The reason for the call to .hex() is because I peeked behind the curtain to see how the prehashed parameter is handled. When False, the sign method passes a .hexdigest() to the C FFI, so I figured I should match that. If I try to pass the bytes directly, I get a type error:

msg = b'\x9eWU\xec/2\x8c\xc8cZUA]\x0e\x9a\t\xc2\xb6\xf2\xc9\xb04<\x94_\xbb\xfe\x08$zL\xbe'
d = 224606400986705160...1487958639159536557, curve = secp256k1
hashfunc = <built-in function openssl_sha256>, prehashed = True

    def sign(msg: MsgTypes, d: int, curve: Curve = P256, hashfunc=sha256, prehashed: bool = False):
        """Sign a message using the elliptic curve digital signature algorithm.

        The elliptic curve signature algorithm is described in full in FIPS 186-4 Section 6. Please
        refer to http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf for more information.

        Args:
            |  msg (str|bytes|bytearray): A message to be signed.
            |  d (int): The ECDSA private key of the signer.
            |  curve (fastecdsa.curve.Curve): The curve to be used to sign the message.
            |  hashfunc (_hashlib.HASH): The hash function used to compress the message.
        """
        # generate a deterministic nonce per RFC6979
        rfc6979 = RFC6979(msg, d, curve.q, hashfunc)
        k = rfc6979.gen_nonce()

        # add a prehash option
        if not prehashed:
            hashed = hashfunc(msg_bytes(msg)).hexdigest()
        else:
            hashed = msg

        r, s = _ecdsa.sign(
            hashed,
            str(d),
            str(k),
            str(curve.p),
            str(curve.a),
            str(curve.b),
            str(curve.q),
            str(curve.gx),
>           str(curve.gy)
        )

E       TypeError: argument 1 must be str, not bytes
../../../virtualenv/python3.7-dev/lib/python3.7/site-packages/fastecdsa/ecdsa.py:48: TypeError
AntonKueltz commented 4 years ago

This looks like a legitimate bug with bytes being passed as pre hashed messages. I’ll need to take a look at the C bindings, they should really be operating on bytes and not strings. Will try to work the issue this weekend.

AntonKueltz commented 4 years ago

I fixed the pre-hashing issue but there's still an issue with the DER encoder generating bad ASN.1 so I'll keep this open. I'll have to look at the implementation, I think that code was from a PR. The tests for the implementation pass but they may not actually be valid, there's some odd corner cases around the most significant bit of the most significant byte being 1 which seem to be trying to handle signed integers (but we only work with unsigned integers since all negative numbers have an equivalent positive number in the field we're working in)

On a side note, I strongly dislike exposing pre-hashing in the interface because it confuses the interface massively. What I mean is it's not clear what the function expects for a pre-hashed message... is it bytes? a string? hex encoded (like most hash digests are commonly shown)? an integer? It's not clear because ECDSA is not defined for pre-hashed messages. I may actually end up removing support for this because it's confusing and doesn't have many legitimate uses (just sign the message and indicate the hash algorithm).

thejohnfreeman commented 4 years ago

From what I've read, DER encodes all integers as signed integers; it does not distinguish. So if one wants to encode an unsigned integer with a most significant 1 bit, one must prefix it with a zero byte, extending its length by 1.

I'm fine if the prehashed option is removed, as long as the hashfunc is overridable. In some places, I use a custom IdentityHash that returns the "message" unchanged, so I can still pass a digest to the signing function.

AntonKueltz commented 4 years ago

Sorry for the late reply. You're correct, unsigned integers need a leading 0 bit. I'll look into making sure the DER encoding is actually conforming to test vectors. These should be part of the test suite to begin with, seems like the overall encoding tested against known values got missed as a case when initially adding DER encoding.

Overriding hashfunc could be tricky since it is also used by the nonce generation. At a minimum an identity hash function would need to implement the same interface as all the other hash functions in hashlib so as to not have issues with undefined methods being called.

thejohnfreeman commented 4 years ago

Indeed. You can see my IdentityHash here.

AntonKueltz commented 4 years ago

I probably should have checked this a lot earlier - The root of the issue is that the raw r and s values of the signatures differ between the libraries. So the mismatch is occurring pre encoding. Do you know if the libraries you're using are using RFC6979 to generate nonces and using sha256 as the hash for the HMAC in that algorithm?

thejohnfreeman commented 4 years ago

The libraries I've been testing are fastecdsa, ecdsa, and cryptography (and I have eyes on starkbank-ecdsa and ecpy). You can see the uniform interface I built for each as test fixtures, and the tests using those fixtures.

ecdsa says it uses RFC6979 for the method sign_deterministic, which is the method I call. Reading the source, it uses the same hash function for both the message digest and the nonce, but I need to use a custom hash for the digest (first half of SHA-512). I guess I'll have to call sign_digest_deterministic with a digest I compute for myself, passing SHA-256 for the nonce hash to match fastecdsa. Is it possible for fastecdsa to use the same hash for both? I don't believe RFC6979 mandates SHA-256, though its example does match the behavior of ecdsa:

HMAC [RFC2104] is a construction of a Message Authentication Code using a hash function and a secret key. Here, we use HMAC with the same hash function H as the one used to process the input message prior to signature generation or verification.

cryptography doesn't mention RFC6979 or determinism at all.

thejohnfreeman commented 4 years ago

I added a test after changing my call to sign_digest_deterministic with SHA-256, and it still fails to match fastecdsa.

AntonKueltz commented 4 years ago

Still looking into this, haven’t had too much success root causing.

AntonKueltz commented 4 years ago

Finally figured this one out, issue is with prehashed values. RFC6979 indicates that the message passed to it should be hashed, but prehashed ECDSA interprets this as meaning only if the message is prehashed, so the step isn't applied for prehashed signs.

This ended up breaking everything since k is generated "incorrectly", and everything afterwards is wrong as a result.

AntonKueltz commented 4 years ago

Fixed in 033ab59f9b5672015a2c59b8c58c5ff4299964ec. Closing this out, let me know if you continue to have issues.

thejohnfreeman commented 4 years ago

I install fastecdsa through PyPI; will you be publishing a new version with that change?

AntonKueltz commented 4 years ago

It's going to be a bit until the next release (I don't have access to the machine with my signing keys for the time being). But yes, this will be published to pypi in the next release.

AntonKueltz commented 4 years ago

@thejohnfreeman v2.1.1 is now available via pypi with this fix. Sorry it took a while to get this diagnosed and implemented,.

thejohnfreeman commented 4 years ago

I can confirm v2.1.2 let me replicate signatures. Thank you!