Election-Tech-Initiative / electionguard-python

A python module implementing the ElectionGuard specification. This implementation can be used to conduct End-to-End Verifiable Elections as well as privacy-enhanced risk-limiting audits.
https://www.electionguard.vote/
MIT License
161 stars 96 forks source link

🐞 HashedElGamal is not compliant with the spec #646

Open danwallach opened 2 years ago

danwallach commented 2 years ago

Is there an existing issue for this?

Current Behavior

The code in hmac.py (in particular, the get_hmac() function) is being used as part of HashedElGamalCiphertext for two very different purposes.

For the first purpose, computing an HMAC, it seems to be correct. Example, where the spec says to use HMAC (e.g., $c_2 = HMAC(k_0, c_0 | c_1)$), the code says:

mac = get_hmac(mac_key, to_mac)

This is fine.

The ElectionGuard spec notes that the actual encryption of the message ($c_1 = m_1 \oplus k_1 | m_2 \oplus k2 | \cdots | m{bm} \oplus k{b_m}$) is supposed to use a NIST 800-108-compliant key derivation function (KDF). The Python code is not compliant with the NIST spec. In particular, the code that creates those $k_i$ values does this:

        data_key = get_hmac(
            session_key.to_hex_bytes(),
            encryption_seed.to_hex_bytes(),
            bit_length,
            (i + 1),
        )

So, what's going on inside get_hmac? Its input to the HMAC function is ultimately the byte concatenation of three values: start_byte + msg + end_byte where start_byte is the index $i$, and end_byte is the message length. The NIST spec says:

K(i) := PRF (KI, [i] || Label || 0x00 || Context || [L])

Suggested fixes below.

Expected Behavior

It's confusing to use get_hmac to serve two radically different purposes. First and foremost, it would be good to have a KDF class that implements the NIST spec correctly and then use it to compute the $k_i$'s. Anyway, even if you don't do that, there are a few things that you do need to do:

Steps To Reproduce

No response

Environment

No response

Anything else?

No response

danwallach commented 2 years ago

FWIW, in both our Kotlin and TypeScript code, we've implemented a UInt256 class, as distinct from arbitrary bigints or ElementModQ. This turns out to make everything involving hash and MAC computation much more straightforward. We then have helper methods to convert from UInt256 to ElementModQ and back again when necessary.

rc-ms commented 2 years ago

Thanks @danwallach this is helpful. We've discussed and will certainly adopt these or variations for 2.0. We will unlikely be able to accommodate them for 1.0.