pyca / cryptography

cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
https://cryptography.io

Streaming API for XOFs #9185

Open thomwiggers opened 1 year ago

thomwiggers commented 1 year ago

The SHAKE family of extendable-output functions (XOFs) is sometimes used as, e.g., a deterministic random number generator in the following pattern (with functions named per the sponge nature of Keccak):

# pseudocode
xof = xof.new()
xof.absorb(bytes)
xof.absorb(bytes)
xof.finalize()  # absorb should fail now
ten_bytes_of_output = xof.squeeze(10)
another_1000_bytes = xof.squeeze(1000)

(finalize may be implicit in the first squeeze. Note that you usually can't absorb, squeeze, and absorb again without keeping the pre-finalize state.)
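To make the intended semantics concrete, here is a minimal Python sketch of the state machine the pseudocode describes. The class name `SpongeXof` and its method names are hypothetical and are not an API of hashlib or cryptography. It only emulates squeezing by recomputing the output prefix with `hashlib.shake_256` on every call, so it illustrates the behaviour rather than an efficient implementation.

```python
import hashlib


class SpongeXof:
    """Hypothetical absorb/finalize/squeeze wrapper around hashlib.shake_256."""

    def __init__(self) -> None:
        self._hash = hashlib.shake_256()
        self._finalized = False
        self._squeezed = 0  # number of output bytes handed out so far

    def absorb(self, data: bytes) -> None:
        if self._finalized:
            raise ValueError("cannot absorb after finalize")
        self._hash.update(data)

    def finalize(self) -> None:
        self._finalized = True

    def squeeze(self, n: int) -> bytes:
        # Finalize implicitly on the first squeeze.
        self._finalized = True
        # Emulation: recompute the whole prefix and return only the new tail.
        out = self._hash.digest(self._squeezed + n)[self._squeezed:]
        self._squeezed += n
        return out


xof = SpongeXof()
xof.absorb(b"seed")
xof.absorb(b"more seed")
xof.finalize()  # absorb raises ValueError from here on
ten_bytes_of_output = xof.squeeze(10)
another_1000_bytes = xof.squeeze(1000)
```

The two squeezed chunks concatenate to the same bytes a single 1010-byte request would produce, which is the property a real streaming API would need to preserve.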

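The current shake256 APIs supported by both Python's own hashlib and by cryptography return the same bytes every time you call .digest(len): each call starts again from the beginning of the output, so there is no way to continue squeezing where the previous call left off.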
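The hashlib side of this can be checked with a few lines of standard-library Python. The input `b"seed"` and the lengths are arbitrary values chosen for illustration.

```python
import hashlib

h = hashlib.shake_256(b"seed")

first = h.digest(10)
second = h.digest(10)
longer = h.digest(20)

# Repeated calls do not advance through the output stream.
assert first == second
# A longer request starts again from byte 0 of the output.
assert longer[:10] == first
```

Getting bytes 10 to 19 therefore means asking for all 20 bytes and discarding the first 10.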

References:

N.b. This relates somewhat to #2358, but that one seems more encryption-focused.

reaperhulk commented 1 year ago

Our APIs are limited to what OpenSSL is capable of right now and, as you found in openssl/openssl#7921, OpenSSL can't currently do repeated squeezing. The most recent traffic on that PR suggests they want to change it, but it doesn't look like there's been much traction.

(I am broadly supportive of adding this as soon as OpenSSL allows it or if we can find some other mechanism that isn't ruinous for performance)

h-vetinari commented 11 months ago

"as you found in openssl/openssl#7921, OpenSSL can't currently do repeated squeezing"

The replacement PR for that got merged for OpenSSL 3.3 about a week ago.

DavidBuchanan314 commented 10 months ago

I just thought I'd share my own wrapper class for anyone else trying to work around this until there's a proper solution. I think mine is marginally more efficient than the dilithium-py wrapper linked above.

class ShakeStream:
    def __init__(self, digestfn) -> None:
        # digestfn is anything we can call repeatedly with different lengths
        self.digest = digestfn
        self.buf = self.digest(32) # arbitrary starting length
        self.offset = 0

    def read(self, n: int) -> bytes:
        # double the buffer size until we have enough
        while self.offset + n > len(self.buf):
            self.buf = self.digest(len(self.buf) * 2)
        res = self.buf[self.offset:self.offset + n]
        self.offset += n
        return res

if __name__ == "__main__":
    from hashlib import shake_128

    a = ShakeStream(shake_128(b"hello").digest)
    foo = a.read(17) + a.read(5) + a.read(57) + a.read(1432) + a.read(48)
    bar = shake_128(b"hello").digest(17 + 5 + 57 + 1432 + 48)
    assert foo == bar

reaperhulk commented 9 months ago

Just to put the information in this thread: "The next feature release after OpenSSL 3.2 will be OpenSSL 3.3, which will be released no later than 30 April 2024" (https://www.openssl.org/blog/blog/2023/11/23/OpenSSL32/index.html), so we can look a bit more closely at implementing this support soon-ish.