stef / pysodium

wrapper for libsodium providing high level crypto primitives
Other
130 stars 50 forks source link

crypto_secretstream* functions? #70

Closed robehickman closed 7 years ago

robehickman commented 7 years ago

Do you intend to support the crypto_secretstream* functions, or is there another straightforward way to encrypt streams and large files without having the whole thing in memory?

stef commented 7 years ago

sure would be a nice addition. patches are very welcome.

robehickman commented 7 years ago

I'm willing to look at this later as long as you aren't aware of anyone working on it already.

stef commented 7 years ago

no one informed me of such plans so far.

robehickman commented 7 years ago

I'm new to ctypes, What is the correct way of handling a struct?

typedef struct crypto_secretstream_xchacha20poly1305_state { unsigned char k[crypto_stream_chacha20_ietf_KEYBYTES]; unsigned char nonce[crypto_stream_chacha20_ietf_NONCEBYTES]; unsigned char _pad[8]; } crypto_secretstream_xchacha20poly1305_state;

jedisct1 commented 7 years ago

Maybe something similar to what was done for other functions using a state, such as that one:

def crypto_generichash_init(outlen=crypto_generichash_BYTES, k=b''):
    state = ctypes.create_string_buffer(sodium.crypto_generichash_statebytes())
    __check(sodium.crypto_generichash_init(ctypes.byref(state), k, ctypes.c_size_t(len(k)), ctypes.c_size_t(outlen)))
    return state

So, don't worry about the content of the structure, and just allocate a string buffer of the required size.

jedisct1 commented 7 years ago

And for retrieving the tag in secretstream_pull, it's probably something like

tag = ctypes.c_uchar()
__check(sodium...(... ctypes.byref(tag), ...))
jedisct1 commented 7 years ago

(I don't know much about ctypes either, though. Just browsing through similar functions in pysodium's source code)

robehickman commented 7 years ago

Thanks, another ways seems to be like this, which is used for one of the structs in the existing code. I don't know if there is any benefit of doing this though.

class crypto_secretstream_xchacha20poly1305_state(ctypes.Structure):
    _pack_ = 1
    _fields_ = [
        ('k', ctypes.c_ubyte * crypto_stream_chacha20_ietf_KEYBYTES),
        ('nonce', ctypes.c_ubyte * crypto_stream_chacha20_ietf_NONCEBYTES),
        ('_pad', ctypes.c_ubyte * 8)]
stef commented 7 years ago

what jedisct1 says is correct, just use an opaque buffer, no need to fiddle with elaborate structs.

jedisct1 commented 7 years ago

I would recommend treating these structures as opaque. Their actual content may change or be shuffled around, as long as the total size doesn't change. Applications and bindings generally don't need to access the fields directly.

robehickman commented 7 years ago

Good point.

robehickman commented 7 years ago

When returning a key, some of the existing functions return the ctypes object while others return *.raw, what is correct in this instance?

# void crypto_secretstream_xchacha20poly1305_keygen (unsigned char k[crypto_secretstream_xchacha20poly1305_KEYBYTES])
@sodium_version(1, 0, 14)
def crypto_secretstream_xchacha20poly1305_keygen():
    key_result = ctypes.create_string_buffer(crypto_secretstream_xchacha20poly1305_KEYBYTES)
    sodium.crypto_secretstream_xchacha20poly1305_keygen(ctypes.byref(key_result))
    return key_result
stef commented 7 years ago

in this case it seems there is no error handling necessary. i'd return key_result.raw

jedisct1 commented 7 years ago

It key useful, or used at all here?

stef commented 7 years ago

hm good point jedisct1. i don't see a need for key as param

robehickman commented 7 years ago

Just noticed that, I haven't used C for about 7 years and forgot about the 'return via pointer passed as a parameter' design pattern. I did edit my comment above when I noticed.

robehickman commented 7 years ago

I am unsure as to weather cypertext needs to be passed by reference. In the existing function "crypto_aead_chacha20poly1305_encrypt" cyphertext is passed by value, but clen is passed by reference. I would expect it to be a referance as it is a pointer to buffer to be filled by libsodium.

@sodium_version(1, 0, 15)
def crypto_secretstream_xchacha20poly1305_push(state, message, ad, tag):
    mlen = ctypes.c_ulonglong(len(message))
    adlen = ctypes.c_ulonglong(len(ad)) if ad is not None else ctypes.c_ulonglong(0)
    cyphertext_len = mlen.value + crypto_secretstream_xchacha20poly1305_ABYTES
    cypertext = ctypes.create_string_buffer(cyphertext_len)

    __check(sodium.crypto_secretstream_xchacha20poly1305_push(
                                                                ctypes.byref(state)     #  crypto_secretstream_xchacha20poly1305_state *state,
                                                                ctypes.byref(cypertext) #  unsigned char *out
                                                                cyphertext_len,         #  unsigned long long *outlen_p,
                                                                message,                #  const unsigned char *m,
                                                                mlen,                   #  unsigned long long mlen,
                                                                ad,                     #  const unsigned char *ad,
                                                                adlen,                  #  unsigned long long adlen,
                                                                tag)                    #  unsigned char tag)

    return cypertext.raw
jedisct1 commented 7 years ago

Look at existing, similar functions, such as crypto_aead_chacha20poly1305_encrypt.

The ciphertext is named c and since it's already a pointer (that's more or less what ctypes.create_string_buffer() returns), there's no need for an extra .byref.

Try keeping the same naming conventions (c for the ciphertext) or else, it may work, but @stef will be grumpy.

jedisct1 commented 7 years ago

(or not... I don't know what makes @stef grumpy. But if it were in a project of mine, that would make me grumpy).

stef commented 7 years ago

yeah, no byref needed for the cyphertext, but as jedisct1 says, i'd prefer to stick to the naming convention of libsodium, if jedisct1 says something there is called c it should also be so in pysodium, so if someone looks at the code there's immediate associations between py and the c original.

stef commented 7 years ago

but clen is passed by reference.

is because libsodium calculates and returns the final size in this parameter, thus byref in that case, even if this calculation is kinda trivial.

robehickman commented 7 years ago

That's just me working out what all of the parameters are for, as there are quite a lot of them. I'll change the naming and formatting later.

stef commented 7 years ago

btw unsigned long long *outlen_p, being a pointer seems to suggest that this is indeed an output parameter where the final size is returned.

robehickman commented 7 years ago

Do the *pull functions need to check the length of the inputs to prevent buffer overflow?

Edit: This comment was based on a flawed understanding of what the additional data was.

robehickman commented 7 years ago

Also I am unsure of the function of the 'additional data'. This appears to be arbitrary additional data which is encrypted and stored in the cyphertext, which can then be extracted from it again during decryption. If this is how it works then during encryption the length of the target buffer would need to account for this additional data. The same length would have to be allocated during decryption which I'd guess sodium would fill, as it does the message buffer.

edit:

This suggests that the 'additional data' isn't included in the cyphertext, but is added to the main data before it is signed, and the same data needs to be provided for decryption.

https://crypto.stackexchange.com/questions/22806/additional-data-in-aead-chacha20-poly1305-libsodium

robehickman commented 7 years ago

Here are my changes so far:

diff --git a/pysodium/__init__.py b/pysodium/__init__.py
index 75e38d9..b0d3bce 100755
--- a/pysodium/__init__.py
+++ b/pysodium/__init__.py
@@ -138,6 +138,7 @@ crypto_hash_sha512_BYTES = sodium.crypto_hash_sha512_bytes()
 crypto_aead_chacha20poly1305_KEYBYTES = sodium.crypto_aead_chacha20poly1305_keybytes()
 crypto_aead_chacha20poly1305_NONCEBYTES = sodium.crypto_aead_chacha20poly1305_npubbytes()
 crypto_aead_chacha20poly1305_ABYTES = sodium.crypto_aead_chacha20poly1305_abytes()
+
 if sodium_version_check(1, 0, 9):
     crypto_aead_chacha20poly1305_ietf_KEYBYTES = sodium.crypto_aead_chacha20poly1305_ietf_keybytes()
     crypto_aead_chacha20poly1305_ietf_NONCEBYTES = sodium.crypto_aead_chacha20poly1305_ietf_npubbytes()
@@ -152,11 +153,24 @@ if sodium_version_check(1, 0, 9):
     crypto_pwhash_ALG_DEFAULT = sodium.crypto_pwhash_alg_default()
 else:
     crypto_pwhash_ALG_DEFAULT = None
+
 if sodium_version_check(1, 0, 12):
     crypto_kx_PUBLICKEYBYTES = sodium.crypto_kx_publickeybytes()
     crypto_kx_SECRETKEYBYTES = sodium.crypto_kx_secretkeybytes()
     crypto_kx_SESSIONKEYBYTES = sodium.crypto_kx_sessionkeybytes()

+if sodium_version_check(1, 0, 15):
+    crypto_secretstream_xchacha20poly1305_STATEBYTES = sodium.crypto_secretstream_xchacha20poly1305_statebytes()
+    crypto_secretstream_xchacha20poly1305_ABYTES = crypto_secretstream_xchacha20poly1305_abytes()
+    crypto_secretstream_xchacha20poly1305_HEADERBYTES = crypto_secretstream_xchacha20poly1305_headerbytes()
+    crypto_secretstream_xchacha20poly1305_KEYBYTES = sodium.crypto_secretstream_xchacha20poly1305_keybytes()
+    crypto_secretstream_xchacha20poly1305_MESSAGEBYTES_MAX = crypto_secretstream_xchacha20poly1305_messagebytes_max()
+    crypto_secretstream_xchacha20poly1305_TAG_MESSAGE = crypto_secretstream_xchacha20poly1305_tag_message()
+    crypto_secretstream_xchacha20poly1305_TAG_PUSH = crypto_secretstream_xchacha20poly1305_tag_push()
+    crypto_secretstream_xchacha20poly1305_TAG_REKEY = crypto_secretstream_xchacha20poly1305_tag_rekey()
+    crypto_secretstream_xchacha20poly1305_TAG_FINAL = crypto_secretstream_xchacha20poly1305_tag_final()
+
+
 class CryptoSignState(ctypes.Structure):
     _pack_ = 1
     _fields_ = [
@@ -465,6 +479,124 @@ def crypto_box_open_detached(c, mac, nonce, pk, sk):
     __check(sodium.crypto_box_open_detached(msg, c, mac, ctypes.c_ulonglong(len(c)), nonce, pk, sk))
     return msg.raw.decode()

+
+
+
+
+
+
+################################
+
+
+# void crypto_secretstream_xchacha20poly1305_keygen (unsigned char k[crypto_secretstream_xchacha20poly1305_KEYBYTES])
+@sodium_version(1, 0, 15)
+def crypto_secretstream_xchacha20poly1305_keygen():
+    key_result = ctypes.create_string_buffer(crypto_secretstream_xchacha20poly1305_KEYBYTES)
+    sodium.crypto_secretstream_xchacha20poly1305_keygen(ctypes.byref(key_result))
+    return key_result
+
+
+# int crypto_secretstream_xchacha20poly1305_init_push(crypto_secretstream_xchacha20poly1305_state *state,
+#                                                     unsigned char out[crypto_secretstream_xchacha20poly1305_HEADERBYTES],
+#                                                     const unsigned char k[crypto_secretstream_xchacha20poly1305_KEYBYTES])
+@sodium_version(1, 0, 15)
+def crypto_secretstream_xchacha20poly1305_init_push(key):
+    if key == None):
+        raise ValueError("invalid parameters")
+
+    state  = ctypes.create_string_buffer(crypto_secretstream_xchacha20poly1305_STATEBYTES)
+    header = ctypes.create_string_buffer(crypto_secretstream_xchacha20poly1305_HEADERBYTES)
+    __check(sodium.crypto_secretstream_xchacha20poly1305_init_push(ctypes.byref(state), ctypes.byref(header), ctypes.byref(key)))
+    return state
+
+# int crypto_secretstream_xchacha20poly1305_init_pull(crypto_secretstream_xchacha20poly1305_state *state,
+#                                                     const unsigned char in[crypto_secretstream_xchacha20poly1305_HEADERBYTES],
+#                                                     const unsigned char k[crypto_secretstream_xchacha20poly1305_KEYBYTES])
+@sodium_version(1, 0, 15)
+def crypto_secretstream_xchacha20poly1305_init_pull(header, key):
+    if None in (header, key):
+        raise ValueError("invalid parameters")
+    state  = ctypes.create_string_buffer(crypto_secretstream_xchacha20poly1305_STATEBYTES)
+    __check(sodium.crypto_secretstream_xchacha20poly1305_init_pull(ctypes.byref(state), header, key)
+    return state
+
+#void crypto_secretstream_xchacha20poly1305_rekey (crypto_secretstream_xchacha20poly1305_state *state)
+@sodium_version(1, 0, 15)
+def crypto_secretstream_xchacha20poly1305_rekey(state):
+    if state == None):
+        raise ValueError("invalid parameters")
+    sodium.crypto_secretstream_xchacha20poly1305_rekey(ctypes.byref(state))
+
+#int crypto_secretstream_xchacha20poly1305_push (crypto_secretstream_xchacha20poly1305_state *state,
+#                                                unsigned char *out,
+#                                                unsigned long long *outlen_p,
+#                                                const unsigned char *m,
+#                                                unsigned long long mlen,
+#                                                const unsigned char *ad,
+#                                                unsigned long long adlen,
+#                                                unsigned char tag)
+
+@sodium_version(1, 0, 15)
+def crypto_secretstream_xchacha20poly1305_push(state, message, ad, tag):
+    if None in (state, message, ad, tag):
+        raise ValueError("invalid parameters")
+    mlen = ctypes.c_ulonglong(len(message))
+    adlen = ctypes.c_ulonglong(len(ad)) if ad is not None else ctypes.c_ulonglong(0)
+    clen = mlen.value + crypto_secretstream_xchacha20poly1305_ABYTES
+    c = ctypes.create_string_buffer(clen)
+
+    __check(sodium.crypto_secretstream_xchacha20poly1305_push(
+                                                                ctypes.byref(state)           #  crypto_secretstream_xchacha20poly1305_state *state,
+                                                                c,                            #  unsigned char *out
+                                                                ctypes.byref(clen), #  unsigned long long *outlen_p,
+                                                                message,                #  const unsigned char *m,
+                                                                mlen,                   #  unsigned long long mlen,
+                                                                ad,                     #  const unsigned char *ad,
+                                                                adlen,                  #  unsigned long long adlen,
+                                                                tag)                    #  unsigned char tag)
+    return c.raw
+
+
+#crypto_secretstream_xchacha20poly1305_pull (crypto_secretstream_xchacha20poly1305_state *state,
+#                                            unsigned char *m,
+#                                            unsigned long long *mlen_p,
+#                                            unsigned char *tag_p,
+#                                            const unsigned char *in,
+#                                            unsigned long long inlen,
+#                                            const unsigned char *ad,
+#                                            unsigned long long adlen)
+@sodium_version(1, 0, 15)
+def crypto_secretstream_xchacha20poly1305_pull(state, ciphertext):
+    if None in (state, ciphertext):
+        raise ValueError("invalid parameters")
+
+    m = ctypes.create_string_buffer(len(ciphertext) - crypto_secretstream_xchacha20poly1305_ABYTES)
+    mlen = ctypes.c_ulonglong(0)
+    tag  = ctypes.c_ubyte(0)
+    clen = ctypes.c_ulonglong(len(ciphertext))
+    ad = 0
+    adlen = 0
+
+    __check(sodium.crypto_secretstream_xchacha20poly1305_pull(
+                                                                ctypes.byref(state), 
+                                                                m,          # char *m,
+                                                                mlen,       # long long *mlen_p,
+                                                                tag,        # char *tag_p,
+                                                                ciphertext, # unsigned char *in,
+                                                                clen,       # long long inlen,
+                                                                ad,         # unsigned char *ad,
+                                                                adlen       # long long adlen)
+                                                                )
+    return m.raw
+
+
+
+
+
+################################
+
+
+
 def crypto_sign_keypair():
     pk = ctypes.create_string_buffer(crypto_sign_PUBLICKEYBYTES)
     sk = ctypes.create_string_buffer(crypto_sign_SECRETKEYBYTES)
jedisct1 commented 7 years ago

"additional data" represents data to be included in the computation of the authentication tag. It's not part of the ciphertext.

robehickman commented 7 years ago

OK, so it's essentially a salt then?

stef commented 7 years ago

no it is authenticated data that is/can not be encrypted. like for example src/dest ip addresses, or from: to: headers in emails

stef commented 7 years ago

but the unencrypted data is also protected by the mac and if it is corrupt then the decryption fails.

robehickman commented 7 years ago

I see.

stef commented 7 years ago

would you check this in and link to your repo please? github facilitates reviews much better than in this format that you posted. thanks a lot, looks promising :)

robehickman commented 7 years ago

Sure, I'm currently trying to test it, already fixed a bunch of syntax errors. Experiencing difficulty updating the version of libsoddium. python is still finding the old version despite having just compiled the latest.

jedisct1 commented 7 years ago

Check that you don't have multiple versions installed, such as one in /usr and another one in /usr/local ... That's a classic :)

robehickman commented 7 years ago

That was the problem.

robehickman commented 7 years ago

https://github.com/robehickman/pysodium/commit/5e4bbc42138c04569f6fc01b3d3df5324f49bd90

As noted in the commit message there is some error causing crypto_secretstream_xchacha20poly1305_pull to raise ValueError.

stef commented 7 years ago

great! looks promising, i commented a bit on your code, and would recommend you to implement also some testcases. i think you're almost there ;)

robehickman commented 7 years ago

Do you want the functions with many parameters to remain split over multiple lines?

stef commented 7 years ago

i have no opinion on that tbh

jedisct1 commented 7 years ago

Almost there!

robehickman commented 7 years ago

No problem.

Existing code:

def crypto_generichash_init(outlen=crypto_generichash_BYTES, k=b''):
    state = ctypes.create_string_buffer(sodium.crypto_generichash_statebytes())
    __check(sodium.crypto_generichash_init(ctypes.byref(state), k, ctypes.c_size_t(len(k)), ctypes.c_size_t(outlen)))
    return state

Returns state, rather than state.raw

stef commented 7 years ago

yeah i saw that, that i need to fix. also in 2 more functions i think

robehickman commented 7 years ago

How do you run the test suite? Nosetests isn't detecting them.

https://github.com/robehickman/pysodium/commit/295b6c0df1503119998220ac4d2e7127092af88e

stef commented 7 years ago

i use python -m unittest discover --start-directory test

robehickman commented 7 years ago

Thanks, I'm getting a fail on one of the existing functions:

Traceback (most recent call last):
  File "/home/a/VCS/github/3rd_party/pysodium/test/test_pysodium.py", line 222, in test_crypto_pwhash
    self.assertEqual(binascii.hexlify(out), b'79db3095517c7358449d84ee3b2f81f0e9907fbd4e0bae4e0bcc6c79821427dc')
AssertionError: '001c3cf659faf20faed26e1e02f2bd5a4cb248e7b42fa99925362c724e2b7cc4' != '79db3095517c7358449d84ee3b2f81f0e9907fbd4e0bae4e0bcc6c79821427dc'
robehickman commented 7 years ago

Basic tests passing, they could be made more specific:

https://github.com/robehickman/pysodium/commit/d68c272156ae4109589d5a8784cc0874eb26dafb

I haven't attempted to fix the failure noted above.

jedisct1 commented 7 years ago

The failing test you get is unrelated.

crypto_pwhash() supports multiple algorithms. You can pass an explicit algorithm identifier such as crypto_pwhash_ALG_ARGON2I13, or use crypto_pwhash_ALG_DEFAULT, which means "use the latest and greatest".

crypto_pwhash_ALG_DEFAULT can change. Applications should store the algorithm identifier along with other parameters, so even if the default changes, they can still recompute the same hash later. And if they use ALG_DEFAULT for new hashes, they can automatically use new algorithms without changing their own code.

pysodium uses ALG_DEFAULT by default. In fact, it lacks definitions for other identifiers.

So, the output is not deterministic. It depends on what ALG_DEFAULT is, which depends on the libsodium version installed.

Long story short, it the hash used in the test was computed using Argon2i, which is very likely, instead of ALG_DEFAULT, the algorithm identifier should be set to crypto_pwhash_ALG_ARGON2I13.

jedisct1 commented 7 years ago

Should be fixed by https://github.com/stef/pysodium/pull/71

robehickman commented 7 years ago

added 'byref' to tag as noted by jedisct1 above

https://github.com/robehickman/pysodium/commit/09e4c6f73fb888ab0b3c970ebef7ea50731f6ccb

stef commented 7 years ago

nice progress. some remarks: please remove the .gitignore you checked in from the repo. use if not pysodium.sodium_version_check(1, 0, 15): return in the testcases to guard against failing tests when libsodium is not recent enough. write a testcase where you encrypt and decrypt and verify the output is correct. also perhaps a testcase where the ad is modified and the decryption fails. for examples see e.g.: test_crypto_box_seal