tink-crypto / tink

Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.
https://developers.google.com/tink
Apache License 2.0

AES-SIV (EncryptDeterministically) doesn't match OpenSSL and jacobsa/crypto #691

Closed kjmph closed 1 year ago

kjmph commented 1 year ago

We are not deploying Tink right now; I just wanted to test AES-SIV performance and interoperability across Python, C, and Go. As Python and C both rely on OpenSSL 3.1, they get along just fine. However, Tink had problems exchanging messages with the other services. I also tested with github.com/jacobsa/crypto, and that project matched OpenSSL with no problems.

It turns out, Tink has a bug when there is no associated data being sent with the message. Now, I'm not knowledgeable enough to discuss the finer points of RFC 5297, yet I was able to fix Tink to share messages with OpenSSL. The bug is centered around the s2v function in tink/go/daead/subtle/aes_siv.go. I was able to make this change:

diff --git a/go/daead/subtle/aes_siv.go b/go/daead/subtle/aes_siv.go
index 92b1650d9..743fe291f 100644
--- a/go/daead/subtle/aes_siv.go
+++ b/go/daead/subtle/aes_siv.go
@@ -174,11 +174,13 @@ func (asc *AESSIV) ctrCrypt(siv, in, out []byte) error {
 func (asc *AESSIV) s2v(msg, ad, siv []byte) {
    block := make([]byte, aes.BlockSize)
    asc.cmac(block, block)
-   multiplyByX(block)

-   adMac := make([]byte, aes.BlockSize)
-   asc.cmac(ad, adMac)
-   xorBlock(adMac, block)
+   if len(ad) > 0 {
+       multiplyByX(block)
+       adMac := make([]byte, aes.BlockSize)
+       asc.cmac(ad, adMac)
+       xorBlock(adMac, block)
+   }

    if len(msg) >= aes.BlockSize {
        asc.cmacLong(msg, block, siv)

I'm happy to assist with giving more details, if that's helpful. I presume that y'all would want a test for this too. Anyway, thanks!

kjmph commented 1 year ago

Oh, I don't know if this should be another bug too, but technically, multiple associated data updates should be treated differently than one single byte array. In other words, [][]byte{[]byte("ad1"), []byte("ad2")} is very different from []byte("ad1ad2"). However, that may not be a bug, since that's the Tink API design for EncryptDeterministically, and keeping that design would just remain a limitation on interoperability with OpenSSL. I just figured I would drop this here in case it is relevant for future readers.

juergw commented 1 year ago

Thanks for pointing this out. I will have a look at this.

juergw commented 1 year ago

First of all, Tink uses an interface that only allows passing in one AD, so it does not support the full standard defined in RFC 5297. This means that if something is encrypted with two ADs, there is no way to decrypt it in Tink. I agree that it would be nice for AES-SIV to support multiple ADs, but we can't easily change the API of the deterministic AEAD primitive.

Now, in Tink's interface with exactly one AD, there are two ways to map an empty AD byte string to the AES-SIV standard: it is either an empty list of ADs, or a list with one empty byte array. Tink consistently uses the second option, a list with one empty byte array. (C++ and Python do it the same way.) I think this is a valid way to do it. (Also, it would be impossible to change this, because users who have previously encrypted data this way would not be able to decrypt it with a newer version, which would be very bad.) But we should probably make this clear in the documentation.
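
To make the two mappings concrete, here is a minimal Go sketch of S2V as I read RFC 5297 (Section 2.4), with AES-CMAC written out per RFC 4493. It is not Tink's or OpenSSL's code, it is not constant time, and every helper name (dbl, xor, pad, cmac, s2v, demoS2V) as well as the demo key is an assumption made for illustration. It only computes the synthetic IV, which is enough to see that "no AD" and "one empty AD" are different inputs.

```go
package main

import (
	"bytes"
	"crypto/aes"
	"crypto/cipher"
	"fmt"
)

const bs = aes.BlockSize

// dbl doubles a block in GF(2^128): shift left one bit, then fold in 0x87
// if a bit was shifted out. Not constant time; illustration only.
func dbl(in []byte) []byte {
	out := make([]byte, bs)
	var carry byte
	for i := bs - 1; i >= 0; i-- {
		out[i] = in[i]<<1 | carry
		carry = in[i] >> 7
	}
	if carry == 1 {
		out[bs-1] ^= 0x87
	}
	return out
}

// xor returns a XOR b over len(a) bytes; b must be at least as long as a.
func xor(a, b []byte) []byte {
	out := make([]byte, len(a))
	for i := range a {
		out[i] = a[i] ^ b[i]
	}
	return out
}

// pad copies in (shorter than one block) and appends 0x80 plus zero bytes.
func pad(in []byte) []byte {
	out := make([]byte, bs)
	copy(out, in)
	out[len(in)] = 0x80
	return out
}

// cmac computes AES-CMAC of msg. An empty msg is a valid input.
func cmac(c cipher.Block, msg []byte) []byte {
	l := make([]byte, bs)
	c.Encrypt(l, l)
	k1 := dbl(l)
	k2 := dbl(k1)
	x := make([]byte, bs)
	for len(msg) > bs { // every block except the last one
		c.Encrypt(x, xor(x, msg[:bs]))
		msg = msg[bs:]
	}
	last := xor(pad(msg[:0:0]), k2) // placeholder, overwritten below
	if len(msg) == bs {
		last = xor(msg, k1)
	} else {
		last = xor(pad(msg), k2)
	}
	c.Encrypt(x, xor(x, last))
	return x
}

// s2v treats ads as S1..Sn-1 and msg as the final string Sn.
func s2v(c cipher.Block, ads [][]byte, msg []byte) []byte {
	d := cmac(c, make([]byte, bs))
	for _, ad := range ads {
		d = xor(dbl(d), cmac(c, ad))
	}
	if len(msg) >= bs {
		t := append([]byte{}, msg...)
		off := len(t) - bs
		copy(t[off:], xor(t[off:], d)) // "xorend"
		return cmac(c, t)
	}
	return cmac(c, xor(dbl(d), pad(msg)))
}

// demoS2V runs s2v under a fixed demo MAC key (bytes 0..31). Hypothetical helper.
func demoS2V(ads [][]byte, msg []byte) []byte {
	key := make([]byte, 32)
	for i := range key {
		key[i] = byte(i)
	}
	c, err := aes.NewCipher(key)
	if err != nil {
		panic(err)
	}
	return s2v(c, ads, msg)
}

func main() {
	msg := []byte("Some data to encrypt.")
	noAD := demoS2V(nil, msg)             // empty list of ADs
	emptyAD := demoS2V([][]byte{{}}, msg) // list with one empty AD
	fmt.Printf("no AD:        %x\n", noAD)
	fmt.Printf("one empty AD: %x\n", emptyAD)
	fmt.Println("same SIV:", bytes.Equal(noAD, emptyAD))
}
```

The only difference between the two calls is whether the loop over ADs runs once (doubling D and mixing in the CMAC of the empty string) or not at all, which is the step the proposed diff makes conditional.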

kjmph commented 1 year ago

Thanks @juergw, I understand the API limitation, I appreciate the response. Only supporting 1 AD does limit some interoperability with OpenSSL. Yet as you said, it isn't easy to change the API and probably deserves its own issue if it were pursued.

As far as the matter at hand, nil, make([]byte, 0) and []byte{} all produce the same value. Obviously, []byte{0} is a zero byte, so it would be considered an AD byte string. So, I do agree that Tink is consistently choosing to interpret nil as a list with one empty byte array. However, in OpenSSL, there is no way to update the Cipher with a 0 length AD. So, the only way to have interoperability between the two libraries is to use exactly 1 non-zero length AD. For reference, here is the line that demonstrates that OpenSSL limitation:

https://github.com/openssl/openssl/blob/c48cc764ed57e49456d5b90a7d885e8af196df78/crypto/cmac/cmac.c#L168-L169

I was able to match Python and Tink via PyCryptodome by passing in an update for [], which that library treats the same as a zero-length byte array. Note that PyCryptodome also doesn't match Tink when no AD is used; in that case it matches OpenSSL. Still, I'm going to go forward with the recommendation that if Tink is used, it must be used with an AD, so that everyone can decrypt Tink's messages.
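
One way to enforce that recommendation in Go is a thin wrapper that refuses empty associated data before it reaches the primitive. This is only a sketch: the DeterministicAEAD interface is declared locally with the same two-method shape as Tink's Go interface so the example is self-contained, and requireAD, ErrEmptyAD, and fakeDAEAD are hypothetical names. fakeDAEAD is a placeholder that performs no encryption at all.

```go
package main

import (
	"errors"
	"fmt"
)

// DeterministicAEAD is declared locally for the sketch; it is assumed to
// match the method set of Tink's Go deterministic AEAD interface.
type DeterministicAEAD interface {
	EncryptDeterministically(plaintext, associatedData []byte) ([]byte, error)
	DecryptDeterministically(ciphertext, associatedData []byte) ([]byte, error)
}

// ErrEmptyAD is returned when the caller passes nil or zero-length AD.
var ErrEmptyAD = errors.New("associated data must be non-empty")

// requireAD wraps another DeterministicAEAD and rejects empty AD.
type requireAD struct{ inner DeterministicAEAD }

func (r requireAD) EncryptDeterministically(pt, ad []byte) ([]byte, error) {
	if len(ad) == 0 { // true for nil, []byte{} and make([]byte, 0)
		return nil, ErrEmptyAD
	}
	return r.inner.EncryptDeterministically(pt, ad)
}

func (r requireAD) DecryptDeterministically(ct, ad []byte) ([]byte, error) {
	if len(ad) == 0 {
		return nil, ErrEmptyAD
	}
	return r.inner.DecryptDeterministically(ct, ad)
}

// fakeDAEAD is a stand-in so the sketch runs without Tink. It does NOT encrypt.
type fakeDAEAD struct{}

func (fakeDAEAD) EncryptDeterministically(pt, ad []byte) ([]byte, error) {
	return append([]byte{}, pt...), nil
}

func (fakeDAEAD) DecryptDeterministically(ct, ad []byte) ([]byte, error) {
	return append([]byte{}, ct...), nil
}

func main() {
	d := requireAD{inner: fakeDAEAD{}}

	_, err := d.EncryptDeterministically([]byte("msg"), nil)
	fmt.Println("nil AD rejected:", errors.Is(err, ErrEmptyAD))

	_, err = d.EncryptDeterministically([]byte("msg"), []byte("context"))
	fmt.Println("non-empty AD accepted:", err == nil)
}
```

In real use the inner value would be the primitive obtained from a Tink keyset handle instead of fakeDAEAD.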

juergw commented 1 year ago

Thanks. I agree that it is unfortunate that OpenSSL and Tink treat empty AD differently. But there is no easy fix for this.

Using a non-empty AD is a good idea anyway.

juergw commented 1 year ago

Could you share some OpenSSL code that I can run that shows that you can't add a zero-length AD in OpenSSL?

kjmph commented 1 year ago

Sure! Uh.. Is this okay?

#include <openssl/conf.h>
#include <openssl/evp.h>
#include <openssl/err.h>
#include <stdio.h>
#include <string.h>

int encryptDeterministically(unsigned char *key,
                             unsigned char *ciphertext,
                             unsigned char *msg, int msg_len,
                             unsigned char *aad, int aad_len)
{
    EVP_CIPHER_CTX *ctx;
    EVP_CIPHER *aes_siv;
    int ciphertext_len;
    int len;

    if(!(ctx = EVP_CIPHER_CTX_new())) {
            ERR_print_errors_fp(stderr);
            return -1;
    }

    if(!(aes_siv = EVP_CIPHER_fetch(NULL, "AES-256-SIV", NULL))) {
            ERR_print_errors_fp(stderr);
            return -1;
    }

    if(EVP_EncryptInit_ex(ctx, aes_siv, NULL, key, NULL) != 1) {
            ERR_print_errors_fp(stderr);
            return -1;
    }

    if (aad != NULL) {
            if(EVP_EncryptUpdate(ctx, NULL, &len, aad, aad_len) != 1) {
                    ERR_print_errors_fp(stderr);
                    return -1;
            }
    } else if (aad_len == 0) {
            /* NEED a valid pointer, 16 zero bytes for illustrative purposes only */
            unsigned char zero[] = {0, 0, 0, 0,
                                    0, 0, 0, 0,
                                    0, 0, 0, 0,
                                    0, 0, 0, 0};
            /* Length is zero though, so this isn't 16 zero bytes, rather a zero byte array */
            if(EVP_EncryptUpdate(ctx, NULL, &len, zero, 0) != 1) {
                    ERR_print_errors_fp(stderr);
                    return -1;
            }
    } else {
            /* DO NOT update cipher */
    }

    if(EVP_EncryptUpdate(ctx, ciphertext+16, &len, msg, msg_len) != 1) {
            ERR_print_errors_fp(stderr);
            return -1;
    }
    ciphertext_len = len;

    /* Final bytes could be written at this point, we sized the
     * ciphertext output to have additional room for extra bytes */
    if(EVP_EncryptFinal_ex(ctx, ciphertext + 16 + len, &len) != 1) {
            ERR_print_errors_fp(stderr);
            return -1;
    }
    ciphertext_len += len;

    EVP_CIPHER_CTX_ctrl(ctx, EVP_CTRL_AEAD_GET_TAG, 16, ciphertext);
    ciphertext_len += 16;

    EVP_CIPHER_CTX_free(ctx);
    EVP_CIPHER_free(aes_siv); /* release the fetched cipher as well */
    return ciphertext_len;
}

int main(void) {
        unsigned char key[] = { 0, 1, 2, 3, 4, 5, 6, 7,
                                8, 9, 10, 11, 12, 13, 14, 15,
                                16, 17, 18, 19, 20, 21, 22, 23,
                                24, 25, 26, 27, 28, 29, 30, 31,
                                0, 17, 34, 51, 68, 85, 102, 119,
                                136, 153, 170, 187, 204, 221, 238, 255,
                                240, 241, 242, 243, 244, 245, 246, 247,
                                248, 249, 250, 251, 252, 253, 254, 255 };
        unsigned char *msg = (unsigned char *)"Some data to encrypt.";
        unsigned char *aad = (unsigned char *)"Additional data";

        /* 16-bytes for SIV, additional 16-bytes for finalized data,
         * which shouldn't exist in this test case */
        unsigned char ciphertextWithAAD[strlen((char *)msg)+16+16];
        unsigned char ciphertextWithZeroBytes[strlen((char *)msg)+16+16];
        unsigned char ciphertextNoUpdate[strlen((char *)msg)+16+16];
        int ciphertextWithAAD_len;
        int ciphertextWithZeroBytes_len;
        int ciphertextNoUpdate_len;

        ciphertextWithAAD_len = encryptDeterministically(key, ciphertextWithAAD,
                                                         msg, strlen((char *)msg),
                                                         aad, strlen((char *)aad));

        ciphertextWithZeroBytes_len = encryptDeterministically(key, ciphertextWithZeroBytes,
                                                               msg, strlen((char *)msg),
                                                               NULL, 0);

        ciphertextNoUpdate_len = encryptDeterministically(key, ciphertextNoUpdate,
                                                          msg, strlen((char *)msg),
                                                          NULL, -1 /* Flag for no update */);

        if (ciphertextWithAAD_len > 0) {
                printf("Ciphertext with AAD is:\n");
                BIO_dump_fp(stdout,
                            (const char *)ciphertextWithAAD, ciphertextWithAAD_len);
        }

        if (ciphertextWithZeroBytes_len > 0) {
                printf("Ciphertext without any AAD is:\n");
                BIO_dump_fp(stdout,
                            (const char *)ciphertextWithZeroBytes, ciphertextWithZeroBytes_len);
        }

        if (ciphertextNoUpdate_len > 0) {
                printf("Ciphertext without Cipher Update is:\n");
                BIO_dump_fp(stdout,
                            (const char *)ciphertextNoUpdate, ciphertextNoUpdate_len);
        }

        return 0;
}

juergw commented 1 year ago

Thanks. I have now checked this, and indeed ciphertexts generated in Tink with empty AD will not be readable by OpenSSL.

But note that RFC 5297, Section 6 (https://www.rfc-editor.org/rfc/rfc5297#section-6) states what needs to be done when AES-SIV is used through the normal AEAD interface (defined in RFC 5116), which only has a single AD:

"Therefore, when it is required to access SIV through the interface defined in [RFC5116], it is necessary to marshal multiple AD inputs into a single string (see Section 1.1) prior to invoking SIV."

So Tink really does this as expected by the RFC. An empty string needs to be treated as a single empty AD.
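
The quoted text asks for multiple ADs to be marshaled into a single string but this thread does not say which encoding to use. As a rough sketch only, the following assumes a 4-byte big-endian length prefix per AD; that encoding is my choice for illustration and is not taken from RFC 5297 or from Tink, and marshalADs is a hypothetical helper. Both sides would have to agree on the same encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// marshalADs encodes each AD as a 4-byte big-endian length followed by its
// bytes, so the boundaries between ADs survive concatenation.
func marshalADs(ads ...[]byte) []byte {
	var out []byte
	for _, ad := range ads {
		var n [4]byte
		binary.BigEndian.PutUint32(n[:], uint32(len(ad)))
		out = append(out, n[:]...)
		out = append(out, ad...)
	}
	return out
}

func main() {
	two := marshalADs([]byte("ad1"), []byte("ad2"))
	one := marshalADs([]byte("ad1ad2"))
	fmt.Printf("%x\n", two) // 0000000361643100000003616432
	fmt.Printf("%x\n", one) // 00000006616431616432
}
```

The resulting single byte string can then be passed as the one AD argument, and ("ad1", "ad2") no longer collides with "ad1ad2".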

I'm going to close this now. Thanks for bringing this up, this was very helpful!

I have also updated our documentation here: https://developers.google.com/tink/wire-format#deterministic_aead

juergw commented 1 year ago

Update: this turned out to be a vulnerability in OpenSSL. I reported it to the OpenSSL maintainers, and they have now fixed it. See https://www.openssl.org/news/secadv/20230714.txt and https://www.cve.org/CVERecord?id=CVE-2023-2975

kjmph commented 1 year ago

Thank you for the update, so glad you followed through on that.