tink-crypto / tink

Tink is a multi-language, cross-platform, open source library that provides cryptographic APIs that are secure, easy to use correctly, and hard(er) to misuse.
https://developers.google.com/tink
Apache License 2.0

Envelope AEAD Performance with GCP KMS #697

Closed iamyohann closed 7 months ago

iamyohann commented 1 year ago

Hi,

We're using Envelope AEAD from the Google Tink library with a cloud provider's KMS.

Our DEK is AES256-GCM.

We're noticing that encrypt/decrypt takes around 200-300 ms with a payload size of around 100 kB.

Is there any potential to improve performance here?


Update:

We noticed the performance issue comes from latencies on a cloud vendor's API (that is, a cloud provider's key management service such as AWS, GCP, Vault, etc.).

The cloud API latencies vary between 20 ms and 300 ms. When we parallelize the calls to decrypt our payloads via Google Tink, the whole group of goroutines takes as long as the slowest one. So if we had ~300 payloads to decrypt in parallel and 80% of the API calls took 20 ms, but one or more API calls took 250 ms, the entire group of goroutines takes 250 ms. The API latency is inconsistent, and we can confirm it comes from the cloud API call rather than from the Tink library itself.

We'd like to ask for a feature request where we could re-use the same DEK across multiple payloads.

This would allow customers to choose how to "spread the DEK across a portion of data".

The problem here is we have N payloads we'd like to encrypt using envelope encryption.

The key constraint is that the output sent back to the end user has to contain exactly N payloads, and the system has constraints around what is considered a granular chunk of data.

Essentially, there's no way for us to take N payloads, wrap them in a wrapper, and serialise them to bytes, because the N payloads that are encrypted end up in another vendor system that depends on the idea that these payloads are separate pieces of information.

Does it make sense to add a separate function in this file https://github.com/google/tink/blob/master/go/aead/kms_envelope_aead.go#L91

// Decrypt implements the tink.AEAD interface for decryption.
func (a *KMSEnvelopeAEAD) Decrypt(ct, aad []byte) ([]byte, error) {
    // Verify we have enough bytes for the length of the encrypted DEK.
    if len(ct) <= lenDEK {
        return nil, errors.New("kms_envelope_aead: invalid ciphertext")
    }

where instead of taking a single byte array, it takes a list of payloads/byte arrays and requests a single DEK to encrypt all of them? Or is there a safer, more secure and more performant approach that can handle multiple payloads without having to call cloud APIs for each DEK?

The reason we ask for this feature where we can have a single DEK across a list of byte-arrays/payloads is because:

Regards

juergw commented 11 months ago

Based on your description, I think using envelope encryption like this is probably not the right choice for you. Have you considered using encrypted keysets?

See: https://developers.google.com/tink/generate-encrypted-keyset

Here, you only call the KMS when you generate a new keyset (because you need to encrypt it) and when you load the encrypted keyset to create the primitive (because you need to decrypt the keyset). But then, you can use the primitive as many times as you want, without having to make any RPC calls.
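The call pattern can be sketched roughly as follows. This is not Tink's API: it uses only the Go standard library, and `keyUnwrapper`, `localAEAD`, `newLocalAEAD`, and `countingKMS` are hypothetical names invented for illustration. The fake KMS just returns a fixed key and counts how often it is called. The point is only that the remote call happens once, when the primitive is created, and every later encrypt or decrypt is local.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
)

// keyUnwrapper is a hypothetical stand-in for a remote KMS client.
type keyUnwrapper interface {
	Unwrap(wrappedKey []byte) ([]byte, error)
}

// localAEAD holds an AES-GCM primitive built from an already-unwrapped key.
type localAEAD struct{ gcm cipher.AEAD }

// newLocalAEAD makes exactly one KMS call to unwrap the key, then builds
// a local AES-GCM primitive from it.
func newLocalAEAD(kms keyUnwrapper, wrappedKey []byte) (*localAEAD, error) {
	key, err := kms.Unwrap(wrappedKey)
	if err != nil {
		return nil, err
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	return &localAEAD{gcm: gcm}, nil
}

// Encrypt returns nonce || ciphertext, using a fresh random nonce per call.
func (a *localAEAD) Encrypt(pt, aad []byte) ([]byte, error) {
	nonce := make([]byte, a.gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return a.gcm.Seal(nonce, nonce, pt, aad), nil
}

// Decrypt splits off the nonce prefix and authenticates and decrypts the rest.
func (a *localAEAD) Decrypt(ct, aad []byte) ([]byte, error) {
	n := a.gcm.NonceSize()
	if len(ct) < n {
		return nil, errors.New("ciphertext too short")
	}
	return a.gcm.Open(nil, ct[:n], ct[n:], aad)
}

// countingKMS is a fake KMS for the sketch: it ignores its input, returns
// a fixed key, and records the number of calls.
type countingKMS struct {
	calls int
	key   []byte
}

func (k *countingKMS) Unwrap(_ []byte) ([]byte, error) {
	k.calls++
	return k.key, nil
}
```

With this shape, encrypting or decrypting N separate payloads still yields N separate ciphertexts, but the KMS latency is paid once at startup rather than once per payload.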