Feature: Add Annotations to KeysetInfo_KeyInfo

theory commented 5 months ago

Is your feature request related to a problem?

Not a problem, no, but I think KeysetInfo_KeyInfo could benefit from additional information. As it is, users looking at a keyset (e.g., when serialized to JSON) have limited information about the keys. More info helps assist security assessments and the like without the need for separate record-keeping that can get out of sync with the actual key templates used.

What sort of feature would you like to see?

I'd like the ability for key templates (or keys?) to annotate KeysetInfo_KeyInfo contents with more than just the type ID, URL, status, and prefix type.

I made a Tink key template and key inspired by KMSEnvelopeAEADKeyTemplate. It simply encapsulates an AEAD key and a MAC key in a single key:

message BinkKey {
  uint32 version = 1;
  google.crypto.tink.KeyData aead_key_data = 2;
  google.crypto.tink.KeyData mac_key_data = 3;
}

It supports all the Tink aead and mac keys, and works great!

However, for serialization of the key with key info, I get this:

{
  "encryptedKeyset": "...",
  "keysetInfo": {
    "primaryKeyId": 1075844356,
    "keyInfo": [
      {
        "typeUrl": "github.com/theory/bunk.key",
        "status": "DISABLED",
        "keyId": 723113882,
        "outputPrefixType": "TINK"
      },
      {
        "typeUrl": "github.com/theory/bunk.key",
        "status": "ENABLED",
        "keyId": 1075844356,
        "outputPrefixType": "TINK"
      }
    ]
  }
}

Which is not very informative. At a glance, the user cannot tell what kind of AEAD or MAC key is used by each key. A similar issue exists for KMSEnvelopeAEAD keys: you can't tell what kind of AEAD key is used to encrypt the data.

I'd like to request a way to somehow annotate KeysetInfo_KeyInfo with additional information, so the output could be designed for my keys to look more like this:

{
  "encryptedKeyset": "...",
  "keysetInfo": {
    "primaryKeyId": 1075844356,
    "keyInfo": [
      {
        "typeUrl": "github.com/theory/bunk.key",
        "status": "DISABLED",
        "keyId": 723113882,
        "outputPrefixType": "TINK",
        "details": {
          "aeadKeyTypeUrl": "type.googleapis.com/google.crypto.tink.AesGcmKey",
          "macKeyTypeUrl": "type.googleapis.com/google.crypto.tink.HmacKey"
        }
      },
      {
        "typeUrl": "github.com/theory/bunk.key",
        "status": "ENABLED",
        "keyId": 1075844356,
        "outputPrefixType": "TINK",
        "details": {
          "aeadKeyTypeUrl": "type.googleapis.com/google.crypto.tink.XChaCha20Poly1305Key",
          "macKeyTypeUrl": "type.googleapis.com/google.crypto.tink.AesCmacKey"
        }
      }
    ]
  }
}

Or perhaps with even more detail:

{
  "encryptedKeyset": "...",
  "keysetInfo": {
    "primaryKeyId": 1075844356,
    "keyInfo": [
      {
        "typeUrl": "github.com/theory/bunk.key",
        "status": "DISABLED",
        "keyId": 723113882,
        "outputPrefixType": "TINK",
        "details": {
          "aeadKeyTypeUrl": "type.googleapis.com/google.crypto.tink.AesGcmKey",
          "aeadKeyAlgorithm": "AES256GCMNoPrefix",
          "macKeyTypeUrl": "type.googleapis.com/google.crypto.tink.HmacKey",
          "macKeyAlgorithm": "HMACSHA256Tag256"
        }
      }
    ]
  }
}

I could imagine this being useful for KMSEnvelopeAEAD keys, too.

Have you considered any alternative solutions?

I can write my own serialization layer that encrypts the keyset and puts in its own key info. But it seems unnecessarily duplicative (is that a word?), and also a bit more error prone from a security perspective. Tink already has the KMS encryption of the keyset configured exactly right, why copy/paste it and then lose any improvements/fixes later on?

Would you like to add additional context?

Thanks for Tink!

tholenst commented 5 months ago

Thank you for the bug. We will not go the proposed route, but we agree that the situation is suboptimal. However, in Java we already have tools available to solve this and we hope to provide similar tools in Go as well. It will take a while, however.

First, we don't want to always output more information in the keyset since it enlarges the keyset. Also, Tink doesn't really want users to use the protos in many cases.

The solution we have in Java is that one can iterate through the keyset and call "GetAt(0).getKey().getParameters()" at any point to get full information about the parameters for that key. Such parameters can also be serialized (currently only into a byte string, which serializes them as binary KeyTemplate proto -- we might allow other things though). In this case, users can get full information about all the key templates.

It might be that one day we provide a CreateKeysetInfo function which takes a KeysetHandle and outputs a larger KeysetInfo including the KeyTemplate -- but that's rather far out and not our priority at all right now.

theory commented 5 months ago

Thanks for the reply! This sounds very promising. Is there some way to store parameters in the key today, so that once that feature lands they will be displayable?

tholenst commented 5 months ago

No, sorry. We are not changing the key format. Instead, we manually add code which allows to convert the Key to the parameters.

What one can do currently is to write a Java program which just prints the keys in a keyset. This would look something like this (untested code)

public static void main(String args[]) {
  byte[] b = Hex.decode(args[1]);
  KeysetHandle h = TinkProtoKeysetFormat.parseKeyset(b, InsecureSecretKeyAccess.get());
  for (int i = 0; i < h.size(); i++) {
    System.out.println(h.getAt(i).getKey().getParameters().toString());
  }
}

theory commented 5 months ago

But where are the parameters stored?

tholenst commented 5 months ago

The key protos have enough information to get the parameters out. (The parameters are the same as the key template).

For example, in the serialization, a AES GCM Key looks like this:

message AesGcmKey { uint32 version = 1; bytes key_value = 3; }

The parameters of a AES GCM Key looks like this:

message AesGcmKeyFormat { uint32 key_size = 2; uint32 version = 3; }

This is embedded into a KeyTemplate proto.

One can compute the AesGcmKeyFormat from the AesGcmKey proto.

theory commented 5 months ago

I'll have to poke around the implementation, but I'm little confused here. AFAIK the AesGcmKey is the value stored in the keyset, not the AesGcmKeyFormat. Given an AesGcmKey, how do you know what AesGcmKeyFormat to load? Or is the key template also stored in the keyset and I just haven't noticed it yet.

tholenst commented 5 months ago

It's not loaded directly, it's computed out of the AesGcmKey. It requires code specifically for AesGcm.

The key template is not stored in the keyset.

Does that make sense?

theory commented 5 months ago

Yes. Looking at the AesGcmKey example, I suppose it knows what the configuration values were based on the version? Cause that's the only bit of info I see in the key other than the key material.

theory commented 5 months ago

I'm afraid there's not much I can make of it one way or another. Could you maybe create a little sample project, zip it up, and attach it here? Important to bundle it into an archive to preserve the encodings. I could try to take it and spin up a GitHub workflow that uses an Oracle Docker image to replicate it. Unless, of course, you could do that :-)

tholenst commented 5 months ago

Ah sorry, I see I'm still being confusing.

Given a Key you can simply compute key_size in the format as key_value.size().

Unfortunately, I don't have time to create sample projects for this.

theory commented 5 months ago

Oooh, 🤦🏻 right I should have thought of that!

theory commented 4 months ago

I got pretty far down the line adding parameters to my key format and key proto, and updated the manager to copy them into the key primitive. This lets me implement a custom writer that emits key parameters instead of the default KeyInfo.

It works beautifully except for one thing: the resulting info includes only enabled keys. This is because the Primitives() method on the keyset handle returns only enabled keys:

https://github.com/tink-crypto/tink-go/blob/a313008a351c3368cbc86c44ab09bca528da3c58/keyset/handle.go#L206-L208

This makes perfect sense for keys loaded into a keyset used for encryption and encryption! But as there is no GetAt method on the Go Keyset handle, there is no way to iterate through all enabled and disabled keys. One can only get at keyset info.

So I'm wondering if it'd make sense to add a method like Primitives() but returns all primitives in the set, not just enabled primitives.

tink-crypto / tink-go

Feature: Add Annotations to KeysetInfo_KeyInfo #14