Improve UX for multiple split `masterIdentities`

I have a project which is maintained by multiple persons. Each person requires the ability to create new secrets with their personal split identity, associated with their Yubikey.

Single master identity

Having multiple split keys as a single user is usually not a problem, since normally only one of them is used as the primary key, allowing one to choose the following configuration:

age.rekey = {
  masterIdentities = [ ./primary-split-identity.pub ];
  extraEncryptionPubkeys = [ ./secondary-recipients.pub ];
};

Contents of ./primary-split-identity.pub, usually as generated by the Yubikey tooling:

#       Serial: 00000000, Slot: 00
#         Name: Some Name
#      Created: Mon, 01 Jan 2000 00:00:00 +0000
#   PIN policy: Once   (A PIN is required once per session, if set)
# Touch policy: Always (A physical touch is required for every decryption)
#    Recipient: age1yubikey1<primary-pubkey>
AGE-PLUGIN-YUBIKEY-<primary-split-identity>

Contents of ./secondary-recipients.pub, usually assembled by manually copying the pubkeys:

age1yubikey1<secondary-pubkey>

For this setup, it does not matter whether or not the secondary identities are split-identities or not, since the pubkeys are manually collected, anyways.

Multiple master identities

The situation changes however, if one considers the following configuration with two split masterIdentities:

age.rekey = {
  masterIdentities = [
    ./primary-split-identity.pub
    ./secondary-split-identity.pub
  ];
};

The ./secondary-split-identity.pub contains:

#       Serial: 11111111, Slot: 00
#         Name: Some Other Name
#      Created: Mon, 01 Jan 2000 00:00:00 +0000
#   PIN policy: Once   (A PIN is required once per session, if set)
# Touch policy: Always (A physical touch is required for every decryption)
#    Recipient: age1yubikey1<secondary-pubkey>
AGE-PLUGIN-YUBIKEY-<secondary-split-identity>

If one only has one of the Yubikeys plugged in and attempts to edit an encrypted file using agenix edit (I am assuming rage and age-plugin-yubikey for the backing implementations), the decryption works fine, since only one working key is necessary for decryption. However, on saving the file, the implementation of rage/age-plugin-yubikey prompts for every split identity. If you try to skip one of the identities, you will get this output:

Collecting information about hosts. This may take a while...
Error: '⁨age-plugin-yubikey⁩' couldn't use an identity: ⁨Could not open YubiKey with serial 11111111

[ Did rage not do what you expected? Could an error be more useful? ]
[ Tell us: https://str4d.xyz/rage/report                            ]
error: Failed to (re)encrypt edited file, original is left unchanged.

Equivalent behavior is obtained when invoking the following command directly:

echo "test" | rage -e -i ./primary-split-identity.pub -i ./secondary-split-identity.pub > ./test.age

As it appears, the backing implementation requires physical presence of all Yubikeys to derive the pubkey for each one, instead of looking at the Recipient: field in the preceding comment. While I would certainly consider this primarily an upstream issue, I think it still could be meaningful if agenix-rekey could work around this.

Possible solutions

In the simplest case, there is maybe already a solution that I am just not aware of. I am ready to be educated on this :) Otherwise I can think of a couple of ways to work around this, but there might still be other ways to approach this.

One could simply add an configurable option to skip all (or even just some) masterIdentities when encrypting something using agenix edit/agenix generate. Instead, the user can supply a manually assembled list of all the required identity pubkeys through the extraEncryptionPubkeys option. While this is more error-prone, I would assume that identities don't change that often, an so it wouldn't be a huge inconvenience. Still, it would put a larger burden on the user to manage their keys. This should be possible to implement in a way that is completely backwards compatible and will have no unexpected side effects for existing users, as it would be purely opt-in (maybe with a big warning that clarifies the risks of improper key management).
Another way could be to try and be smart about it and parse the comments in the masterIdentities files to try and extract a pubkey to specify with -r instead of -i. However that would seem much more like a solution for upstream anyways, so I don't know that this would be a great option here. I certainly feels a lot more hacky and might mess with peoples setups in unanticipated ways.

Again, I would be happy to explore some of these options in a PR, but I would first like to know how you feel about this topic in general.

I had another think about this today and came up with another way of handling this, which is basically a fancy variant of the first solution: Allow the specification of an explicit pubkey to use in place of the identity. I imagine something like this:

masterIdentities = [
  {
    identity = ./primary-split-identity.pub;
  }
  {
    identity = ./secondary-split-identity.pub;
    pubkey = "age1yubikey1<secondary-pubkey>";
  }
  ./tertiary-split-identity.pub;
];

Basically: Add an alternate syntax to specify master identities. If pubkey is specified, it will be used in place of the identity to encrypt the files. If pubkey is not specified, it will fall back to the current behavior. When decryption is attempted, it will still simply try all identity fields. This should also be implementable in a backwards compatible way, by means of something like the following type signature:

masterIdentities = mkOption {
  type = with types; let
    keyPathType = (coercedTo path toString str);
  in listOf (either
    keyPathType
    (submodule {
      identity = mkOption { type = keyPathType };
      pubkey = mkOption { type = nullOr keyPathType };
    }));
};

Certainly more complex from an implementation standpoint, for the end user however the UX would appear much cleaner, since identities and pubkeys are now clearly grouped together. Sadly, it would still be in the users responsibility to ensure that the identity and pubkey work together, since there is no way to write an assertion that checks this without requiring the user input that the above solution would be trying to avoid :/

Wow, looks like you already put a lot of thought into this! I've been aware that multiple split identities are a potential issue because someone I know used to try that (but ultimately reverted to a single-identity solution). I think your suggestions are very good, let me try to address them in context:

If one only has one of the Yubikeys plugged in and attempts to edit an encrypted file using agenix edit (I am assuming rage and age-plugin-yubikey for the backing implementations), the decryption works fine, since only one working key is necessary for decryption.

I think there is actually a problem here. I admit that I didn't check this right now, but when I tried to decrypt a file by specifying two split identities last year, the age-yubikey-plugin would always attempt to use the keys in the order they were specified. And if you only have the second yubikey that is specified, you ran into problems. I just found this relatively new issue, but it seems like it is still an issue: https://github.com/str4d/age-plugin-yubikey/issues/178

So to make it work properly, you would have to juggle keys between masterIdentities and extraEncryptionPubkeys so that there's ever only one master identity set until this issue is resolved.

As it appears, the backing implementation requires physical presence of all Yubikeys to derive the pubkey for each one, instead of looking at the Recipient: field in the preceding comment. While I would certainly consider this primarily an upstream issue, I think it still could be meaningful if agenix-rekey could work around this.

Interesting observation, I always assumed it would use the pubkey from the comment. Looks like I never tried encrypting without my yubikey attached. I guess the implementation only considers the keygrab and skips the comments, might be worth to ask upstream about this. Maybe a --use-pubkey-comments option would be able to solve this. In any case we can work around this by parsing the recipient ourselves before invoking rage as you already proposed.

If pubkey is not specified, it will fall back to the current behavior. When decryption is attempted, it will still simply try all identity fields.

This would still run into the multi-identity decryption problem if you don't have the "first" yubikey available. But I think there is a workaround to all of this. And while I think the new option syntax would be a good design, I think we might even be able to get away without changing it:

Let's assume a user has multiple master identities and some backup identities:

masterIdentities = [ ./yk1.pub ./yk2.pub ];
extraEncryptionPubkeys = [ ./yk3.pub "age1yubikey1......." ./backup.pub ];

To make this work for multiple users, we just need a way to specify which of the master identities is the "primary" one that should be used for decryption. For encryption we can just always use -r and parse the Recipient:, and never use -i to avoid needing the yubikey. If no identity is selected as the primary one, we just keep doing what we do now and specify multiple identities in decryption.

We need to make sure that selecting the primary master identity does not change the resulting nix derivations. So a primary = true; setting (or anything like that) would be a bad idea. Especially because you probably don't want to keep a file with local changes when multiple users are developing on the same repository. So my suggestion would be to introduce an environment variable AGENIX_REKEY_PRIMARY_IDENTITY:

If unset, the current behavior is kept, meaning all masterIdentities are passed via -i.
If set, it should be set to a recipient like age1yubikey1.... which allows us to exactly identify which of the masterIdentities is meant to be used. We can verify that it actually matches one of these master identities to prevent misuse.

Now, when constructing a call to rage --decrypt, we handle all non-primary identities in the same way as extraEncryptionPubkeys. So only the recipient options -r or -R are used by parsing out the Recipient: line from any split identities. Only the primary identity will be passed via -i. For encryption calls we don't need to make a distinction at all, just always extract the pubkey and never use -i in the first place.

A nice side effect to this would be that specifying ./yk3.pub in extraEncryptionPubkeys will now work, which currently fails because -R is used for files and that doesn't accept split identities.

The advantage of using an environment variable is that you can now easily alter which master identity is in use simply by changing the environment. If you are using a devshell, it would be trivial to set the variabled based on the username, hostname or any other property that distinguishes the developing parties from each other.

Again, I would be happy to explore some of these options in a PR, but I would first like to know how you feel about this topic in general.

Of course, I'd be very happy if you want to contribute this!

Thinking about what needs to be changed to implement this, I believe it shouldn't actually be too much since the option syntax stays the same. Roughly speaking we need to defer the rage argument list construction in nix/lib.nix to a runtime bash function and depending on whether AGENIX_REKEY_PRIMARY_IDENTITY is set, change the argument list for decryption to only include exactly one -i for the correct file. We could always use -R <(parseRecipients "$file") for the rest which would make sure that multi-recipient files still work as expected.

Oh wow, I didn't realize that there is another issue woven into this. I guess we should probably start by doing some testing to verify what works and what are definite issues that need special treatment. The details of the exact Nix API we can probably still figure out once we have a working prototype. There are some potential questions there that I could come up with on the spot:

What to do if no Recipient: field is found?
- Probably just keep the old behavior?
- What do we do if only some identities have extractable recipients?
Do we still want to have the possibility of a "manual override", i.e. the option to specify the pubkey explicitly without touching the identity file?
Should we expose the parsed pubkeys to the user for reuse somewhere else in some Nix code? Is that a usecase we want to support?

However, the general approach of using an environment variable for selecting specific identities sounds like a nice and flexible solution to me. I will hopefully be able to do some work on this in the coming days and will post back here with any findings.

Ok, I managed to do some testing and I am now very confused. I started by trying to validate the issue you referenced. I am fairly certain that I could confirm it initially, since I had explicitly noted this down. However, since some time yesterday, I am no longer able to reproduce this. Instead, I am free to skip over my non-available key identity and just use the other one. Doesn't matter which of my two keys I plug in and in what order I specify the identities in: neither age (Version 1.1.1) nor rage (Version 0.9.2 and 0.10.0) nor age-plugin-yubikey (Version 0.4.0) with pcsc-lite (Version 2.2.3) want to throw errors at me. So I really do not know what is happening there.

However, even though no error comes up, I still get the alternate behavior referenced in the issue: If the identity I am trying to use for decryption is not the first one, I am prompted for the PIN for every rekey action. Ultimately this means that this is still an issue we should work around and that I can test for, even though I can currently not reproduce the original behavior.

With all of that out of the way, I started experimenting with making the encryption step independent of the physical keys. While I was experimenting with some simple grep commands for extracting the pubkeys from the comments, I stumbled across some stuff that might influence the implementation design and final UX and I think is worth discussing first. It basically boils down to the types of identities that exist, and that they all require slightly different interactions:

Plain `age` identities, no password encryption

These are the simplest ones, since they can always be used directly to derive the pubkey. In case we would want to use the pubkey instead of the actual identity for some reason, age-keygen has a default output format that will look like this:

# created: 2000-01-01T00:00:00+00:00
# public key: age1<pubkey>
AGE-SECRET-KEY-<key>

Instead of Recipients: age1yubikey1<pubkey>, we have public key: age1<pubkey>.

Plain `age` identities, with password encryption

These have similar problems as Yubikey identities, in that decryption of the identity is required when encrypting a file. For these however, we don't even have a way of extracting a pubkey from the file, since the whole file is encrypted. So in this case we might require an external way to specify the associated pubkey, anyways.

Other types of split-identities

Apart from age-plugin-yubikey, https://github.com/FiloSottile/awesome-age#plugins lists quite a number of other plugins. I have not looked at them too closely, but this is a good reminder that there might be more complex interactions that could benefit from a more generic approach to specifying the pubkeys.

Multi-identity files

Just for completion: The -i flag of both age and rage technically allows for multiple identities in a single file. This would complicate stuff immensely, since at that point we would basically need to parse the entire file and find some way to create parings of pubkeys and identities. This might also be the reason that (r)age does not try to extract the pubkey from the comments. However, I think in our case it is not unreasonable to ask the users to split their identities up, into one identity per file, so that we do not have to worry about this.

Now, with all of those interactions in mind, I find myself asking: What do we want to support and which solution gives us and the users the most flexibility? My first prototype replaced the masterIdentityArgs implementation in nix/lib.nix with something along these lines:

mergedMasterIdentities = mergeArray (x: x.config.age.rekey.masterIdentities or []);

masterIdentityArgs = let
  parseRecipients = with pkgs; writeShellApplication {
      name = "parse-recipients";
      runtimeInputs = [ gnugrep ];
      text = readFile ./parse-recipients.sh;
    };
in ''$(
  ${parseRecipients}/bin/parse-recipients ${
    concatStringsSep " " (map (x: toString (escapeShellArg x)) mergedMasterIdentities)
  }
)'';

./parse-recipients.sh creates the actual list of arguments for (r)age, taking care to avoid skipping identities during encryption if no explicit pubkey can be extracted:

# Collect final arguments in an array
args=()
for file in "$@"; do
  # Only consider files that are a Yubikey identity.
  if grep -q "^AGE-PLUGIN-YUBIKEY-" "$file"; then
    # If the file does not specify "Recipient: age1yubikey1<pubkey>", just use the identity itself.
    # Otherwise extract recipient and specify with "-r".
    if pubkeys=$(grep 'Recipient: age1yubikey1' "$file" | grep -Eoh 'age1yubikey1[0-9a-z]+'); then
      for pubkey in $pubkeys; do
        args+=("-r $pubkey")
      done
    else
      args+=("-i $file")
    fi
  else
    args+=("-i $file")
  fi
done

echo "${args[@]}"

But while this adds support for age-plugin-yubikey, it does little to improve interactions for the other types of identities. Furthermore, this approach would require adding explicit support for every type of identity file that needs special treatment.

In my mind, it would make a lot more sense to just let the users specify the pubkey (or a script that echos the pubkey) directly, which would allow them to implement arbitrary custom solutions, if required. We can of course always supply some helper functions that can automatically extract the pubkey from explicitly supported identity files.

What are your thoughts on this, do we focus on a specialized solution for the yubikey plugin or aim for an architecture with broader support, with some helpers for the common use cases?

Ultimately this means that this is still an issue we should work around and that I can test for, even though I can currently not reproduce the original behavior.

Very interesting, maybe I should acquire a second yubikey so I can also try to test this,

Instead of Recipients: age1yubikey1<pubkey>, we have public key: age1<pubkey>.

Fair, we can just search for that too, if you want to support this case.

These have similar problems as Yubikey identities, in that decryption of the identity is required when encrypting a file. For these however, we don't even have a way of extracting a pubkey from the file, since the whole file is encrypted. So in this case we might require an external way to specify the associated pubkey, anyways.

Oh we can add an explicit pubkey specification if you want, I'd just like to make sure that it isn't required for the common default case (i.e. normal yubikey split identities, nothing special).

Apart from age-plugin-yubikey, https://github.com/FiloSottile/awesome-age#plugins lists quite a number of other plugins. I have not looked at them too closely, but this is a good reminder that there might be more complex interactions that could benefit from a more generic approach to specifying the pubkeys. [...] Just for completion: The -i flag of both age and rage technically allows for multiple identities in a single file. This would complicate stuff immensely [...]

As already said, I'd just support extraction for the easy common cases, everything else doesn't need to be automatic. Basically just make sure that a multi identity file still is passed correctly to rage, but if you need to extract the pubkey just skip it or print a warning. No need to introduce complex logic for these cases since they are very niche.

What are your thoughts on this, do we focus on a specialized solution for the yubikey plugin or aim for an architecture with broader support, with some helpers for the common use cases?

I pretty much agree with what you've written. So I'd say introduce the { identity = ./file; pubkey = "..."; } alternative for specifying identities, and always use the pubkey if it is specified that way. If an identity is given directly, try to extract the pubkey if necessary by grepping for Recipient or public key. If that fails, show a warning or whatever is appropriate.

One more thing for when you are trying to create the command line for (r)age: Either create a function that wraps the (r)age call, or create a wrapper script that wraps (r)age, whatever is more convenient. But don't try to return the parsed argument list via some form of $(parse-recipients ...), because then you will immediately run into bash argument splitting issues. You probably know the drill, as soon as a file or path contains a space we are going to have a problem :D

[...] I'd just like to make sure that it isn't required for the common default case (i.e. normal yubikey split identities, nothing special).

Ah, I think I understand your vision now, makes sense. Then I'll add the dedicated Yubikey support to make it work out of the box, alongside with the generic pubkey support for more general use.

Either create a function that wraps the (r)age call, or create a wrapper script that wraps (r)age, whatever is more convenient. But don't try to return the parsed argument list via some form of $(parse-recipients ...), because then you will immediately run into bash argument splitting issues.

True, that really was quite a naive approach, I'll find a proper solution for the actual implementation.

Thank you for the feedback!

oddlama / agenix-rekey