openwallet-foundation / acapy

ACA-Py is a foundation for building decentralized identity applications and services running in non-mobile environments.
https://aca-py.org
Apache License 2.0
419 stars, 512 forks

DID Management Proposed Update #3343

Open dbluhm opened 2 days ago

dbluhm commented 2 days ago

In light of our current push to add support for more DID Methods to ACA-Py, some primitives within ACA-Py need some updates. This issue outlines the updates I propose. I plan to update this further as the topic is discussed or as implementations better inform decisions.

Proposed Updates

DID Storage (updating DIDInfo)

Current State

At present, the DIDInfo object looks like this:

https://github.com/openwallet-foundation/acapy/blob/f5c49b0710dd180ea31c45f73bc82ef06f9523b4/acapy_agent/wallet/did_info.py#L20-L29

This is stored in the wallet with a category of did, the primary identifier being the DID value, and the following tags:

As currently used, metadata will include:

Evaluation

As is plain to see, the structure, tags, and metadata of the DIDInfo object are very Indy-oriented. This structure has been in use for years.

Currently, ACA-Py retrieves a DIDInfo object in order to use the key associated with the "DID." It takes a "DID" as input (in practice usually a "nym" value, i.e. 16 bytes encoded as base58 without a did: prefix), then uses the verkey value to retrieve a Key object, which it can then use to create a signature or pack a DIDComm message.
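A rough sketch of that lookup chain, to make the one-key-per-DID assumption visible. This is not ACA-Py's actual code: the `DIDInfo` class here is a simplified stand-in with only the fields discussed above, and `did_records`, `key_records`, `key_for_did`, and all values are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class DIDInfo:
    """Simplified stand-in for the DIDInfo record (assumption, not the real class)."""
    did: str
    verkey: str
    metadata: dict = field(default_factory=dict)


# Hypothetical in-memory stand-ins for the wallet's "did" category and its key store.
did_records = {"WgWxqztrNooG92RXvxSTWv": DIDInfo("WgWxqztrNooG92RXvxSTWv", "verkey-1")}
key_records = {"verkey-1": "<key handle for verkey-1>"}


def key_for_did(did: str) -> str:
    """Resolve DID (nym) -> DIDInfo -> verkey -> key. Exactly one key per DID."""
    info = did_records[did]
    return key_records[info.verkey]
```

Because the DID record carries a single verkey, there is no place in this chain to pick between several keys for the same DID.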

Solution

DIDs should have multiple keys associated with them rather than a single key. To achieve this while also having an efficient lookup mechanism, we should reorient our storage as outlined below.

Quick background on Askar

Askar is a secure storage solution used by ACA-Py. Askar encrypts all data and provides a tagging mechanism to enable lookup of encrypted records. An entry in Askar is composed of the following elements:

Askar has a dedicated API for storage and retrieval of keys. However, this API is conceptually just a shorthand for record storage and retrieval from a "private" key category with the key itself as the value of the entry. Key entries behave almost exactly the same as non-key entries, including names and tags.

Key Storage

Building off of Patrick's contributions of managing keys by multikey instead of "verkey," the multikey representation of a key should be the default identifier for keys in the wallet.

These sets of tags enable us to look up keys with a combination of did and rel; when these tags are lists, Askar will return all keys that contain the tag filter value in their respective list. This permits the controller to continue to specify just a DID as the issuer/signer/sender of a value without having to know exactly which key ACA-Py should use to perform the operation. This also permits the controller to continue to use the verification method ID directly to specify a key that might not normally be selected first. Additionally, when a specific proof type is desired, Askar can also filter by KeyAlg so a simple mapping from proof type to appropriate KeyAlgs can efficiently accomplish this filtering.
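To illustrate the lookup behavior described above, here is a minimal in-memory sketch. It does not use the Askar API; it only models the matching rule (a filter value matches if it appears in a list-valued tag). The `did` and `rel` tag names come from the proposal, while `vm_id`, `KeyEntry`, `PROOF_TYPE_ALGS`, `find_keys`, and the key names are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KeyEntry:
    name: str  # multikey representation of the key (placeholder values below)
    alg: str   # key algorithm, standing in for Askar's KeyAlg
    tags: dict = field(default_factory=dict)


# Illustrative mapping from proof type to acceptable key algorithms.
PROOF_TYPE_ALGS = {"Ed25519Signature2020": {"ed25519"}}

WALLET = [
    KeyEntry("z6MkExampleSigningKey", "ed25519", {
        "did": ["did:example:alice"],
        "rel": ["authentication", "assertionMethod"],
        "vm_id": ["did:example:alice#key-1"],
    }),
    KeyEntry("z6LSExampleAgreementKey", "x25519", {
        "did": ["did:example:alice"],
        "rel": ["keyAgreement"],
        "vm_id": ["did:example:alice#key-2"],
    }),
]


def tag_matches(entry: KeyEntry, tag_filter: dict) -> bool:
    """Every filter value must equal the tag or be contained in its list."""
    for tag, wanted in tag_filter.items():
        have = entry.tags.get(tag, [])
        values = have if isinstance(have, list) else [have]
        if wanted not in values:
            return False
    return True


def find_keys(entries, tag_filter, proof_type=None):
    """Return names of keys matching the tags and, optionally, the proof type."""
    algs = PROOF_TYPE_ALGS[proof_type] if proof_type else None
    return [
        e.name for e in entries
        if tag_matches(e, tag_filter) and (algs is None or e.alg in algs)
    ]
```

With this shape, a controller can pass just a DID plus a purpose, e.g. `find_keys(WALLET, {"did": "did:example:alice", "rel": "assertionMethod"})`, or pin a specific key with `{"vm_id": "did:example:alice#key-2"}`.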

DID Storage

DID storage should change so that a DID record simply acknowledges that we own the DID, rather than serving as the primary key retrieval mechanism.

Migration

Existing ACA-Py wallets should have the following migration performed to accommodate this reorientation:

ankurdotb commented 2 days ago

Just thinking out loud, coming from how key management has been done in other wallets I'm familiar with like Veramo and blockchain wallets: they often allow keys to be referenced using friendly or "alias" names, e.g., when storing in keystore, I name a key as ankurs-main-key or ankurs-secondary-key. Note that this is just a keystore reference, not how the key needs to be referenced in the DID Document, e.g., while the specified key is ankurs-main-key, when set into a DID Document in the verification method/authentication section, it might be referenced/published as key-1.

Here's how Veramo currently stores keys when storing in databases:

[Two screenshots of Veramo's key storage tables: "Screenshot 2024-11-19 at 18 23 12" and "Screenshot 2024-11-19 at 18 23 19"]

Veramo stores private keys separately from public keys, storing them in a hex representation with an alias as the primary key. Personally, I'd store a unique key ID and then make alias a secondary unique constraint (might be useful in key deprecation/deletion scenarios, to only allow one "active" key to have a unique alias, but keep deprecated keys still referenceable by a key ID and their known alias).
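One way to sketch the "unique key ID plus alias unique among active keys" idea is a partial unique index, shown here with SQLite from Python's standard library. This is not Veramo's schema; the table name, column names, and key IDs are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE private_key ("
    " key_id TEXT PRIMARY KEY,"
    " alias TEXT NOT NULL,"
    " active INTEGER NOT NULL DEFAULT 1)"
)
# The alias only has to be unique among active keys.
conn.execute(
    "CREATE UNIQUE INDEX one_active_alias ON private_key(alias) WHERE active = 1"
)

conn.execute("INSERT INTO private_key VALUES ('k1', 'ankurs-main-key', 1)")
# Deprecate k1; it stays referenceable by key_id and by its old alias.
conn.execute("UPDATE private_key SET active = 0 WHERE key_id = 'k1'")
# A replacement key can now take over the alias.
conn.execute("INSERT INTO private_key VALUES ('k2', 'ankurs-main-key', 1)")

try:
    # A second active key under the same alias violates the partial index.
    conn.execute("INSERT INTO private_key VALUES ('k3', 'ankurs-main-key', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows_for_alias = conn.execute(
    "SELECT key_id, active FROM private_key WHERE alias = 'ankurs-main-key' ORDER BY key_id"
).fetchall()
```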

The public key table is a bit closer to what you described above @dbluhm. This table makes the assumption that keys are used as controllers/authentication for DIDDocs only, but I can see that perhaps the addition of a relationship property could accommodate the idea you were talking about as well as storing what key fragment it's known by in that DIDDoc.

I don't know how common it is for people to reuse keys across DIDDocs; I suspect not for primary controller/authentication, but perhaps for other key agreement properties. So I do think that when keys are stored, there shouldn't be a hard assumption that a key is only used by/linked to one DIDDoc, or only one type of relationship (since the same key might be used in auth as well as assertionMethod as well as...)

dbluhm commented 2 days ago

I was thinking through this some more and made an important discovery/re-discovery: I was mistaken in thinking that tags could only be string values. A list of strings is also accepted, enabling us to tag a key with as many DIDs, aliases, verification method IDs, and relationships as we want without having to worry about lookup challenges. I'll update the proposal with this info.

edit: the proposal has been updated

dbluhm commented 2 days ago

The updated proposal is better than the starting point but introduces a new problem: the structure assumes that the verification relationship of a key is the same for all the DIDs it's assigned to.

I like the simplicity of being able to directly tag and query keys, but to enable multiple DIDs to use the same key in different ways, I think we would have to have a separate category of verification method records:

Performing an operation with a key that we identify by VM ID or by DID + relationship/purpose would require a two-step process: first look up the VM by tags, then look up the key by ID/name.
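A minimal sketch of that two-step lookup, using plain dictionaries rather than the Askar API. The record fields (`name`, `did`, `rel`, `key`), the helper `key_for`, and all identifiers are assumptions for illustration; the point is only that one key can serve two DIDs under different relationships.

```python
# Keys stored once, named by their multikey representation (placeholder value).
keys = {"z6MkSharedExampleKey": "<key material>"}

# Hypothetical verification method records; each points at a key by name.
vm_records = [
    {"name": "did:example:a#key-1", "did": "did:example:a",
     "rel": ["authentication"], "key": "z6MkSharedExampleKey"},
    {"name": "did:example:b#key-1", "did": "did:example:b",
     "rel": ["assertionMethod"], "key": "z6MkSharedExampleKey"},
]


def key_for(did: str, rel: str):
    """Step 1: find the VM record by tags. Step 2: fetch the key by name."""
    for vm in vm_records:
        if vm["did"] == did and rel in vm["rel"]:
            return vm["name"], keys[vm["key"]]
    raise LookupError(f"no verification method for {did} with {rel}")
```

Here `key_for("did:example:a", "authentication")` resolves to the shared key, while asking for `authentication` on `did:example:b` fails, even though both DIDs use the same key material.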

dbluhm commented 13 hours ago

Yet another lookup mechanism idea. Instead of having a layer of indirection with the VM records as proposed in my previous comment, we could just have multiple key records that contain the same key material and are just tagged with different values. I think this means we would end up with two different kinds of "key records": one where the name is the multikey representation and one where the name is the verification method ID. The record with the multikey name would be created for keys that are not (yet) bound to a DID; the other would be for keys that are bound to a DID. Whether the same key exists in both representations depends on the DID method we're creating a DID for. If we need to know the complete DID Document before the DID can be created, then we'll end up with unbound keys until the ID of the DID is known. For example, in did:peer, we have to generate our keys before we know the DID itself, since the key material contributes to the DID creation. With did:web, as a counterexample, the DID is known (or can be known) before the keys associated with it are created, so keys in a did:web document could be created with a known verification method ID from the start.

Unbound keys

Bound Keys

This is essentially the same proposal as the separate VM record proposal, with the critical difference that we just have multiple key entries for the same key material, so we don't have to do a two-step lookup.
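A rough sketch of the bound/unbound idea, again with an in-memory dictionary instead of the Askar API. The helper names (`create_unbound`, `bind`, `find_bound`), the record layout, and all identifiers are hypothetical.

```python
store = {}  # record name -> {"value": key material, "tags": {...}}


def create_unbound(multikey: str, material: str) -> None:
    """Unbound key: the record name is the multikey, with no DID tags yet."""
    store[multikey] = {"value": material, "tags": {}}


def bind(multikey: str, vm_id: str, did: str, rels: list) -> None:
    """Bound key: a second record with the same material, named by VM ID."""
    store[vm_id] = {
        "value": store[multikey]["value"],
        "tags": {"did": did, "rel": list(rels)},
    }


def find_bound(did: str, rel: str) -> list:
    """Single-step lookup: matching records already hold the key material."""
    return [
        name for name, rec in store.items()
        if rec["tags"].get("did") == did and rel in rec["tags"].get("rel", [])
    ]


# Keys are generated before the DID is known, then bound once it is.
create_unbound("z6MkExampleKey", "<key material>")
bind("z6MkExampleKey", "did:example:a#key-1", "did:example:a", ["authentication"])
bind("z6MkExampleKey", "did:example:b#key-1", "did:example:b", ["assertionMethod"])
```

The trade-off in this sketch is duplicated key material across records in exchange for resolving a VM ID or DID + relationship in one query.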