input-output-hk / prism-did-method-spec

Apache License 2.0

Extend PRISM DID Method to include network reference #58

Open bsandmann opened 4 months ago

bsandmann commented 4 months ago

Proposed feature

The current PRISM DID method does not include a reference to the underlying VDR on which the DID was published. This was a reasonable approach in the past but will cause issues going forward when PRISM is expanded to other chains. The proposal is to publish a new version of the PRISM spec (1.1 or 2.0?) to accommodate this change. The new PRISM DID would then look like this: did:prism:cardano:preprod:123.

Feature description

Why is the current specification an issue?

Creating a PRISM DID is independent of the VDR. This means a DID, once created in memory, can be published on multiple networks and later updated independently on each VDR. Currently, this situation is mostly unproblematic, since the Cardano mainnet can be treated as the source of truth in case the same DID does exist on different networks at once. With the outlook to make PRISM agnostic to the VDR, this will cause a problem. As it stands today, we might be facing this challenge earlier than expected, as Midnight is already available for testing purposes. An important point to highlight here is that having the same DID on multiple networks is not only a source of confusion but could also pose a major security concern, since a node-independent resolver cannot determine the single source of truth.
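To make the divergence concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Ledger` class, the operation dictionaries, and the key names are illustrative stand-ins, not the PRISM protobuf operations or any real node API. It only shows how the same identifier, published on two networks and then updated on one of them, resolves to two different states.

```python
from typing import Dict, List, Set


class Ledger:
    """Hypothetical in-memory stand-in for one VDR (e.g. one Cardano network)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.operations: Dict[str, List[dict]] = {}

    def publish(self, did: str, operation: dict) -> None:
        # Append the operation to this ledger's history for the DID.
        self.operations.setdefault(did, []).append(operation)

    def resolve_keys(self, did: str) -> Set[str]:
        # Replay the history to compute the current key set on this ledger.
        keys: Set[str] = set()
        for op in self.operations.get(did, []):
            if op["type"] in ("create", "add_key"):
                keys.add(op["key"])
            elif op["type"] == "remove_key":
                keys.discard(op["key"])
        return keys


did = "did:prism:123"  # same identifier on both networks (no network reference)
create = {"type": "create", "key": "key-1"}

mainnet = Ledger("cardano-mainnet")
preprod = Ledger("cardano-preprod")

# The same create operation is published on both networks.
mainnet.publish(did, create)
preprod.publish(did, create)

# Later, the DID is updated on only one of them.
preprod.publish(did, {"type": "add_key", "key": "key-2"})

mainnet_keys = mainnet.resolve_keys(did)
preprod_keys = preprod.resolve_keys(did)
print(mainnet_keys == preprod_keys)
```

A resolver that is handed only `did:prism:123` has nothing in the identifier itself that tells it which of the two states to trust.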

Naming Options

While most other DID methods handle the network reference with just a single string delimited by a colon, e.g., did:prism:preprod, this might not be sufficient for our use-case when expanding to other chains. To keep the DID as simple as possible, one could imagine a naming convention to keep the name short, e.g., did:prism:cp (for Cardano-preprod) or "mm" (for Midnight mainnet). The more understandable alternative would be to use the full name, e.g., did:prism:midnight:mainnet. This would also be aligned with what Indy is currently doing, e.g., did:indy:idunion:test.
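As a rough illustration of the long-name option, the following Python sketch parses both the current form (`did:prism:<id>`) and the proposed form (`did:prism:<chain>:<network>:<id>`). The `PrismDid` type and `parse_prism_did` function are hypothetical names introduced here, and the sketch deliberately handles only the two short forms discussed in this issue; any other segment count is rejected.

```python
from typing import NamedTuple, Optional


class PrismDid(NamedTuple):
    chain: Optional[str]    # e.g. "cardano" or "midnight"; None for the current form
    network: Optional[str]  # e.g. "mainnet" or "preprod"; None for the current form
    identifier: str         # the method-specific identifier


def parse_prism_did(did: str) -> PrismDid:
    """Parse did:prism:<id> or the proposed did:prism:<chain>:<network>:<id>."""
    parts = did.split(":")
    if len(parts) < 3 or parts[0] != "did" or parts[1] != "prism":
        raise ValueError(f"not a did:prism DID: {did!r}")
    rest = parts[2:]
    if any(segment == "" for segment in rest):
        raise ValueError(f"empty segment in DID: {did!r}")
    if len(rest) == 1:
        # Current form: no network reference at all.
        return PrismDid(None, None, rest[0])
    if len(rest) == 3:
        # Proposed form: chain and network precede the identifier.
        return PrismDid(rest[0], rest[1], rest[2])
    raise ValueError(f"unsupported number of segments in DID: {did!r}")


print(parse_prism_did("did:prism:cardano:preprod:123"))
print(parse_prism_did("did:prism:123"))
```

With two separate segments, the chain and the network can be validated independently, which a combined short code such as "cp" or "mm" would not allow without a lookup table.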

Implications

The change would affect nearly every component (SDKs, agent, node) and therefore requires a coordinated approach. A possibility would be to support both versions of the spec for some time and then remove support for the old spec after all components have completed the migration. While most changes are pretty light, there might be a potential issue with a change required to the protobuf definition and therefore the node. I haven’t looked into that in detail, so this has to be evaluated by someone. One option would be not to make the change in the node itself and work around that for now, which seems to be feasible at first glance. On the other hand, this change might open the door to a larger rework of the PRISM spec, in light of a potential Midnight implementation and other possible additions (e.g. "controller", "AlsoKnownAs").

Anything else?

The main reason I’m raising this issue/feature request now is that I believe this change should be done sooner rather than later. The current user base is still very small, and the people noticing this change apart from this group can be counted on one hand. This will change rather quickly: at first, through new projects coming from Catalyst, and later with Lace. I would argue that this has to be completed before the first version of Lace with identity features ("Identity Center") gets into the hands of any (test) users. Making this change to the PRISM spec at a later point in time might cause much greater headaches than doing it now.

This feature request moved over from here

mkbreuningIOHK commented 4 months ago

@patlo-iog @EzequielPostan @lohanspies to review the business value for this request and then decide implementation steps.

FabioPinheiro commented 4 months ago

As far as I remember from the previous discussions about this, we decided that:

I also have other security concerns. We should make it impossible* to have similar DIDs that differ only in the network. For example, IMO did:prism:HASH1 and did:prism:testnet:HASH1 should be impossible. The HASH is calculated based on the first block (the AtalaOperation for the DID) written to the network. So I would propose that all other networks (other than the mainnet) MUST have a reference to the network in the initial AtalaOperation. This will make the DIDs completely different even if they are using the same keys.
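The effect of this proposal can be sketched in a few lines of Python. Note the assumptions: the real method hashes a protobuf-encoded AtalaOperation, whereas this sketch uses a JSON dictionary as a stand-in; the `network` field, the `did_suffix` function, and the use of SHA-256 here are all illustrative choices, not the spec's definitions.

```python
import hashlib
import json
from typing import Optional


def did_suffix(public_key_hex: str, network: Optional[str] = None) -> str:
    """Hash a stand-in 'create' operation to derive the DID suffix.

    Assumption: JSON replaces the real protobuf encoding, and 'network'
    is a hypothetical field that only non-mainnet operations carry.
    """
    operation = {"create_did": {"public_key": public_key_hex}}
    if network is not None:
        operation["network"] = network
    # sort_keys makes the encoding, and therefore the hash, deterministic.
    encoded = json.dumps(operation, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


key = "02aabbcc"  # placeholder key material, same on both networks

mainnet_suffix = did_suffix(key)             # mainnet: no network reference
testnet_suffix = did_suffix(key, "testnet")  # other networks: reference included

print(mainnet_suffix != testnet_suffix)
```

Because the network reference is part of the hashed bytes, the same keys yield unrelated suffixes on different networks, so a pair like did:prism:HASH1 and did:prism:testnet:HASH1 cannot arise from the same initial operation.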

EzequielPostan commented 4 months ago

First of all, thank you for creating an issue! :)

Second, my apologies for the delay, I was away.

Now, responding to the comments.

Creating a PRISM DID is independent of the VDR. This means a DID, once created in memory, can be published on multiple networks and later updated independently on each VDR.

If we look at the spec, the very first sentences in the abstract of the PRISM DID method spec say:

The prism DID method defines data models and protocol rules to create, manage, and resolve Decentralized Identifiers (DIDs). The protocol is defined on top of the Cardano blockchain as its verifiable data registry, where DID's information is stored.

meaning that a DID corresponding to the did:prism DID method, is by definition on the Cardano mainnet network.

[...] Currently, this situation is mostly unproblematic, since the Cardano mainnet can be treated as the source of truth in case the same DID does exist on different networks at once.

It is true that anyone can take the raw protobuf operations and post them somewhere else, but that doesn't create a (published/short-form) DID according to the did:prism DID method. Valid update/deactivate operations of the said DID method are required to be published on Cardano mainnet. It is not recommended for any user to create a valid (signed) Create/Update/Deactivate operation if it is not intended to be posted and processed. So, it is even less recommended to share such an operation publicly.

[...] since a node-independent resolver cannot determine the single source of truth.

I am not following what a "node independent resolver" refers to in this context. Note that the method name (i.e. the "prism" in "did:prism") is precisely what indicates that the DID to resolve is following the PRISM DID method spec, and as a consequence the needed data (operations) are on Cardano mainnet.

Now, taking the general idea of multi-network DID methods. From an architectural point of view, the more underlying chains you add to a DID method, the more costly it is to run. If we extend PRISM DIDs to be publishable on, say, testnet, pre-prod, and mainnet, then a PRISM node would either need to follow (i.e. sync) all those chains to be able to resolve any DID of the DID method, or follow a subset of the chains and not be able to resolve DIDs from the chains it does not follow. It would also bring the issue of "replay attacks" that you (and @FabioPinheiro) are referring to (i.e. posting valid operations from one VDR on another).

With the outlook to make PRISM agnostic to the VDR, this will cause a problem.

Maybe I can clarify something here. PRISM, the system, is not the same as did:prism, the DID method. Any system can rely on multiple DID methods (which may rely on different chains). There is no need to extend an underlying method to support another chain. For example, any agent could simultaneously use did:prism and did:ion DIDs (from Cardano and Bitcoin respectively) without the need to change either the did:prism or the did:ion method. It is just a matter of the said agent resolving each DID with the corresponding DID resolver and posting operations in accordance with each DID method spec. There are specifications and code for a universal resolver and a "universal registrar" to manage DIDs from multiple DID methods in uniform ways.
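The dispatch idea described here can be sketched as follows. This is not the Universal Resolver's API; `MultiMethodResolver` and the two stub resolver functions are hypothetical names, and the returned documents are placeholders. The sketch only shows an agent routing each DID to a resolver chosen by its method name.

```python
from typing import Callable, Dict

# A resolver takes a DID string and returns a (placeholder) DID document.
Resolver = Callable[[str], dict]


class MultiMethodResolver:
    """Hypothetical registry that routes a DID to a per-method resolver."""

    def __init__(self) -> None:
        self._resolvers: Dict[str, Resolver] = {}

    def register(self, method: str, resolver: Resolver) -> None:
        self._resolvers[method] = resolver

    def resolve(self, did: str) -> dict:
        # A DID has the shape did:<method>:<method-specific-id>.
        parts = did.split(":", 2)
        if len(parts) != 3 or parts[0] != "did" or not parts[1] or not parts[2]:
            raise ValueError(f"malformed DID: {did!r}")
        method = parts[1]
        if method not in self._resolvers:
            raise LookupError(f"no resolver registered for method {method!r}")
        return self._resolvers[method](did)


def resolve_prism(did: str) -> dict:
    # Stub: a real resolver would read operations from Cardano.
    return {"id": did, "source": "cardano"}


def resolve_ion(did: str) -> dict:
    # Stub: a real resolver would read operations anchored on Bitcoin.
    return {"id": did, "source": "bitcoin"}


agent_resolver = MultiMethodResolver()
agent_resolver.register("prism", resolve_prism)
agent_resolver.register("ion", resolve_ion)

print(agent_resolver.resolve("did:prism:123"))
print(agent_resolver.resolve("did:ion:abc"))
```

Neither stub needs to know about the other, which is the point being made: the agent composes methods, and the methods themselves stay unchanged.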

In short, supporting different chains/VDRs does not necessarily require a change in the DID method. In the context of Midnight, applications could just use did:prism as is, or another DID method that suits each specific app, or even develop a custom DID method if the app needs one.

We are, of course, open to talk about extending the current PRISM DID method, but I am not sure if your use case reflects a real need. Please feel free to add any clarification and/or point out anything I may be missing.

Once again, thank you for opening the issue!

bsandmann commented 4 months ago

Thank you for the detailed answer @EzequielPostan

I believe the central issue here seems not to be a technical one, but a different understanding of what PRISM can mean, and the business decisions that have been made or ought to be made in the future. Allow me to explain: PRISM can mean multiple things: the brand, the DID-method specification, and the overall codebase. Aligning all three under the term “PRISM” has been coherent in the past, and your answer is a reflection of that: Yes, did:prism is just referring to the mainnet, and yes, the spec only mentions the mainnet. Under the assumption that this will also be the case for the future, the change isn’t necessary at all – I agree. If someone wants to use another VDR, just use another DID method.

But things got stirred up a bit: Part of the codebase is going open-source (Indy Identus?) and Midnight is coming along. These pose challenges to the coherency, and this issue here is trying to be a partial solution to this. The solution I propose here is to keep the PRISM branding (because it is somewhat established by now) and therefore the DID-method name did:prism. From a purely technical perspective, I would agree that it would make sense to just use another DID method for Midnight and other VDRs, to keep the spec clean and the method-namespace clear of any additions. But my fear is that using another DID method for Midnight or any other VDR, but keeping the PRISM brand for other parts, might cause more confusion than we already have.

Changing the spec to allow for different VDRs/networks and making the necessary changes in the related codebases now is a relatively low price to pay, in light of the alternatives. This is especially true because the PRISM spec is, apart from the wording, from a technical perspective quite independent of the VDR.

From an architectural standpoint, the more underlying chains you add to a DID method, the more costly it is to run it.

I wouldn’t really agree with that if we add the namespaces as suggested. We can have different node implementations for different VDRs or networks in general which don't need to know anything about each other: they would simply reject any request which isn’t targeted at their specific namespace. E.g., we have a PRISM Scala node which just handles did:prism:cardano:mainnet operations, another instance of the same node which handles did:prism:cardano:preprod operations, and a third node which might be written in Rust which handles all did:prism:midnight:mainnet operations.
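A minimal sketch of that idea, in Python for brevity: each node instance is bound to exactly one namespace and rejects everything else. The `NamespacedNode` class, the `UnsupportedNamespace` error, and the in-memory document store are hypothetical and stand in for a real node and its VDR.

```python
from typing import Dict, Optional


class UnsupportedNamespace(Exception):
    """Raised when a DID targets a namespace this node does not serve."""


class NamespacedNode:
    """Hypothetical node that serves exactly one did:prism namespace."""

    def __init__(self, chain: str, network: str) -> None:
        self.prefix = f"did:prism:{chain}:{network}:"
        self.documents: Dict[str, dict] = {}  # stand-in for state read from the VDR

    def _check(self, did: str) -> None:
        if not did.startswith(self.prefix):
            raise UnsupportedNamespace(f"{did!r} is outside {self.prefix!r}")

    def publish(self, did: str, document: dict) -> None:
        self._check(did)
        self.documents[did] = document

    def resolve(self, did: str) -> Optional[dict]:
        self._check(did)
        return self.documents.get(did)


cardano_mainnet = NamespacedNode("cardano", "mainnet")
midnight_mainnet = NamespacedNode("midnight", "mainnet")

cardano_mainnet.publish("did:prism:cardano:mainnet:123", {"keys": ["key-1"]})
print(cardano_mainnet.resolve("did:prism:cardano:mainnet:123"))

try:
    midnight_mainnet.resolve("did:prism:cardano:mainnet:123")
except UnsupportedNamespace as error:
    print("rejected:", error)
```

Each node only has to sync its own chain; routing a DID to the right node is then a matter of reading the namespace prefix.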

In the context of Midnight, applications could just use did:prism as is

Wouldn't this exactly cause the architectural issue you are describing? We would need to have a single node keeping up with both networks.

Getting back to the main point: While I believe we can all find common ground on the technical or even architectural implementation, the core of the issue is a business decision on how to handle the PRISM brand and how the relationship between the PRISM brand, the spec, the public codebase (Identus?), the closed-source products and the different VDRs should look going forward. Maybe it would even be a consideration to introduce did:identus as the future spec?

EzequielPostan commented 3 months ago

Once again, apologies for the delay on my part

I will first comment on the technical side.

I wouldn’t really agree with that if we add the namespaces as suggested. We can have different node implementations for different VDRs or networks in general which don't need to know anything about each other

The thing is that, even if you have namespaces like did:prism:network1:... and did:prism:network2:..., both DIDs would still belong to the same DID method, namely did:prism. This is important to note because, if you have multiple chains, the guarantees each chain offers may be different. Different chains have different indexing costs, throughput, latency, fees, settlement time, rollback guarantees, data availability guarantees, and so on. As a consequence, the DIDs "hosted" under each chain will end up having different properties too.

Now, I raise the above because the technical goal behind having a DID method is to guarantee properties about the DIDs it supports. Making the method multi-chain makes the general properties (the ones that apply to all the DIDs in the DID method) weaker.

On your question of the architectural issue:

Wouldn't this exactly cause the architectural issue you are describing? We would need to have a single node keeping up with both networks.

Correct, the issue of supporting N DID methods is, in structure, the same as supporting multiple networks under a single method. However, the SSI space already structured the universal resolver as a solution for the former. For interoperability, the system will eventually need to use more DID methods, which will most likely rely on the universal resolver. Re-creating this issue at the DID method level does not seem to offer many technical advantages over using the solution already adopted in the space.

Now, on branding and naming. I can understand that there may be challenges in keeping things clean and coherent in this area. However, I am still inclined to believe that this issue should be tackled on the branding side, and not on the spec side.

Please let me know if I can add anything else to this conversation. I hope my answer helped here.

And, once again, thank you for taking the time to post your comments.

bsandmann commented 3 months ago

@EzequielPostan Thank you for your detailed response, and now I must apologize for my delayed reply. While I certainly agree with some points you've raised, I'd like to offer a different perspective.

In your argument, you closely associate the DID method with its underlying VDR. I am not convinced that this association is necessary, nor is it intended by the DID-core spec itself. Rather than defining the properties of the VDR, the DID-core spec (as well as the PRISM spec) focuses on the feature set provided (i.e., the data model, representation, and available operations). The spec does not detail the assumptions one might have about the properties of the VDR.

Different chains have different indexing costs, throughput, latency, fees, settlement time, rollback guarantees, data availability guarantees, and so on.

Indeed, different chains have different properties. However, I don't see a direct link between the DID method itself and the properties of the VDR. It's reasonable to have different expectations for different networks, as is already the case with other chains. For instance, did:indy:indicio and did:indy:sovrin both refer to mainnet networks with production data but have different VDR properties, costs, governance structures, and even feature sets (e.g., Sovrin does not support arbitrary DID-document data). Thus, shifting assumptions from the method to the network level is not unprecedented and offers considerable flexibility depending on one's requirements. For example, one could even envision a did:prism:iog network, which is a permissioned network similar to Sovrin, removing the need for Trust-Registries for certain use cases. Moreover, the expectations one might have of the underlying VDR are not part of the spec – and of course they shouldn’t be. That said, the spec defines the feature set independent of the VDR. And having a common feature set that spans multiple VDRs can be a very useful approach.

the SSI space already structured the universal resolver as a solution for [this]

I don’t think so. The universal resolver/registrar doesn’t support a unified feature set. In many cases, it's simply not possible to take the same DID and anchor it on a different VDR. Taking the Indy network as an example again: You can't take a random DID-Document written on Indy (e.g., indy:indicio, indy:bcovrin, indy:idunion) and write the same information as a PRISM DID because the PRISM specification doesn't capture properties like controlledBy or AlsoKnownAs. The same is true in reverse; for instance, you can't represent any PRISM-DID on the indy:sovrin network. My point is, while the universal resolver/registrar offers interoperability to a certain degree, it doesn't provide feature parity, even if it might seem so at first glance. It offers a unified interface, but not the possibility to establish a unified feature set for DID-methods. While I'm not advocating for extending PRISM across all available VDRs, I see a benefit in extending it to specific chains (e.g., Midnight) in which we may have an interest, rather than duplicating the specification.

I still incline myself to believe that this issue should be tackled on a branding side, and not a spec side.

I believe the strategic vision for PRISM should be the major factor here. In case IOG/Atala decides to maintain the PRISM brand and offers products that are VDR agnostic, it would also make sense to have a DID method with the same name that is likewise agnostic, providing the same feature set seamlessly across multiple VDRs. This approach still leaves room for the universal resolver/registrar approach for those chains where IOG/Atala has less vested interest.

I'm playing a bit of devil's advocate here, as I believe both arguments on handling different networks have their merits. In my opinion the deciding factor isn’t purely technical but rather depends on the direction IOG/Atala wishes to take product-wise.

Aside from the discussion on a VDR-agnostic PRISM method, there's still the matter of the necessity for a preprod/mainnet namespace. While I’m divided on the VDR topic, I’m strongly in favor of adding the namespace for at least the preprod network, to be more explicit, avoid confusion and close the door to potential security issues.