w3c-ccg / did-spec

Please see README.md for latest version being developed by W3C DID WG.
https://w3c.github.io/did-core/

consider requiring id to identify immutable objects #238

Closed dhh1128 closed 4 years ago

dhh1128 commented 4 years ago

A concern that @msporny has raised several times, including here, is that relative URIs inside a DID doc may be dangerous because they can create ambiguity.

I'm not convinced of this. HTML's ambiguity (<a href="foo">, <form name="foo">, <a resource="foo"> combined with "#foo" as a reference) is not a problem in DID Docs if we keep the current base resolution algorithm, as Manu points out here.

But I believe there is one ambiguity that's currently possible, and that clearly has the potential for mischief: temporal ambiguity. If I read the spec right, it is possible for someone to define a DID doc in a way that the ID for a particular key (let's call it ID X) means Y at one point in time, and then changes to mean Z at a different point in time. This is dangerous, as hackers could manipulate which point in time is used by different parties. It also makes caching strategies even more problematic and complex than they already have to be.

I suggest that we add a sentence to the spec requiring the JSON object identified by an id property in a DID doc to be immutable, meaning that nothing about that object's properties can change once declared. This requirement is accidentally true of several DID methods, and I made it an explicit property of did:peer that I've been working on. I don't think it introduces any meaningful hardships on DID users or developers, though we should satisfy ourselves that this is the case before updating the language.
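To make the proposed rule concrete, here is a minimal sketch of how a resolver might detect a violation between two versions of a DID doc. The function names are hypothetical, and skipping the top-level document object (whose contents obviously change while its id stays the same) is an assumption of this sketch, not something the spec says.

```python
def index_by_id(node, found):
    """Recursively collect every nested JSON object that carries an "id"."""
    if isinstance(node, dict):
        if "id" in node:
            found[node["id"]] = node
        for value in node.values():
            index_by_id(value, found)
    elif isinstance(node, list):
        for item in node:
            index_by_id(item, found)
    return found


def index_children(did_doc):
    """Index objects below the top level (assumption: the root doc is exempt)."""
    found = {}
    for value in did_doc.values():
        index_by_id(value, found)
    return found


def mutated_ids(old_doc, new_doc):
    """Return ids present in both versions whose objects are not identical."""
    old, new = index_children(old_doc), index_children(new_doc)
    return sorted(i for i in old.keys() & new.keys() if old[i] != new[i])


v1 = {"id": "did:example:123",
      "publicKey": [{"id": "#key-1", "type": "Ed25519VerificationKey2018",
                     "publicKeyBase58": "abc"}]}
v2 = {"id": "did:example:123",
      "publicKey": [{"id": "#key-1", "type": "Ed25519VerificationKey2018",
                     "publicKeyBase58": "xyz"}]}
print(mutated_ids(v1, v2))  # the key with id "#key-1" was rewritten in place
```

Under the proposed rule, any non-empty result would mean the newer version is invalid; adding or deleting an id would still be allowed.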

dhh1128 commented 4 years ago

If we end up agreeing that we should do this, I'd like to raise a PR. But I'll wait and see what the team thinks first.

peacekeeper commented 4 years ago

But I believe there is one ambiguity that's currently possible, and that clearly has the potential for mischief: temporal ambiguity.

@dhh1128 What you call "temporal ambiguity" could also be called "portability". The fact that a DID URL can be dereferenced to different resources over time has so far mostly been considered a feature, not a bug.

Perhaps for keys, I could see how this is more dangerous than it is useful, and I can see how we may want to recommend (or require?) that DID URLs for keys are unique (also see this issue with good discussion by @kdenhartog: https://github.com/w3c-ccg/did-resolution/issues/37).

But for services, there are clear use cases for having persistent DID URLs that are dereferenced to different service endpoints over time.

This topic was also my first item of feedback to the Peer DID Method spec in the new Google Group (not sure if that is public?).

dhh1128 commented 4 years ago

I definitely agree that it's desirable for a person to change the URI associated with a service exposed by their DID.

The stronger form of this claim is that it should be possible to do this using the same service id. I am somewhat more hesitant about the virtue of this claim.

The weak form of the claim could be satisfied as follows:

time[0]: DID Doc exposes a service of type "DIF Hub" at URI abc,
as described in a JSON object with id=X

time[1]: DID Doc exposes a service of type "DIF Hub" at URI bcd,
as described in a JSON object with id=Y

The strong form would be nearly identical, except that there is no id=Y:

time[0]: DID Doc exposes a service of type "DIF Hub" at URI abc,
as described in a JSON object with id=X

time[1]: DID Doc exposes a service of type "DIF Hub" at URI bcd,
as described in a JSON object with id=X

Both forms let you change your "DIF Hub" endpoint. The difference is whether the ID remains constant as you do so. We already have the most important indirection, which is the DID Doc itself. That allows metadata about the DID to change over time, while holding the DID constant. Adding a further requirement that items of metadata should also have an individual indirection property feels superfluous (besides being dangerous as I suggested above).

But I may be making an important assumption that is coloring my perspective, that needs to be walked back. It is this: I am assuming that the key characteristic of a service is its type, not its id. Example 11 in the DID spec shows a whole bunch of service endpoints--but notably, it doesn't show two of the same type. If that one-service-per-type characteristic is going to always be true, then the invariant across edits to a service isn't really the id property--it's the type. "My hub endpoint" is what really needs to stay invariant, not "the endpoint with id=X".

I would point out that if you make id the invariant property, then I can change the endpoint's type in the future. Talking to a service with id=X today doesn't mean you can talk to it tomorrow, because besides its URI changing, its type might change to something you don't support.

You might say, "Oh yes, totally, we can define multiple service endpoints of the same type." But can we? If such a DID doc is created, what metadata does it provide to help someone decide which "DIF Hub" endpoint to talk to? Can I talk to one today, and one tomorrow--or do I need to pick the same one consistently, due to cumulative state? Is one preferred over the other--and if so, why, and how much?

In Hyperledger Aries discussions, we've been assuming that the URI in a service object could be an array (or unordered set), where all of the items in the sequence are guaranteed to be semantically interchangeable--the consumer of DID doc data can select any of them and expect the same behavior. This explicitly answers the questions I just threw out, and it means that you never need to have more than one service of a given type. Type is the invariant property, not id.
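A rough sketch of what "type is the invariant" could look like for a consumer of DID doc data, assuming at most one service per type and a serviceEndpoint that is either a single URI or a list of interchangeable URIs. The function names and the example URIs are hypothetical.

```python
import random


def endpoints_for_type(did_doc, service_type):
    """Return the interchangeable URIs of the single service with this type."""
    matches = [s for s in did_doc.get("service", [])
               if s.get("type") == service_type]
    if len(matches) > 1:
        # Assumption from the discussion above: one service per type.
        raise ValueError(f"more than one service of type {service_type!r}")
    if not matches:
        return []
    uris = matches[0]["serviceEndpoint"]
    # Accept both the single-URI form and the array form.
    return [uris] if isinstance(uris, str) else list(uris)


def pick_endpoint(did_doc, service_type, rng=random):
    """Pick any URI; all of them are assumed to behave identically."""
    uris = endpoints_for_type(did_doc, service_type)
    if not uris:
        raise LookupError(f"no service of type {service_type!r}")
    return rng.choice(uris)


doc = {"id": "did:example:123",
       "service": [{"id": "#hub", "type": "DIF Hub",
                    "serviceEndpoint": ["https://hub1.example.com",
                                        "https://hub2.example.com"]}]}
print(pick_endpoint(doc, "DIF Hub"))
```

The caller never mentions an id here; it only asks for "my hub endpoint" by type.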

^^ @telegramsam @tplooker @kdenhartog (who have been discussing the service block in DIDComm decorators)

peacekeeper commented 4 years ago

So even if theoretically we agreed that "Type is the invariant property, not id", wouldn't all your original concerns about temporal ambiguity (hackers manipulating time, caching strategies) still apply in the same way?

In other words: If instead of referencing "the endpoint with id=X", we decide to reference "My hub endpoint", how is the time factor any less dangerous?

dhh1128 commented 4 years ago

If type is invariant, then the way you update what is semantically "my DIF hub endpoint" is not to edit it. Instead, it's to delete the old item and to introduce a new one that has the same type but a different id property. When you do this, you explicitly invalidate all caches (which are id-based). The temporal duration of a particular endpoint is exactly and only the duration of the existence of its id in the DID doc; if you cross a boundary where an id comes or goes, you know you're dealing with a new state. There's only one version, ever, of a given JSON-LD node in the doc.
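A minimal sketch of the id-based cache this implies, assuming services are listed under a "service" array and that objects are never edited in place. The class name and its methods are hypothetical.

```python
class IdCache:
    """Cache of service objects keyed by id; entries are never overwritten."""

    def __init__(self):
        self._entries = {}

    def refresh(self, did_doc):
        """Sync with a new DID doc version and return the ids that were evicted."""
        current = {s["id"]: s for s in did_doc.get("service", [])}
        evicted = sorted(set(self._entries) - set(current))
        for service_id in evicted:
            del self._entries[service_id]
        for service_id, obj in current.items():
            # Under the immutability rule a known id never needs re-reading.
            self._entries.setdefault(service_id, obj)
        return evicted

    def get(self, service_id):
        return self._entries.get(service_id)


cache = IdCache()
t0 = {"service": [{"id": "X", "type": "DIF Hub", "serviceEndpoint": "abc"}]}
t1 = {"service": [{"id": "Y", "type": "DIF Hub", "serviceEndpoint": "bcd"}]}
cache.refresh(t0)
print(cache.refresh(t1))  # id X disappeared, so its cache entry is dropped
```

The only question the cache asks of a new version is which ids are still present; it never has to compare the contents of an entry it already holds.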

peacekeeper commented 4 years ago

Hmm to me this sounds like you may be proposing new limitations for the DID Document data model, in order to satisfy a need (CRDT) that is specific to the peer method. Perhaps you could introduce a new property for the CRDT functionality which is different from id (e.g. crdt-id), in order to separate the two layers of functionality in a clean way?

talltree commented 4 years ago

@dhh1128 We've actually had a use case since the start of the DID spec work three years ago that shows why you'd want to keep the id stable for a service object whose service endpoint URI value changes. The pattern for that kind of a DID URL (in this case for a blog service) looks like this:

did:example:12345;service=blog/path/post?query

The idea is that the DID subject could switch blog hosting providers but all of the DID URLs that worked with his/her previous hosting service would keep working. As you put it, the DID URL and the DID document provide the abstraction layer that enables persistence, i.e., links not breaking.

So in this case, you would explicitly want to keep the same id value for a service object.
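As an illustration of that pattern, here is a minimal sketch of dereferencing such a DID URL. It assumes the service id in the DID doc ends in a fragment that matches the service name, and that the path and query are simply appended to the service endpoint; the function name and the hosting URLs are hypothetical.

```python
def dereference_service_url(did_url, did_doc):
    """Map did:...;service=<name>/<path>?<query> onto the current endpoint."""
    did, marker, rest = did_url.partition(";service=")
    if not marker:
        raise ValueError("DID URL has no ;service= parameter")
    rest, has_query, query = rest.partition("?")
    name, has_path, path = rest.partition("/")
    for service in did_doc.get("service", []):
        # Assumption: service ids look like "<did>#<name>" or "#<name>".
        if service["id"].rsplit("#", 1)[-1] == name:
            url = service["serviceEndpoint"].rstrip("/")
            if has_path:
                url += "/" + path
            if has_query:
                url += "?" + query
            return url
    raise LookupError(f"no service named {name!r} in DID doc for {did}")


did_url = "did:example:12345;service=blog/path/post?query"
before = {"id": "did:example:12345",
          "service": [{"id": "did:example:12345#blog", "type": "BlogService",
                       "serviceEndpoint": "https://blog.example.com"}]}
after = {"id": "did:example:12345",
         "service": [{"id": "did:example:12345#blog", "type": "BlogService",
                      "serviceEndpoint": "https://other.example.org"}]}
print(dereference_service_url(did_url, before))
print(dereference_service_url(did_url, after))
```

The same DID URL keeps working after the hosting provider changes because the service id stays the same while the endpoint value is replaced.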

This causes me to wonder about your proposed principle that "every object with an id should be immutable". In fact the DID document itself can't conform to that principle because its id—the DID—by definition can't change, but the contents of the DID document can.

What about @peacekeeper 's suggestion of defining a different type of id-like property for immutable objects? That way immutable objects could carry that property, and anything that could change would not include it. So effectively you'd be marking "branches" of the DID document graph as immutable. A static DID document could even include that property at the top-level so the whole tree was marked immutable.

dlongley commented 4 years ago

So I think this should perhaps be handled at the DID method layer. Veres One, for example, imposes a constraint that key IDs use a hash fragment that matches the key material (or its fingerprint, for larger keys):

did:v1:...#<multibase encoded public key>
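A simplified sketch of that kind of constraint, assuming the fragment is just the multibase base58btc encoding ("z" prefix) of the raw public key bytes. The real Veres One encoding may add prefixes or use a fingerprint, which this sketch does not model, and the function names are hypothetical.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def expected_key_id(did: str, public_key: bytes) -> str:
    """Build the key id whose fragment is derived from the key material."""
    return f"{did}#z{base58btc(public_key)}"


def key_id_matches(key_id: str, did: str, public_key: bytes) -> bool:
    """True only if the id could not have been reused for a different key."""
    return key_id == expected_key_id(did, public_key)


key = bytes(range(32))            # stand-in for real key material
key_id = expected_key_id("did:example:123", key)
print(key_id_matches(key_id, "did:example:123", key))
print(key_id_matches(key_id, "did:example:123", bytes(32)))
```

With ids derived this way, rewriting the key material under an existing id is detectable by anyone who checks the id against the key.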

dhh1128 commented 4 years ago

@peacekeeper and @talltree and @dlongley : Thanks for your meaningful engagement. Please be patient while I push back a bit longer.

I don't need the DID spec to change to accommodate the CRDT requirements of the did:peer spec. I can already do what I need to do there. Rather, I'm trying to clarify something important that I learned as I thought about CRDTs in that spec, that I think would benefit all DID methods. And I don't think it's a minor benefit. I think it's vital to cybersecurity, and quite helpful to implementers that begin caching.

Far from being new info to me, Drummond's use case about rotating the URI for a service has been obvious and weighing on me since I started this thread. Of course we want that.

But I think we are all missing something here because we think like honest, rational human beings. That is a classic way that honest, rational human beings get taken advantage of by hackers and other malicious parties.

Today, we have the thing that Markus called "portability". Here's what can be done with it:

Alice, an innocent and honest person, can define a DID doc how she likes. Then her government can torture her until they get just enough control over her digital identity to be able to change her DID doc, and because of our "portability" feature, they can then change anything in Alice's DID doc except for the root id property, and the result will still be viewed as Alice's DID doc. We have talked about this scenario before, but what I've recently realized is just how subtle the abuse cases can be that we are tolerating. For example, instead of them adding new keys to Alice's DID doc, or removing old ones, why wouldn't they just rewrite existing keys, changing the value or type of the key with id=XYZ? That way, parties who have already interacted with Alice's key having id=XYZ will not perceive that any important change has taken place. After all, rational and honest human beings would never change the type or value of a key after they define it... Such a change would be detectable with any good DID method impl... eventually. But by then, abuse of Alice's trust may be far down the road.

Another scenario.

Alice puts an XDI service endpoint in her DID doc. A vulnerability is discovered in a DIDComm protocol for all service endpoints of type "DIDComm agent"--but happily, Alice's endpoint at did:example:12345;xdiservice is invulnerable since it's running XDI. Everybody interacting with Alice breathes a sigh of relief and crosses that endpoint off the list of endpoints they need to worry about. Then a hacker who's been lurking in Alice's world manages to rewrite Alice's DID doc, changing the type of that endpoint from XDI to "DIDComm agent". This is possible because of our "portability" feature. Software that visits that endpoint (using the cacheable invariant URI that @talltree touted), and that has cached its type as being on the safe list, can now be victimized by the bait-and-switch.

Now, I'm not suggesting that we can totally eliminate these problems just by being more aggressive about immutability. Hacking and interference with personal sovereignty will always be a risk. But I bring up these examples to highlight something that I think we're misunderstanding when we tout "portability". It is this:

The whole point of "portability" is to allow something to change while holding constant something that's semantically significant. If everything can change except the id that provides invariance, we've gone too far.

Many of you have met me in person. If, tomorrow, the US government introduced to you a woman in her early 20s whose native language was Tagalog and who had never met you, and claimed that that person was me, I hope you would raise your eyebrows. Far too much of semantic equivalence has changed for that substitution to feel appropriate, even if an identifier like a social security number or the name "Daniel Hardman" is held invariant.

If, today, I visit a website registered at example.com, that serves scientific research info, and tomorrow I visit example.com and it's owned by a different entity and now serves an online gambling portal, we've held one thing invariant--the domain name--but there is nothing of semantic significance that has held invariant along with that identifier. The portability is useless, imposes a needless burden on caching, and, in its more subtle forms, is dangerous.

This is exactly the situation with our current spec. The only thing we hold invariant is the id property of a given JSON object; everything else is editable.

My original comment suggested that we hold all things with id invariant. Perhaps that's going too far the other direction; as Drummond pointed out, the DID doc itself can't conform to that standard. But I think we ought to be pretty close to that extreme; any place where we back off of it is a place where we open up abuse cases without really protecting a feature.

Let me justify that last claim, "without really protecting a feature." I could achieve exactly the portability (ability to change URI of service endpoint) that Drummond actually wants, and none of the dangerous portability that goes along with it in our current approach, as follows:

  1. Define a bunch of JSON objects in the DID doc that associate a URI with an id:
{
    // ... other parts of DID doc ...
    "uris": [{"id": "uri1", "value": "http://abc.xyz"}, {"id": "uri2", "value": "http://example.com/foo"}]
}
  2. Define another place in the doc where mappings of service endpoints to URI objects are expressed:
{
    // ... other parts of DID doc ...
    "endpointMappings": [{"id": "mapping1", "service": "#xdiservice", "uri": "#uri1"}]
}
  3. Change the current definition of service so it doesn't include an endpoint at all, but rather it includes only things that we intend to keep semantically invariant, such as type.

Now, I still have the ability to change the URI that's associated with an endpoint, by deleting the endpointMapping object with id=mapping1 and adding a new mapping in its place. But I no longer have the ability to change the type of a service with a given id. This is much safer, because it correctly represents the innocent expectations of honest, rational people. It leaves no room for the bait-and-switch hack I described above. And it simplifies caching, because the caching engine only has to answer one question: is a given id still present (undeleted) in a particular version of the DID doc, INSTEAD OF, how was this id defined in a particular version of the DID doc. Under this approach, there is only ever one version of data for a given id.
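Here is a minimal sketch of how a consumer might resolve a service's URI under this layout. The "uris" and "endpointMappings" property names come from the example above; the function name, the shape of the service object, and the id "mapping2" are assumptions made for illustration.

```python
def resolve_service_uri(did_doc, service_ref):
    """Follow service -> endpointMapping -> uri and return the URI value."""
    # References use a leading "#", while the objects carry the bare id.
    uris = {"#" + u["id"]: u["value"] for u in did_doc.get("uris", [])}
    for mapping in did_doc.get("endpointMappings", []):
        if mapping["service"] == service_ref:
            return uris[mapping["uri"]]
    return None


shared = {
    "service": [{"id": "xdiservice", "type": "XDI"}],  # no endpoint here
    "uris": [{"id": "uri1", "value": "http://abc.xyz"},
             {"id": "uri2", "value": "http://example.com/foo"}],
}
v1 = dict(shared, endpointMappings=[
    {"id": "mapping1", "service": "#xdiservice", "uri": "#uri1"}])
# Rotation: mapping1 is deleted and a new mapping with a new id is added.
v2 = dict(shared, endpointMappings=[
    {"id": "mapping2", "service": "#xdiservice", "uri": "#uri2"}])

print(resolve_service_uri(v1, "#xdiservice"))
print(resolve_service_uri(v2, "#xdiservice"))
```

Between v1 and v2 the service's URI changes, yet no object that keeps its id has been edited: the service and both URI objects are untouched, and only a mapping was deleted and another added.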

Now, I admit that the approach I've just described is more complex to describe, and I admit that it is a meaningful change to a DID spec that is somewhat late in its evolution for something so significant. Perhaps you will decide the juice is not worth the squeeze. But at a minimum, I strongly recommend that the spec start saying which properties of which objects are supposed to be immutable. All properties of JSON objects in the publicKey section likely fall into that category. In earlier comments in this thread I tried to argue that the type property of services probably falls into that category as well.

I also predict that, if we don't switch over to a full immutability strategy, we will begin to regret it when we implement robust caching. In the meantime, we may pat ourselves on the back at having avoided some of the extra work that Daniel is advocating, but when we actually have a global fabric of DID resolvers and caching happening in multiple layers of network stacks around the world, I believe that there will be a stark difference in debuggability, complexity, and security for an everything-is-immutable-so-if-you-know-what-an-id-maps-to-you-have-its-correct-cache-value strategy and a theres-no-way-to-know-whether-youve-cached-the-right-value-for-an-id strategy.

dhh1128 commented 4 years ago

Per a discussion on the DID spec call today, I will be closing this issue and raising a PR that suggests some verbiage about the topic of immutability in the Security Considerations section of the spec. I will keep the issue open until the handoff to a PR is complete; we can then discuss the PR content in its comment stream.

dhh1128 commented 4 years ago

Now that PR #240 is open, I am closing this issue.

peacekeeper commented 4 years ago

If type is invariant, then the way you update what is semantically "my DIF hub endpoint" is not to edit it. Instead, it's to delete the old item and to introduce a new one that has the same type but a different id property. When you do this, you explicitly invalidate all caches (which are id-based).

Thanks for explaining patiently, but somehow I still don't get it. If I delete an old service of type "hub" and then introduce a new service of type "hub", then I have essentially "edited" the meaning of "my hub endpoint". How is this more secure than editing the meaning of "service with id #3635"?

Who says caches are id based? Why can't caches be invalidated simply with any update to a DID Document, or based on ttl?
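For comparison, a rough sketch of the alternative asked about here: caching whole DID Documents and invalidating on TTL expiry or on any update. The class name, the integer "version" counter, and the injectable clock are all hypothetical.

```python
import time


class DocumentCache:
    """Caches whole DID Documents; no per-id bookkeeping."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # did -> (document, version, stored_at)

    def put(self, did, document, version):
        self._entries[did] = (document, version, self._clock())

    def get(self, did, latest_version=None):
        """Return the cached document, or None if expired or superseded."""
        entry = self._entries.get(did)
        if entry is None:
            return None
        document, version, stored_at = entry
        expired = self._clock() - stored_at >= self._ttl
        updated = latest_version is not None and latest_version != version
        if expired or updated:
            del self._entries[did]
            return None
        return document


cache = DocumentCache(ttl_seconds=60)
cache.put("did:example:123", {"id": "did:example:123"}, version=1)
print(cache.get("did:example:123"))                    # still fresh
print(cache.get("did:example:123", latest_version=2))  # superseded by an update
```

This approach needs either a way to learn the latest version or a TTL short enough to bound staleness, whereas the id-based approach relies on ids never being redefined.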

I don't really get why the endpointMapping structure adds more security either, but I feel I'm missing something important, so I'd like to spend some more time on the topic. Perhaps we can discuss this at RWoT if you go there?

peacekeeper commented 4 years ago

Have a look at my ActivityPub "Person" object:

curl -H "Accept: application/ld+json" https://chaos.social/users/peacekeeper

It contains the following:

    "publicKey": {
        "id": "https://chaos.social/users/peacekeeper#main-key",
        "owner": "https://chaos.social/users/peacekeeper",
        "publicKeyPem": "-----BEGIN PUBLIC KEY-----\nMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA6YR+dduoefVDWXtoKHqG\n3XIHxHExBozUTPrxUQl8L08HuQdxAw/v9dq4XqRzfJfSuGKllTiDQZFOZBJ74BLn\nvlUQXkCGTh6BTbHGkrNUGdCbNgi5g/Z3XDZTtnimoJkqvTCQB9ENImjVgZpgV/c/\ny0PKIymKpKC0MW76hcp8K7ywmaa3x2Rf7TDakBtOR2bQ0OpuzDrfnICQRVNlE4lj\nIi9WmTh/WD+sUoaoC+P/dn6FCFZvp+srZ1TknMMGyNRYAG7KDSumrIgk7XtCoXSs\n0gnUswh3l+8YYjKjOqpGBC93m6rLiJJ5zqtu3DJOgaDGWAtxhfriDOc6fpbDsyyp\nPwIDAQAB\n-----END PUBLIC KEY-----\n"
    },

So there's a "main key" with URL https://chaos.social/users/peacekeeper#main-key.

I could rotate the "main key", I could even replace my RSA key with an ed25519 key and it would still be known to the world as my "main key".

(Just bringing this up as additional input, not to argue what we should do here. @dhh1128 would this have the same security problems as in the DID URL case?)

dhh1128 commented 4 years ago

Yes, let's discuss this in person in front of a whiteboard. Discussing in a comment stream is too time-consuming.

On Fri, Jul 19, 2019 at 3:55 PM Markus Sabadello notifications@github.com wrote:

If type is invariant, then the way you update what is semantically "my DIF hub endpoint" is not to edit it. Instead, it's to delete the old item and to introduce a new one that has the same type but a different id property. When you do this, you explicitly invalidate all caches (which are id-based).

Thanks for explaining patiently, but somehow I still don't get it. If I delete an old service of type "hub" and then introduce a new service of type "hub", then I have essentially "edited" the meaning of "my hub endpoint". How is this more secure than editing the meaning of "service with id #3635"?

Who says caches are id based? Why can't caches be invalidated simply with any update to a DID Document, or based on ttl https://github.com/w3c-ccg/did-resolution/issues/10?

I don't really get why the endpointMapping structure adds more security either, but I feel I'm missing something important, so I'd like to spend some more time on the topic. Perhaps we can discuss this at RWoT if you go there?


TallTed commented 4 years ago

@dhh1128 @peacekeeper -- Please bring the whiteboard content back here, somehow, as I'm also not following @dhh1128's reasoning, and thinking along what I think are similar lines as @peacekeeper.