w3c / did-core

W3C Decentralized Identifier Specification v1.0
https://www.w3.org/TR/did-core/

Change serviceEndpoint from string-or-object to array-or-object, or just an object with a URI array property #359

Closed csuwildcat closed 4 years ago

csuwildcat commented 4 years ago

The serviceEndpoint property is already a polymorphic property, allowing a value that may either be a string URI or an object with custom properties. It appears that in many use cases, the most desirable value type is an array of URI strings. This is the case with just about any service type where there are multiple endpoints, instances, or locations involved. I would propose one of the following resolutions be adopted:

  1. The serviceEndpoint property can be a URI string, an array of URI strings, or an object with custom properties.
  2. The serviceEndpoint property can be an array of URI strings or an object with custom properties (instead of adding another type, just make the default an array of URIs, wherein people can just add one if they don't have multiples)

Please let me know what you think, because right now we are probably going to see two of the most common service types (DID-to-domain linkage and SDS/Hubs) use a custom object for this property just so they can specify multiple endpoints.
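
To make option 1 concrete, here is a minimal Python sketch of what a consumer of a DID document would have to do if serviceEndpoint could be a URI string, an array of URI strings, or an object. The helper name `endpoint_uris` is hypothetical and not part of any DID library. The object branch is deliberately left unresolved, since the properties of a custom object depend on the service type.

```python
from typing import Any, List, Optional


def endpoint_uris(service_endpoint: Any) -> Optional[List[str]]:
    """Return the URIs carried by a serviceEndpoint value, or None for custom objects.

    Hypothetical helper mirroring option 1: string, array of strings, or object.
    """
    if isinstance(service_endpoint, str):
        # A single URI string becomes a one-element list.
        return [service_endpoint]
    if isinstance(service_endpoint, list):
        if not all(isinstance(item, str) for item in service_endpoint):
            raise TypeError("array form must contain only URI strings")
        return list(service_endpoint)
    if isinstance(service_endpoint, dict):
        # Custom object: its properties are defined per service type,
        # so there is no generic way to pull URIs out of it.
        return None
    raise TypeError(
        f"unsupported serviceEndpoint type: {type(service_endpoint).__name__}"
    )
```

Under option 2 the first branch would simply go away, leaving two shapes to check instead of three.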

agropper commented 4 years ago

Everything in the DID Document is public and will be crawled and indexed by thousands, maybe millions of “directories” and other data brokers. Even for “private” DIDs this kind of thing poses a large correlation risk.

On the other hand, I don’t understand why this would be useful to anyone. Are we just trying to reduce the cost of running a search engine to a minimum? If not, then adding a mediator or policy decision point as a best practice for service endpoints seems reasonable.

csuwildcat commented 4 years ago

> On the other hand, I don’t understand why this would be useful to anyone. Are we just trying to reduce the cost of running a search engine to a minimum? If not, then adding a mediator or policy decision point as a best practice for service endpoints seems reasonable.

It's not about reducing costs, it's about providing a value signature for Service Endpoints (whatever their function) that serves the most frequent cases. One case we're thinking about is a service endpoint that describes a list of Web origins that are associated with the DID, for example: a company may have a DID and own three different websites. If they wanted to use the DID Configuration mechanism of linking to the proofs of control across those sites, they would want to list the three Web origins of those sites in the DID Configuration Service Endpoint. Today, we would have to add an object to that value and some subproperty that takes an array, all because we can't just add three strings in an array directly. Same goes for personal datastores: if a DID owner wants to link to 1+N instances of their personal datastores, we will again need to define a custom service endpoint object with some random subproperty that has an array value, all because we can't add an array with URI strings directly.

csuwildcat commented 4 years ago

As far as things being crawlable and discoverable in a DID Document: I do not think at this point it is productive or cogent to argue that one should oppose expressing endpoints that allow for discoverability of intended-public data, as I, and others, have provided countless examples of where that is not only a positive thing, but an essential thing that both users and customers need and want. Not everything in the world is a 007 secret decoder ring use case, in fact, most are the opposite: people and businesses want others to know things about them and how to facilitate public interactions.

agropper commented 4 years ago

Obviously, DID Core has a choice of how we present our intent through the specification. If it's not "privacy by default" then we need some rubric to decide how far we go in facilitating data brokerage.

What will be the rubric that allows us to finalize the privacy aspects of the core specification?

csuwildcat commented 4 years ago

I don't think this:

{
  serviceEndpoint: {
    locations: [1, 2, 3]
  }
}

is any more private than this:

{
  serviceEndpoint: [1, 2, 3]
}

The former is just a needless pain in the butt for anyone trying to do sensible, normal things with service entries.

agropper commented 4 years ago

I have no horse in a race that does not involve privacy.

My concern is best practice or normative text that will drive people toward privacy by default which, to me, means keeping all personal information out of the DID document behind a mediator and/or an authorization server.

csuwildcat commented 4 years ago

@msporny @talltree @OR13 any reason this can't allow an array, given it already allows a string and an object?

csuwildcat commented 4 years ago

If no one is opposed, can I do a PR to allow an array?

OR13 commented 4 years ago

Looking at the current examples: https://w3c.github.io/did-core/#example-23-various-service-endpoints

It seems like there is no reason to allow an array.... because there is already support for objects.... the less type checking for string || array || object, the better for everyone (regardless of language).

@csuwildcat can you just make this work:

{
  "id": "did:example:123456789abcdefghi#hub",
  "type": "IdentityHub",
  "serviceEndpoint": {
    "@context": "https://schema.identity.foundation/hub",
    "type": "UserHubEndpoint",
    "instances": ["did:example:456", "did:example:789"]
  }
}

If the "instances" are really all of the same type "IdentityHub"... then why not group them under an object?

If they are not, then don't put them in the same service definition.

@msporny @dlongley you mentioned having some opinions about why services might not be necessary at all... IMO now would be a good time to hear about them... in particular...

How are the following situations solved for in a world without services...

  1. Inboxing for messages when the controller is offline.
  2. API / Integration points for Agents / AI systems or proxy services to user controlled applications.
  3. Content Discovery related to Credentials / Workflows or Related identities (the DID equivalent of .well-known URIs).

These are all features which we need, and I am having trouble understanding how we would get them without the services block in the did document.

csuwildcat commented 4 years ago

@OR13 this means that many services will create their own object definitions for their types and all pick different property names just for the sole purpose of having multiple URIs. Some will pick uri, others endpoints, still others uris, etc. How can this possibly be the right answer? Can we remove the string type altogether and say it has to be an object, then designate a property that is supposed to be an array of URIs?

OR13 commented 4 years ago

@csuwildcat it's the outer type that matters, and the object definition of the serviceEndpoint which is controlled by it... for example... type: JsonWebKey2020... publicKeyJwk... (RSA / EC / OKP).... What we are doing here is no different.

and we have a registry for people to register the different types of services.

I see no reason to add more type complexity by saying that serviceEndpoint can be string, array or object...

if you need complexity / extensibility, use an object... if you don't use a string.

if you want an array... you have one already... it's called services....

Counter proposal.. remove object option and require serviceEndpoint to only be a string, and use the services array, if you need multiple things of the same type... that's even simpler... and it solves the same issue "I want an array of serviceEndpoints" of a specific type.

services: [{
    "id": "did:example:123456789abcdefghi#hub1",
    "type": "IdentityHub",
    "serviceEndpoint": "https://hub1.example.com"
},{
    "id": "did:example:123456789abcdefghi#hub2",
    "type": "IdentityHub",
    "serviceEndpoint": "https://hub2.example.com"
}]
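
Under this counter proposal, a consumer that wants "all endpoints of a given type" groups the flat entries itself. A minimal Python sketch, assuming every entry carries a string serviceEndpoint as proposed; the function name `endpoints_by_type` is hypothetical:

```python
from collections import defaultdict
from typing import Dict, List


def endpoints_by_type(services: List[dict]) -> Dict[str, List[str]]:
    """Group string serviceEndpoint values by the service entry's type."""
    grouped = defaultdict(list)
    for entry in services:
        grouped[entry["type"]].append(entry["serviceEndpoint"])
    return dict(grouped)


# The two entries from the example above.
services = [
    {
        "id": "did:example:123456789abcdefghi#hub1",
        "type": "IdentityHub",
        "serviceEndpoint": "https://hub1.example.com",
    },
    {
        "id": "did:example:123456789abcdefghi#hub2",
        "type": "IdentityHub",
        "serviceEndpoint": "https://hub2.example.com",
    },
]

hubs = endpoints_by_type(services)["IdentityHub"]
```

The grouping is a few lines, but every consumer has to repeat it, and each endpoint needs its own id.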

dlongley commented 4 years ago

@OR13,

Some arguments against services in DID Documents:

First, it is difficult to get service endpoints in public DID documents right for privacy preserving use cases, i.e. any PII on a verifiable data registry is problematic, especially for registries that do not/cannot support deletion. Yes, there are cases for DIDs that refer to people's public personas (like social media identities or whatever), but read on.

Second, DIDs and DID Documents out of context are just bits of random and/or untrusted information. DIDs aren't known to requesting parties a priori. Also, no one scans a verifiable data registry and picks a DID to do something with. Rather, DIDs are bound to other contextual information that makes them useful. That information comes from some other source: either an entity interactively responding to a request for a DID or some other kind of information registry that is being scanned based on some additional information. In both cases, that source can provide additional information like "service endpoints" in a more privacy preserving way. There is also often room for more fine-grained/contextual consent.

Since these other sources of information must always exist -- then it would be nice to put the types of data that may be considered PII where the constraints mentioned above (for verifiable data registries) do not apply ... along with that additional information that contextualizes the DID.

Third, for some use cases where people might be envisioning a requesting party making a call out to a service endpoint to get some information -- take note that this is a security risk for that party. It would be better to hand the information over to that party rather than them making a call out to an untrusted service that could be used to DoS them. IOW, if the requesting party is already asking for a DID, rather than just giving them a DID and letting them use a service endpoint to get more data, have the requesting party ask for the additional information as well. It could even be potentially handed over as a VC that was issued by a party the requester trusts.

Additionally, the requesting party might need some authorization to even access your service endpoint. This means you either have to have some software add them to an ACL at your service endpoint -- before you hand over your DID -- or you have to hand them an authorization (like a zcap) to access it. And what specific thing are you going to grant them access to? Well, they should have asked for that thing anyway so you could know and make a decision about it. In short, there are many use cases where there are interactions and contextual information that precede any interaction a requesting party might have with some service endpoint it finds in your DID document -- and a number of security/trust issues.

So, from this perspective, it isn't the "discovery" aspect of services via DID Documents that is particularly useful here. Rather, what is useful is giving someone a stable pointer to something -- whereby you can independently change what the pointer resolves to. That "stable pointer" could be a DID URL, but given all of the above, maybe we should find another way to create a stable pointer that is decoupled from DID Documents. We may even find a solution with only semi-stable pointers that change via some protocol that ensures the independence/portability we're looking for.

csuwildcat commented 4 years ago

Proposal 1: move from string-and-object as values for the property to array-and-object

Proposal 2: have just one value type, an object, but define within it a property which is an array for specifying one or more URIs.

@dlongley "Also, no one scans a verifiable data registry and picks a DID to do something with." - I'm literally doing exactly this to create a decentralized public package registry, wherein every package is represented by a DID, and you can scan for and eval all of them by crawling the global DID space. As such, I would disagree that no one does this, because I'm doing exactly this right now.

OR13 commented 4 years ago

@dlongley I think I understand....

what you are really saying is... no matter where you are in a protocol, you have an opportunity for the did controller to share an endpoint privately, and then continue on... without the need to disclose the endpoint publicly... ever.

If I'm doing OIDC SIOP, I can hand the RP an id_token which contains my preferred service endpoint.... If I'm doing CHAPI I can hand the web app a VP that contains a service endpoint for follow up interactions.... If I'm doing DIDComm I can advertise my service endpoint using did parameters in the invitation.... at no point do I need to publicly disclose a public internet web server endpoint in order to get the benefits we currently think of as coming with service endpoints.... however.... what is required is documentation around how those flows can be used .... for example, authenticating and requesting credential manifests from a credential provider via OIDC with DIDs... or doing the same with CHAPI or DIDComm.

In fact... I wrote this proposal here, which allows you to add a service endpoint (with authentication) to any did document.... https://github.com/decentralized-identity/did-spec-extensions/blob/master/parameters/signed-ietf-json-patch.md

....without ever publishing the change to a Verifiable Data Registry (blockchain / GDPR encumbered technology)

And there are tests / test vectors showing how it works... https://github.com/decentralized-identity/did-spec-extensions/tree/master/parameters/signed-ietf-json-patch-example

I'm convinced that service endpoints don't need to be publicly registered in the verifiable data registry....however, I think they still need to be formally and normatively described in the data model.

csuwildcat commented 4 years ago

@dlongley: in your response, please don't suggest I should have to go register the DIDs with some centralized registry, because the goal is explicitly to have a decentralized registry, or that I should have to register with another decentralized registry in addition, which just forces creation of yet another registry that would do the exact same thing.

OR13 commented 4 years ago

For the record, I have noted repeatedly that what @csuwildcat is doing is unsafe / not a good idea... you can read more about it here: https://github.com/decentralized-identity/ion/issues/77 ... it's basically the equivalent of treating the bitcoin blockchain as a decentralized index for things... where anybody can lie, and anyone can update the index.... however... it's probably good to have someone experiment with writing DID subject types directly to an immutable ledger... because we will all be able to learn if it works or not :)

OR13 commented 4 years ago

@csuwildcat so back to your original question... I don't think we should change the type of serviceEndpoint to be an array.

I would be in favor of making it only a string, or only an object.

csuwildcat commented 4 years ago

Forcing DIDs to be registered in yet-another-decentralized-registry makes no sense, as it's literally the exact same thing, but doubles the effort and networks/code involved. Additionally, removing the service endpoint section would encourage either duplication of effort or registration with a centralized registry intermediary actor, and due to the possibility of the latter, I would move to oppose service endpoint removal in the strongest terms our official capacity will allow.

csuwildcat commented 4 years ago

@OR13 I'd be in favor of only an object...but with a standard property that allows for one or more URI strings 😂

dlongley commented 4 years ago

@csuwildcat,

> Forcing DIDs to be registered in yet-another-decentralized-registry makes no sense, as it's literally the exact same thing...

There are different requirements for storing service endpoints than for DIDs themselves and for verification methods. Different requirements imply that "it's literally the exact same thing" is not actually true. Sure, "any registry stores information" is true, but characteristics like "the type of information" or "how persistent the storage must be" are important differences here. These differences are the source of the debate here, so we can't just assume them away.

It may be that another decentralized registry that has the appropriate characteristics (and that is decoupled from the one that doesn't) is a good solution for your use case. It would certainly be better if people only registered packages in that registry so you didn't have to crawl all DIDs in existence to find the ones that happened to refer to packages.

csuwildcat commented 4 years ago

Proposal: Remove the string value and make the serviceEndpoint property take only an object, which MAY include a standard property named uri, the value of which is an array of one or more URI strings - for example:

"serviceEndpoint": {
  "uri": ["https://foo.com", "somescheme://bar"]
}
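
A minimal Python sketch of how a consumer might check a value against this proposal. It assumes the rules exactly as stated above (object only, optional `uri` property, array of one or more URI strings); the function name is hypothetical, and "looks like a URI" is approximated by requiring a scheme:

```python
from typing import Any
from urllib.parse import urlsplit


def is_valid_service_endpoint(value: Any) -> bool:
    """Check a serviceEndpoint value against the object-only proposal."""
    if not isinstance(value, dict):
        # The proposal removes the plain-string form.
        return False
    if "uri" not in value:
        # `uri` is optional (MAY), so an object without it is acceptable.
        return True
    uris = value["uri"]
    if not isinstance(uris, list) or len(uris) == 0:
        # Must be an array with at least one entry.
        return False
    # Rough URI check: every entry is a string with a non-empty scheme.
    return all(isinstance(u, str) and urlsplit(u).scheme != "" for u in uris)
```

With a single well-known property name, this check is the same for every service type, which is the point of the proposal.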

agropper commented 4 years ago

I've been obsessed with the idea of eliminating service endpoints for the past three hours. It might simplify privacy engineering immensely and be a generative move for SSI. I don't understand all of the points that have been made above, but I can try to give my non-technical perspective.

Eliminating service endpoints from the DID Document would mean:

The association between DIDs and services would be out of scope for DID:core.

The question of service discovery and DID discovery could be taken up by various existing workgroups, including the registry issues that I don't yet understand. I don't see any reason to overlap that with the issues of authentication and digital signatures that seem to stand alone around DIDs.

csuwildcat commented 4 years ago

@dlongley "It would certainly be better if people only registered packages in that registry so you didn't have to crawl all DIDs in existence to find the ones that happened to refer to packages." - if it's another decentralized registry, you'd still have to crawl all of the entries and do the exact same evals of off-registry credentials, because it's decentralized...surely you realize this?

csuwildcat commented 4 years ago

@dlongley if a registry is decentralized, it means you can't evaluate and filter dynamic evidence on ingest, you have to do it after initial commitment and after any state changes. This means accomplishing the stated goals without service endpoints in the DID Doc would be a near identical duplication of code, systems, and effort, wherein I walk a DID over to some registry that is literally just a giant, open list of unfiltered DID entries paired with endpoints that anyone can add to (you know, because it's decentralized...). Look folks, I realize many people in the DID community are creating centralized or semi-centralized DID networks and layers, and that's fine, to each their own, but please don't hobble the spec so it makes life harder for community members who actually want to create decentralized, uninterdictable infrastructure and substrates.

dlongley commented 4 years ago

@csuwildcat,

> if it's another decentralized registry, you'd still have to crawl all of the entries and do the exact same evals of off-registry credentials, because it's decentralized...surely you realize this?

There are many legitimate uses for DIDs that do not refer to packages. Let's call these sorts of DIDs "Set A". There are DIDs that legitimately refer to packages, let's call that "Set B". There are DIDs that masquerade as referring to packages but are not legitimately packages. Let's call that "Set C".

If Set A is quite large, it would be nice not to have to include it in your crawls, right? In a magical future world of DIDs, Set A may even be significantly larger than Set B -- perhaps even most DIDs, by far, will be in Set A. It seems like a clear win to remove Set A from what you have to crawl.

Now, even if you cut out Set A, you'll still have to deal with separating out Set B and Set C. But, if your registry is specifically designed to store only references to packages, perhaps there's even some additional validation (or incentives/disincentives) that could be used to cut back on Set C that would not be possible if your registry also had to support Set A.

csuwildcat commented 4 years ago

"If Set A is quite large, it would be nice not to have to include it in your crawls, right?" - absolutely, it would be great to skip resolution for the majority of DIDs that are claimed to represent other types of things...and luckily we added a 4 byte type field to ION so you can flag your DID as some type of non-human thing, such that you only need to scan for, resolve, and eval the subset you care about.

csuwildcat commented 4 years ago

I will note a few details about the optional type field we added to ION DIDs:

  1. It is exclusively for non-human types
  2. It is immutable, and part of the inception event, bound into the DID forevermore
  3. The type field is a method-specific convention that is not echoed out into the DID Document, and is present at the fastest indexing layer within ION. (we did this after folks asked us to make it that way, not because we want to play keep away or anything)

OR13 commented 4 years ago

you can't mandate what a decentralized indexing system is used for.... that's the whole point of censorship resistance... if tomorrow some baddies decide they will use that type field for tracking protestors, journalists or compromised hosts in their botnet.... you can't stop them... however, they can accomplish the same kind of thing with any DID Method, they just need to mine for a method specific identifier that has the leading bytes they want... and then keep a table where:

e7 = journalists
e8 = terrorists
e9 = compromised hosts
....

The difference with ION is that Microsoft appears to be planning to create its own labels for these, like:

e7 = docker images
e8 = mobile devices
e9 = packages
...

When you dump the index for e9... it will contain valid tagged dids, accidental tagged dids, and dids that baddies intentionally tagged as "packages" because they are trying to hide compromised hosts in what is supposedly a legitimate index of packages....

As I understand, the solution will then be to rely on Verifiable Credentials issued by Microsoft and others to help distinguish e9 dids that are packages, from e9 dids which are compromised hosts.... I am eager to see if this will actually work.... and as I said, you can do the same thing with any did method... you just need a table and patience... and yes, it means that Microsoft (and probably the FBI and others) will be crawling the entire ION DID registry and bucketing DIDs based on what the controller claims the did is for..... This is already a pretty common thing with public ledgers like bitcoin and ethereum.... it's used to track ransomware payments, and tainted bitcoins from dark markets... surveillance is the default on public networks... anyone can claim to be a package, and they will automatically get included in the "packages" index, but without the right verifiable credentials they won't be a very trustworthy package...

OR13 commented 4 years ago

So in other words, the indexing system will be resolving dids, using their service endpoint to ask them for proof they are a package, and the ones that can prove they are packages endorsed by Microsoft or others, will get little green checkmarks next to them... as real packages, vs just claiming to be packages... and you can do the same thing for "verified twitter handles", or "doctors" or "citizens who have not been arrested for protesting"... The index just makes it easier to get to a group of dids claiming to be of a "type"... and once you have that group, you can start asking whatever questions you want... like... "prove you are really of type package"... in my opinion, it's terrible privacy engineering, but we will see how it works....

@csuwildcat can you explain how this relates to service endpoints, what are some example URIs for a software package, and why can't you use the spec as is?

agropper commented 4 years ago

What’s a “package”? Does it have anything to do with using a DID for authentication? I’m concerned by the endorsement or attestation points that @Orie is raising and I hope others can help explain it.

I saw this recent article about “attestation” related to authentication standards (WebAuthn, FIDO) https://macsecurity.net/view/391-safari-14-will-introduce-face-id-and-touch-id-for-the-web It includes:

“To address known security issues, Apple engineers have masterminded a proprietary attestation service that laces each credential with a unique certificate.”

DIDs are designed to be opaque, other than their method. A DID can tag anything and anyone. Many DIDs might not be used beyond a single domain or origin. Why are we talking about registries and packages?

OR13 commented 4 years ago

@agropper this is a package: https://www.npmjs.com/package/msrcrypto

in the future, it might be https://packages.ion-did.com/package/msrcrypto ...

When you visit that page it will have instructions like this:

https://packages.ion-did.com/d/did:ion:Ei123...?service=github&relativeRef=/OR13/dndx/master/mod.ts

Real world working example: https://did.actor/carol/did.json https://github.com/OR13/dndx/blob/master/packages/did-dereferencer/mod.ts

^ in this case, the did document would have a service with endpoint github.com and this URL ^ is a "DID" controlled endpoint for resolving a Deno module.... https://deno.land/manual#introduction
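
The lookup described here can be sketched in a few lines of Python. This is an illustration only, not the code behind the linked example: the function name, the `did:example:123` identifier, and the `ExampleService` type are hypothetical, and the service endpoint is assumed to be `https://github.com`. The sketch selects the service whose id fragment matches the `service` parameter, then resolves `relativeRef` against that service's endpoint.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urljoin


def dereference_service_url(did_url: str, services: List[dict]) -> Optional[str]:
    """Resolve a DID URL with `service` and `relativeRef` parameters to a plain URL."""
    did, _, query = did_url.partition("?")
    params = parse_qs(query)
    service_name = params.get("service", [None])[0]
    relative_ref = params.get("relativeRef", [""])[0]
    if service_name is None:
        return None
    for entry in services:
        # Accept both relative ("#github") and absolute ("did:...#github") ids.
        if entry["id"] in (f"#{service_name}", f"{did}#{service_name}"):
            # Assumes a string serviceEndpoint; resolve the reference against it.
            return urljoin(entry["serviceEndpoint"], relative_ref)
    return None


# Hypothetical service entry pointing at GitHub.
services = [
    {"id": "#github", "type": "ExampleService", "serviceEndpoint": "https://github.com"}
]
url = dereference_service_url(
    "did:example:123?service=github&relativeRef=/OR13/dndx/master/mod.ts",
    services,
)
# url is "https://github.com/OR13/dndx/master/mod.ts"
```

Because the controller can change the endpoint behind `#github` without changing the DID URL, the DID URL acts as a stable pointer.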

This is one way of doing this.... I am trying to understand the exact services block that @csuwildcat wants to use.... but guessing, it probably looks something like this:

{
  "@context": [
    "https://www.w3.org/ns/did/v1",
    {
      "@base": "did:ion:Ei4123...."
    }
  ],
  "id": "did:ion:Ei4123....",
  "publicKey": [
    {
      "id": "#signing-key",
      "type": "JsonWebKey2020",
      "controller": "did:ion:Ei4123....",
      "publicKeyJwk": {
        "crv": "secp256k1",
        "x": "Z4Y3NNOxv0J6tCgqOBFnHnaZhJF6LdulT7z8A-2D5_8",
        "y": "i5a2NtJoUKXkLm6q8nOEu9WOkso1Ag6FTUT6k_LMnGk",
        "kty": "EC"
      }
    }
  ],
  "keyAgreementKey": [
    {
      "id": "#encryption-key",
      "type": "JsonWebKey2020",
      "publicKeyJwk": {
        "kty": "OKP",
        "crv": "X25519",
        "x": "pE_mG098rdQjY3MKK2D5SUQ6ZOEW3a6Z6T7Z4SgnzCE"
      }
    }
  ],
  "assertionMethod": ["#signing-key"],
  "service": [{
    "id":"#hub",
    "serviceEndpoint": ["https://hub1.microsoft.example.com", "https://hub2.google.example.com"]
  }]
}

You would then use the hub for EVERYTHING, including asking for credentials / getting proof that the above did document subject is a package, and handling url resolution for public content, like source code for the package....IPFS might be under the hood here / or github like in the example above.

Just like how anybody can run an OIDC server, but the NASCAR problem leads to only a few OpenID Providers that are well known.... consider how many hubs will exist in did documents 10 years from now? it's possible people will be running their own... but not likely, since services on the public internet are constantly under attack, and there is a reason we have a few cloud providers who manage cloud infrastructure and security issues for all of us....

However, this is beside the point.... here are the questions we are grappling with on this issue:

  1. Is it ever a good idea to build a "phone book" of all DIDs and their subject types (business / software package / etc)?
  2. Is it ever a good idea to have public service endpoints published to a verifiable data registry?

Daniel's package manager solution assumes that there is value in crawling VDRs and indexing all public information, and possibly automatically following links deeper to learn if the DID Subject is a real business or real software package.

Dave and I are noting that you don't need a phonebook to get proof that a DID is a board certified physician, you can show up in a channel and they can pass you links to centralized registries or verifiable credentials, without ever publishing that they are a doctor and they have a hub....

This website (https://www.shodan.io/) indexes all internet connected devices....on the planet... so when your web cam is internet enabled but you didn't secure it properly... attackers can see that it's internet enabled, and they can write scripts that crawl this registry and try to break into every device listed.

^ you can do the exact same thing with DID.... read the whole VDR, choose an order to attack DIDs, resolve them, start making network requests to their service endpoints, and start trying admin interfaces / default passwords.... anyone who has ever run a web server on the internet knows this is the future we are facing with serviceEndpoints in did documents.

Since we are on this topic, did:peer is a private did method that does not publish services like this, and that can be used for didcomm.... @dhh1128 not sure if you want to add anything here regarding privacy, but I know you have thought about this a lot.

OR13 commented 4 years ago

My question for @csuwildcat is why are you trying to make the did document so nested... what is wrong with this:

"services": [{
    "id": "#hub1",
    "type": "Hub",
    "serviceEndpoint": "https://hub1.microsoft.example.com"
},{
    "id": "#hub2",
    "type": "Hub",
    "serviceEndpoint": "https://hub2.google.example.com"
},{
    "id": "#hub3",
    "type": "Hub",
    "serviceEndpoint": "https://hub3.apple.example.com"
},{
    "id": "#hub4",
    "type": "Hub",
    "serviceEndpoint": "https://hub4.amazon.example.com"
}]

^ IMO this is what you want to be doing....

csuwildcat commented 4 years ago

@OR13 Let me try to address multiple things you asked, mentioned, etc.

  1. Yes, I was hoping that serviceEndpoint property could take an array of URIs as you highlighted in your first example, because with Well-Known DID Configuration and the upcoming Hubs stuff, we already have two major types that will want to add multiple URIs. Your later example, though accurate in relation to today's spec language, is cumbersome, forces an odd separation of URIs that may all want to be treated as one bundle, and could be eliminated by simply switching the serviceEndpoint property to an array instead of a string, so you can add one or more URIs of the given type.

  2. Your general replay of our work is correct: I believe DIDs are for all sorts of things, like cars, code packages, companies, etc., not just humans who must operate like James Bond on double secret probation with 14 layers of authorization servers and a bat-signal to dead drop out-of-band communications to Q while being tailed by a super villain. Clearly others on the thread are only concerned with a small subset of entities using DIDs, and using them in a narrow way. Again, it's fine to be very limited in one's thinking (not you, Orie); all I'm requesting is that folks stop trying to hobble anyone who sees beyond the subset of use cases they are interested in.

  3. In specific response to your (@OR13's) last example of what one could do with the spec today, that's not what we would do if we couldn't have a simpler way to add multiple URIs - we'd just go this route:

{
    "id": "#hubs",
    "type": "Hub",
    "serviceEndpoint": {
        "uri": ["https://hub1.apple.com", "https://hub2.amazon.com"]
    }
},
{
    "id": "#linked-domains",
    "type": "LinkedDomains",
    "serviceEndpoint": {
        "uri": ["https://foo.com", "https://bar.com"]
    }
}
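
For comparison, here is a minimal TypeScript sketch of what a consumer might do if all three shapes from this thread (a single URI string, an array of URI strings, or an object carrying a `uri` array) were in circulation. The `ServiceEndpoint` and `Service` types and both helper names are assumptions made for illustration; none of them come from DID Core, and the `uri` property is only the convention proposed in the comment above.

```typescript
// Hypothetical union of the three shapes discussed in this thread.
type ServiceEndpoint =
  | string
  | string[]
  | { uri?: string[]; [key: string]: unknown };

// Hypothetical service entry; property names mirror the JSON examples above.
interface Service {
  id: string;
  type: string;
  serviceEndpoint: ServiceEndpoint;
}

// Collapse any of the three shapes into a flat list of URI strings.
function endpointUris(endpoint: ServiceEndpoint): string[] {
  if (typeof endpoint === "string") return [endpoint];
  if (Array.isArray(endpoint)) return [...endpoint];
  // Object form: only the assumed `uri` array is read; other properties are ignored.
  return Array.isArray(endpoint.uri) ? [...endpoint.uri] : [];
}

// Group URIs by service type, so several single-URI "Hub" entries and one
// "Hub" entry holding several URIs look the same to the caller.
function urisByType(services: Service[]): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  for (const service of services) {
    const uris = grouped.get(service.type) ?? [];
    uris.push(...endpointUris(service.serviceEndpoint));
    grouped.set(service.type, uris);
  }
  return grouped;
}
```

Under this sketch, the "one entry per URI" layout and the "one entry with many URIs" layout produce the same grouped result, which suggests the disagreement is mostly about where grouping and ordering semantics should live rather than about what can be expressed.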

Is it unfortunate that we will end up with a pseudo-standard convention where everyone uses an object for their service types just so they can have a prop that allows more than one URI? Yes, it's quite sad indeed. Will I happily gobble that reality up, chew it to bits, and spit out the most efficient thing in response? Yes, yes I will.

dlongley commented 4 years ago

@csuwildcat,

Clearly others on the thread are only concerned with a small subset of entities using DIDs, and using them in a narrow way. Again, it's fine to be very limited in one's thinking (not you, Orie); all I'm requesting is that folks stop trying to hobble anyone who sees beyond the subset of use cases they are interested in.

I'm not sure who you're talking about.

Is it unfortunate that we will end up with a pseudo-standard convention where everyone uses an object for their service types just so they can have a prop that allows more than one URI? Yes, it's quite sad indeed. Will I happily gobble that reality up, chew it to bits, and spit out the most efficient thing in response? Yes, yes I will.

You argue for efficiency here. But, similarly, the arguments made above for removing service endpoints have to do with using more efficient solutions for non-crawling use cases (they also have other benefits like better security and privacy). To reiterate them a bit: if the requester is already in a channel where the information can be passed, let's avoid making them have to go out and fetch it from an untrusted service.

Now, for the VDR crawling case, arguments were made that even the approach that's being taken could also be solved more efficiently (and securely). And, if this wasn't clear already, just think about all of those bits that don't have to stick around forever on a blockchain when data changes because it's on a registry that supports deletion.

So, please don't present a strawman that people are ignoring you and are only concerned with catering to James Bond, especially whilst you argue the merits of efficiency on a different point of contention. Well, I shouldn't go so far as to say don't present it -- because I really did enjoy your densely populated superspy lexicon -- but please also say "just kidding, I know the people here are good listeners and are trying to cover the use cases".

OR13 commented 4 years ago

If you want to add a service endpoint to a did document, without publishing it to a registry, you can use this: https://github.com/decentralized-identity/did-spec-extensions/blob/master/parameters/signed-ietf-json-patch.md / or initial state... or something else....

If you want to publish service endpoints to a blockchain, nobody can stop you...

I remain convinced that the serviceEndpoint type should be string or object (not array), and Daniel's use case has convinced me the object is even more valuable, since it allows for additional properties / context (origin is more descriptive than URIs, other object types might prefer to use other more specific terms.... think about DIDComm and agents, some of which may prefer not to speak HTTP).

I am also convinced by Dave and Adrian's points that services should generally not be published on VDRs, and instead you should request access to them in an established channel (like DIDComm / CHAPI / OIDC / GNAP, etc...)...

I think it would be a big mistake not to add a lot of normative text to did core about services.... we should describe them, their data model, their uses, and the security / privacy / discoverability tradeoffs associated with getting them "just in time" in a channel, or publish them to a blockchain where they can never be deleted.... I think that arguments for both cases have been made, and now they need to be added to the correct documents...

This document needs to be updated to reflect this thread: https://github.com/w3c/did-use-cases/

@csuwildcat We should document your package manager use case there.

@dlongley We should document your preferred mechanism for getting services without publishing them to the VDR, and ideally provide a higher level use case for it.... @agropper maybe we can all work together on a separate issue to hash something out.

@dhh1128 @tplooker We should add DIDComm to the use cases document, and we should note how didcomm uses the serviceEndpoint.

IMO this issue should be closed once we have tickets set up for ^

agropper commented 4 years ago

Many thanks @OR13 !! inline...

On Sat, Aug 8, 2020 at 10:28 AM Orie Steele notifications@github.com wrote:

@agropper https://github.com/agropper this is a package: https://www.npmjs.com/package/msrcrypto

Ahhh. A package is code I can choose to install. Are PWAs a package? Sperm?

in the future, it might be https://packages.ion-did.com.com/package/msrcrypto ...

So npmjs.com and packages.ion-did.com.com are what we're calling a "registry"?

The https:// implies that I want to trust the registry as well as the package. In a zero-trust or content-addressed world, wouldn't we just trust the package?

When you visit that page it will have instructions like this:

https://packages.ion-did.com/d/did:ion:Ei123...?service=github&relativeRef=/OR13/dndx/master/mod.ts

So the DID method is did:ion and a service endpoint of "type" github will be returned?

Real world working example: https://did.actor/carol/did.json https://github.com/OR13/dndx/blob/master/packages/did-dereferencer/mod.ts

So, this enables me to choose a username (carol) for a DID as long as my username is unique in the package registry?

When carol goes to register with her new dropbox service, she types-in https://did.actor/carol/did.json and dropbox stores that? When carol comes back to dropbox to authenticate, dropbox looks up her current public key in the package registry?

As you can see, I'm still confused. What's the DID method in this example?

^ in this case, the did document would have a service with endpoint github.com and this URL ^ is a "DID" controlled endpoint for resolving a Deno module.... https://deno.land/manual#introduction
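
To make the `service` and `relativeRef` parameters in the URL above concrete, here is a rough TypeScript sketch of how a dereferencer might turn such a DID URL into an ordinary web URL. It assumes the DID document has already been resolved into a list of services with string endpoints; the `Service` type, the function name, and the `did:example` and `git.example.com` values in the usage note are illustrative assumptions, not the behavior of any particular resolver.

```typescript
// Hypothetical service entry with a plain string endpoint.
interface Service {
  id: string;
  type: string;
  serviceEndpoint: string;
}

// Sketch: select the service named by `service`, then resolve `relativeRef`
// against its endpoint. Returns undefined when nothing matches.
function dereferenceServiceUrl(
  didUrl: string,
  services: Service[],
): string | undefined {
  // Drop any fragment, then isolate the query string of the DID URL.
  const withoutFragment = didUrl.split("#")[0];
  const queryStart = withoutFragment.indexOf("?");
  if (queryStart === -1) return undefined;

  const params = new URLSearchParams(withoutFragment.slice(queryStart + 1));
  const serviceName = params.get("service");
  if (serviceName === null) return undefined;

  // Match both relative ids ("#github") and absolute ids ("did:...#github").
  const service = services.find((s) => s.id.endsWith(`#${serviceName}`));
  if (service === undefined) return undefined;

  const relativeRef = params.get("relativeRef");
  return relativeRef === null
    ? service.serviceEndpoint
    : new URL(relativeRef, service.serviceEndpoint).href;
}
```

With a service `{ id: "#github", serviceEndpoint: "https://git.example.com" }`, the DID URL `did:example:123?service=github&relativeRef=/OR13/dndx/master/mod.ts` would dereference to `https://git.example.com/OR13/dndx/master/mod.ts`. The DID method only determines how the services list is obtained; the final step is ordinary URL resolution.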

Is this why we're also talking about PWAs? Is a Deno script equivalent to a PWA?

I'm still confused. We seem to be talking about different "types" of service endpoints: storage, packages, sensors, registries (for pet names), mediators (a registry for globally unique keys), authorization servers (policy decision points). This brings us back to the Glossary Group Survey https://forms.gle/SkAmGFpZLZngMi5k8 Everyone: Please fill this out!

In this thread, are we talking about a particular service endpoint type or service endpoints in general as attributes in a DID document?

This is one way of doing this.... I am trying to understand the exact services block that @csuwildcat https://github.com/csuwildcat wants to use.... but guessing, it probably looks something like this:

{
    "@context": [
        "https://www.w3.org/ns/did/v1",
        { "@base": "did:ion:Ei4123...." }
    ],
    "id": "did:ion:Ei4123....",
    "publicKey": [{
        "id": "#signing-key",
        "type": "JsonWebKey2020",
        "controller": "did:ion:Ei4123....",
        "publicKeyJwk": {
            "crv": "secp256k1",
            "x": "Z4Y3NNOxv0J6tCgqOBFnHnaZhJF6LdulT7z8A-2D5_8",
            "y": "i5a2NtJoUKXkLm6q8nOEu9WOkso1Ag6FTUT6k_LMnGk",
            "kty": "EC"
        }
    }],
    "keyAgreementKey": [{
        "id": "#encryption-key",
        "type": "JsonWebKey2020",
        "publicKeyJwk": {
            "kty": "OKP",
            "crv": "X25519",
            "x": "pE_mG098rdQjY3MKK2D5SUQ6ZOEW3a6Z6T7Z4SgnzCE"
        }
    }],
    "assertionMethod": ["#signing-key"],
    "service": [{
        "id": "#hub",
        "serviceEndpoint": ["https://hub1.microsoft.example.com", "https://hub2.google.example.com"]
    }]
}

So, "hub" would be a standardized service endpoint type different from "storage, package, sensor, registry (for pet names), mediator (a registry for globally unique keys), authorization server (policy decision points)" ?

You would then use the hub for EVERYTHING, including asking for credentials / getting proof that the above did document subject is a package, and handling url resolution for public content, like source code for the package....IPFS might be under the hood here / or github like in the example above.

Confused. Does a DID document have one or more hub endpoints? If it's EVERYTHING then there should be only one like one mediator service endpoint https://github.com/hyperledger/aries-rfcs/blob/master/concepts/0046-mediators-and-relays/README.md or one registry or one authorization server.

Just like how anybody can run an OIDC server, but the NASCAR problem leads to only a few OpenID Providers that are well known.... consider how many hubs will exist in did documents 10 years from now? It's possible people will be running their own... but not likely, since services on the public internet are constantly under attack, and there is a reason we have a few cloud providers who manage cloud infrastructure and security issues for all of us....

However, this is beside the point.... here are the questions we are grappling with on this issue:

  1. Is it ever a good idea to build a "phone book" of all DIDs and their subject types (business / software package / etc)?
  2. Is it ever a good idea to have public service endpoints published to a verifiable data registry?

Daniel's package manager solution assumes that there is value in crawling VDRs and indexing all public information, and possibly automatically following links deeper to learn if the DID Subject is a real business or real software package.

Dave and I are noting that you don't need a phonebook to get proof that a DID is a board certified physician, you can show up in a channel and they can pass you links to centralized registries or verifiable credentials, without ever publishing that they are a doctor and they have a hub....

I'm familiar with centralized registries as oracles. In healthcare we have the NPI and DEA Number oracles. The post office runs an oracle for Tracking Numbers. These are extremely useful, especially when they have an open public API. Their utility is based on the fact that you can go to jail if you lie to the oracle. What does this have to do with service endpoints?

This website (https://www.shodan.io/) indexes all internet connected devices....on the planet... so when your web cam is internet enabled but you didn't secure it properly... attackers can see that it's internet enabled, and they can write scripts that crawl this registry and try to break into every device listed.

^ you can do the exact same thing with DIDs.... read the whole VDR, choose an order to attack DIDs, resolve them, start making network requests to their service endpoints, and start trying admin interfaces / default passwords.... anyone who has ever run a web server on the internet knows this is the future we are facing with serviceEndpoints in did documents.

Service endpoints seem necessary in order to point to a resource. How else would I specify the mediator or authorization server? In the case of the authorization server, I could theoretically hand out pointers to my AS but then I would lose the ability to change that pointer in a secure way. So we're back to DIDs.

Since we are on this topic, did:peer is a private did method that does not publish services like this, and that can be used for didcomm.... @dhh1128 https://github.com/dhh1128 not sure if you want to add anything here regarding privacy, but I know you have thought about this a lot

I love did:peer!


OR13 commented 4 years ago

@agropper There is a lot in your comment... most of what I provided was in the category of "hypothetical example"... with some pointers to experiments I have done to see how achievable things are... I think you interpreted most of it correctly... here are the parts I am a bit worried may have caused more confusion...

... Public Oracles on Centralized Registries vs Decentralized Indexes on Blockchains

did:example:123 is created 5 minutes ago, claims to be a prescription, and contains the following services....

"services":[{
     "id":"#prescription",
     "type":"DeaNumber",
     "serviceEndpoint": "https://drugs.example.com/dea/CC8422965"
},
{
     "id":"#hub",
     "type":"Hub",
     "serviceEndpoint": "https://hub.example.com"
}]

^ this data is now on a blockchain and can never be deleted.... but the data behind https://drugs.example.com/dea/CC8422965 is on a web server, and can be updated / deleted.... similarly all data behind https://hub.example.com can be updated and deleted.

There are 2 questions which @csuwildcat is raising:

  1. Should you be able to tell if a DID is a prescription just by looking at the VDR? (He is suggesting yes.)
  2. Should you be able to ask for details about a prescription just by looking at the VDR? (He is suggesting yes.)

These are related to public oracles, in that there might be direct public links (crawl-able) or the service endpoints might be used to ask for those links....

In other words... maybe we don't need to put DeaNumber in the services section.... maybe we can just ask for that from the hub... but regardless, there will be public information in a did document, and in Daniel's view, you should be able to ask for private information immediately... for all dids that register a hub.

OR13 commented 4 years ago

I have opened the following DID Use Case Tickets:

agropper commented 4 years ago

In other words... maybe we don't need to put DeaNumber in the services section.... maybe we can just ask for that from the hub... but regardless, there will be public information in a did document, and in Daniel's view, you should be able to ask for private information immediately... for all dids that register a hub.

See https://github.com/w3c/did-use-cases/issues/99

dhh1128 commented 4 years ago

One use of a service endpoint array might be to present multiple endpoints that are semantically equivalent but that represent alternatives with different connectivity profiles (e.g., an endpoint in Europe, an endpoint in North America, an endpoint in Asia; pick the one you like best). If we allowed this, we'd have to clarify whether the endpoint sequence is a true array (ordered) or matches the default JSON-LD convention of being an unordered set. We'd also have to clarify whether we require all items in the array to use the same transport (could one item be https, and one be bluetooth, for example?). How will we enforce whatever semantics we decree?

Assuming we've thought through the above questions and have clear answers, I am largely agnostic on most of the issues raised by this ticket.

csuwildcat commented 4 years ago

@dhh1128 you make a really good point - if we say the recommended way to do this is creating entirely new entries in the top-level service array for each URI of the same service, then expressing ordering and other things based on the position of descriptors in relation to one another becomes a really gross problem for the service types that need it. This is a great reason to make sure we provide the right affordances inside individual service descriptors themselves, so they can define and contain these kinds of rules without encouraging leakage of their functionality/requirements.

OR13 commented 4 years ago

I'd be in favor of using an object to manage the complexity associated with geography / redundancy / ordering... same thing for preferences for protocols... objects are generally better at managing complexity than arrays, and you can use types to ensure that the object's capabilities / expressive properties are all documented fully.
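
As a rough illustration of the object approach, the TypeScript sketch below gives each endpoint instance an explicit region and priority, so that ordering no longer depends on whether a JSON-LD array is treated as an ordered list or an unordered set. Every name here (`EndpointInstance`, `HubEndpoint`, `instances`, `region`, `priority`, `orderInstances`) is a hypothetical chosen for this example; no service type discussed in this thread defines these properties.

```typescript
// Hypothetical object-valued serviceEndpoint that carries its own ordering hints.
interface EndpointInstance {
  uri: string;
  region?: string;   // e.g. "eu", "us", "asia" (assumed labels)
  priority?: number; // lower number means more preferred (assumed convention)
}

interface HubEndpoint {
  instances: EndpointInstance[];
}

// Order URIs by: preferred region first, then explicit priority,
// then original position as a stable tie-breaker.
function orderInstances(endpoint: HubEndpoint, preferredRegion?: string): string[] {
  const regionRank = (instance: EndpointInstance): number =>
    preferredRegion !== undefined && instance.region === preferredRegion ? 0 : 1;
  const priorityOf = (instance: EndpointInstance): number =>
    instance.priority ?? Number.MAX_SAFE_INTEGER;

  return endpoint.instances
    .map((instance, index) => ({ instance, index }))
    .sort(
      (a, b) =>
        regionRank(a.instance) - regionRank(b.instance) ||
        priorityOf(a.instance) - priorityOf(b.instance) ||
        a.index - b.index,
    )
    .map(({ instance }) => instance.uri);
}
```

Because the preference is stated in the data, a client that reorders or deduplicates the list still arrives at the same choice, and a service type can document rules such as "all instances must share one transport" alongside these properties.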

msporny commented 4 years ago

Array of objects and URLs being strings... that's where most of these discussions ended up in the VC spec. I suggest we do the same here, so we end up with something like:

"services": [{
    "id": "did:example:123456789abcdefghi#hub1",
    "type": "IdentityHub",
    "serviceEndpoint": "https://hub1.example.com"
},{
    "id": "did:example:123456789abcdefghi#hub2",
    "type": "IdentityHub",
    "serviceEndpoint": "https://hub2.example.com"
}]

serviceEndpoint is always a string that MUST be a URI.
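
A minimal TypeScript sketch of what checking that rule could look like follows. It uses the standard `URL` constructor as a stand-in for URI validation, which is an assumption: `URL` accepts only absolute references and does not implement RFC 3986 exactly. The function names are made up for this example.

```typescript
// Approximate "is a URI" check: a string that the URL parser accepts.
function isUriString(value: unknown): value is string {
  if (typeof value !== "string") return false;
  try {
    new URL(value); // throws TypeError when the string cannot be parsed
    return true;
  } catch {
    return false;
  }
}

// Return the ids of services whose serviceEndpoint breaks the
// "always a string that MUST be a URI" rule proposed above.
function invalidServiceIds(
  services: Array<{ id: string; serviceEndpoint: unknown }>,
): string[] {
  return services
    .filter((service) => !isUriString(service.serviceEndpoint))
    .map((service) => service.id);
}
```

Under this rule, an array-valued or object-valued `serviceEndpoint` would be reported as invalid, which is the trade-off the following comments debate.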

OR13 commented 4 years ago

hmm isn't services already an array of objects? Why do we need to allow serviceEndpoint to be of any type other than string or object?

csuwildcat commented 4 years ago

@msporny I am in favor of simply leaving this as is, with a string or object, given the alternatives haven't leaned in the direction I was hoping (either all three types, or just an object). Can I close this with no additional action required?

msporny commented 4 years ago

@msporny I am in favor of simply leaving this as is, with a string or object

Fine by me.