Service Endpoints in the DID Doc might be an anti-pattern

w3c / did-core

W3C Decentralized Identifier Specification v1.0

https://www.w3.org/TR/did-core/

Service Endpoints in the DID Doc might be an anti-pattern #382

Closed msporny closed 3 years ago

msporny commented 4 years ago

TL;DR: We don't need service endpoints in the DID Document... it's an overly-complicated anti-pattern that has a lot of downsides when we already have patterns that are implemented today that would work for all use cases.

It has been asserted that Service Endpoints in the DID Document might be an anti-pattern because, at worst, they can be used to express PII in DID Documents, and in every use case that we know of to date, they can be discovered through other means that are already employed today.

Ultimately, the problem is that developers need to be educated about the dangers of placing PII in service endpoints... many won't read the spec in detail... we have over 70 DID Methods now and the number is only increasing.

What are the chances that a non-trivial subset of them implement unwisely? My guess is the chances are pretty high, and that weakens the ecosystem.

We do have an option to not give developers foot guns... and we should try very hard not to give them any. I'm afraid that non-normative documentation is better than nothing, but not good enough.

Here's what the group resolved yesterday (pending 7 days for objections to the resolutions):

RESOLVED: Discuss in a non-normative appendix how one might model Service Endpoints that preserve privacy.

RESOLVED: Define an abstract data model for serviceEndpoints in normative text, like we have done with verification methods.

RESOLVED: Define how you do service endpoint extensions using the DID Spec Registry.

I wish we would do more than that... there are alternatives that the group should consider in order to discover service endpoints:

Go to an entity's website, which would have a DID Auth button, which you could then use to send them your service endpoints privately using VCs.
Find an entity like we do today -- using a search engine of some kind... schema.org markup can be used to express public endpoints using VCs.

Both of those solutions allow us to 1) Use what we already have today, and 2) address all of the use cases that we know of.

SmithSamuelM commented 3 years ago

@jandrieu But I do agree that given resolver meta data provides proof of control authority, a did doc could be replaced by a verifiable credential to provide the same information.

jandrieu commented 3 years ago

The assumption isn't that it is discoverable, but that it is resolvable (either directly from the DID itself, from a registry, or directly from a peer).

I would argue a different distinction than your comment suggests.

The DID Document SHOULD only be that which provides proof of control authority. And the current, authoritative DID Document is only attainable through the means defined in the DID Method.

Any other transmittal of a DID Document is, by definition, non-authoritative.

SmithSamuelM commented 3 years ago

@dhh1128 @jandrieu I think Daniel Hardman should respond as I believe your proposed rule would be a problem for did:peer

jandrieu commented 3 years ago

As long as did:peer provides a way to definitively get the authoritative DID Document from the peer, we're good. Because did:peer is useless outside that peer relationship, whatever the current document, as provided by the peer, is, I believe, definitive.

Of course, that begs the question about caching... but at least did:peer avoids the complication of different DID Documents existing that could each be "definitive" because whatever the most recent DID Document communicated is definitive and only has validity in the context with that peer. DID Documents for that DID given to a different peer are not in the current context, so there is no conflict.

agropper commented 3 years ago

@dlongley and @jandrieu - I would hope to avoid using VCs at this level, it just seems too heavy as compared to zCap or GNAP but I could be wrong.

The sequence diagram is a huge help to my understanding what's going on as we cross from authentication with did:key to authorization in a specific use-case. Could you please fill out the flow steps using the zCap consistent with the precedent authorization steps?

Thank you

jandrieu commented 3 years ago

@gropper I looked at your sequence diagrams, but I'm not understanding why you have an agent in the loop. Or, for that matter, what any of this has to do with a DID Document and service endpoints as did:keys don't have service endpoints and the DID Document isn't present in your flow.

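That said, I put together a flow https://www.websequencediagrams.com/cgi-bin/cdraw?lz=dGl0bGUgU2VwYXJhdGluZyBBdXRoZW50aWNhdGlvbiBmcm9tAA8Fb3JpegANBih6Q2FwcykKCnBhcnRpY2lwYW50IEFsaWNlIGFzIEEACg1CYW5rAAMOb2IncyBSZW50XG5hIENhciBhcyBCCiMAOxEnc1xuQWdlbnQAUAVBCgoKbm90ZSBvdmVyAGcGLABYBToAcwdpcyBhbHJlYWR5IGEAgTgKZWQgaW50byB0aGUgYmFuaydzIHN5c3RlbQoAgSsFLT4APQZHZXQgY2FwYWJpbGl0eSAKQmFuay0-AIFOBTogUmVxdWVzdCBESUQgZm9yIG5ldwAhDCh3LyBjaGFsbGVuZ2UpAEoORElEIChkaWQ6a2V5Onh5eikgd2l0aCBzaWduZWQALAoAYg5kZWxlZ2F0YWJsZSB6Q2FwIChBKQCBaRQ6IExhdGVyLACCdAdyZW50cyBhIGNhcgpCLT5BOgCBMAhwYXltAIJCBQCDOQwKQS0-QgCBShIAFxUAgVMQQgCCEgkAgVMNZGVmAIFIGACCZgcALggAgVsGAIFXB0EgdG8gADkLIACCHgYkMTAwIGxpbWl0IChBKwCCTgoANQpkLACCGRIiQSsiAIIWFlRoZW4AgiEKdHVybnMAgiMIAIRBBkludm9rZQCDewxmb3IAgQEGAII7BwCEEAdCOgCBGAUKCg&s=default for how I would do what I think you are asking... with the exception of the Agent and the last interchange with the car key as that doesn't seem to have any bearing on the payment authorization.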

Simply get a zCap from the bank and delegate it to the rental agency with an appropriate caveat.

No need to register anything with the bank other than the initial DID for issuing the zCap from the bank.

There is a LOT more complexity that one could expect in this kind of scenario: for example, the rental agency probably needs a way to verify that the capability, in fact, is what Alice says it is before giving her the rental. I don't believe there is yet a standard way to do that.

There are also tons of ways you could integrate an agent, but I'm not sure what the agent is doing in this case. The delegated zCap gives the rental agency everything they need to retrieve payment. What is the agent doing in the process?

Arguably, an even better way to achieve this use case would be to use a lightning channel with $100 in it that the rental agency can close out when Alice is done with the car. That would need neither an agent nor a bank, but it would be dependent on bitcoin.

However, if you want to delegate banking operations AND your bank is willing to use a delegatable zCap for that, then there is no need for Alice's agent (or anyone else) to get involved. Alice just uses her keys (and wallet software) to delegate the zCap from the bank with appropriate caveats.

I'm also not seeing how any of this applies to service endpoints.

With one exception, if you imagine a payment/invoice service endpoint that the rental agency might use, then that idea is understandable, but leaves out how you authorize the rental agency to use that endpoint for a specific amount. Or the agent, for that matter. Are you imagining that the agent has full authority to authorize any transaction for you? This seems like an unnecessary risk.

zCaps fundamentally include the invocation target, defined at the point of issuance. So, from one perspective, the service endpoint is embedded in the zCap. No need at all to publicly list some sort of invoicing service endpoint.
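The delegation idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Capability` class, the `limit_usd` caveat, and the bank URL are hypothetical and are not the real zCap data model, and signatures are left out entirely. The two did:key values are the ones named elsewhere in this thread. The point it shows is that the invocation target travels inside the capability, so no endpoint has to be listed publicly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Capability:
    """Hypothetical, simplified stand-in for a zCap (no proofs or signatures)."""
    invocation_target: str                  # fixed by the issuer at issuance
    controller: str                         # DID allowed to invoke the capability
    limit_usd: Optional[int] = None         # caveat: maximum amount; None = no limit
    parent: Optional["Capability"] = None   # link back up the delegation chain


def delegate(parent: Capability, to: str, limit_usd: int) -> Capability:
    """Delegate to another controller; a delegation may only narrow the limit."""
    if parent.limit_usd is not None and limit_usd > parent.limit_usd:
        raise ValueError("a delegation may not widen its parent's caveat")
    return Capability(parent.invocation_target, to, limit_usd, parent)


def may_invoke(cap: Capability, invoker: str, target: str, amount_usd: int) -> bool:
    """Check the invoker, the target, and every caveat along the chain."""
    if invoker != cap.controller or target != cap.invocation_target:
        return False
    link: Optional[Capability] = cap
    while link is not None:
        if link.limit_usd is not None and amount_usd > link.limit_usd:
            return False
        link = link.parent
    return True


# The bank issues zCap (A) to Alice; Alice delegates it with a $100 caveat.
bank_cap = Capability("https://bank.example/accounts/alice/payments", "did:key:xyz")
rental_cap = delegate(bank_cap, "did:key:def", limit_usd=100)
```

With these definitions, the rental agency's DID can invoke `rental_cap` against the bank's target for up to $100 and nothing more, and nobody else can invoke it at all.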

agropper commented 3 years ago

@jandrieu Thank you for engaging and for framing the discussion in terms of zCaps.

If I understand your proposed flow, Alice uses two did:key DIDs to interact with the Bank: did:key:1...3 to authenticate and did:key:xyz to control zCap (A). Alice uses some other DID to authenticate to Bob's Rent a Car, and Bob's creates did:key:def in order to control its ability to cash in zCap (A) at the Bank.

In terms of the general Alice to Bob authorization use-case, Bob as Requesting Party is approaching Alice with a triplet:

Purpose: to rent car X
Bob's credentials: I'm a Fortune 500 company
Data Request: a $100 capability at a reputable bank or a secure data store with $100 cash

In the general Alice to Bob case, Alice has to do a lot of work here. (1) Evaluate the purpose for the request relative to her policies. (2) Verify Bob's credentials and compare to her policies, and (3) Authenticate and get a $100 voucher from her bank.

The reason for the agent in the loop is to keep Alice self-sovereign. Self-sovereign identity is only the beginning. Today's sad reality is a vast asymmetry of power between the individual and service providers, platforms, and other data brokers. We live in an "attention economy" where almost all of the technology is controlled by others and used to manipulate us. I tend to look at complex and important systems from a medical perspective. When faced with illness (or a legal dispute) we don't assume that the patient (or the defendant) will face the institutions directly. We introduce an agent (doctor, defense lawyer) chosen by the patient, and with expertise they can put at the patient's disposal in a fiduciary capacity. Alice needs technology she chooses to deal with the Bank and with Bobs.

The sequence diagram https://www.websequencediagrams.com/?lz=dGl0bGUgU2VwYXJhdGluZyBBdXRoZW50aWNhdGlvbiBmcm9tAA8Fb3JpegAOBQoKcGFydGljaXBhbnQgQWxpY2UgYXMgQQAKDUJhbmsAAw5vYidzIFJlbnRcbmEgQ2FyIGFzIEIAORInc1xuQWdlbnQATwVBCgoKAF0FLT5CYW5rOiBSZWdpc3RlcnMgZGlkOmtleToxLi4uMwpub3RlIG92ZXIAgQsGLAB8BTogTGF0ZXIsAIEeB2dldHMgYW4gYWdlbnQASQ5TaWduLWluIGEASBBCYW5rLT4AgWAFOiBjaGFsbGVuZ2UAgQYOc2lnbmVkIHdpdGgAgQcPAIEmFiBteQB-BiBhc1xuaHR0cHM6Ly9hbGljZS5leGFtcGxlLmNvbQoAgUITAIFDD3JlbnRzIGEgY2FyAIIeCQCCEBQyLi4udyBmb3IgYXV0aCduABgVAHAZADQFcGF5bWVudApCLT5BQTogcGF5ICQxMDAKQUEAbwVvayB0bwAMCkIAgygIAAgKAIJEBUI6ABcJQUE6IGNhciBrZXkgY2FwYWJpbGl0eQBLBQCCaAdrZXkgZm9yd2FyZGVkIHRvAIRlB3ZpYSBlbWFpbAo&s=default is a simplification of the Alice Rents a Car use-case that I'm hoping to add to our Use Case document. See https://github.com/w3c/did-use-cases/issues/101 It's an attempt to understand interoperability in human terms.

My take-away at this point is that zCaps could work in the general Alice-to-Bob authorization case even if Alice has an agent but that we might prefer GNAP because it will promote human-centered interoperability. Either way, with or without an agent and with zCaps or GNAP, authorization does not seem to require a service endpoint in an authentication DID. How'm I doing?

dhh1128 commented 3 years ago

@dlongley

You seem to be suggesting that if some information X is not atomically bound with a particular key state via the DID Document then there is an insurmountable system security problem. I've intentionally called this information X here instead of "service endpoint" to highlight that what you're arguing is that all Xs must be in the DID Document.

No. "Service endpoint" is not a stand-in for "any kind of data about the DID" in my thinking. I mean exactly and only service endpoints, not a generic x. Characterizing it generically turns my assertion into a straw man. Knowing where to talk to a DID controller (inbound: the endpoint) and knowing how to authenticate the DID controller (outbound: the keys) are the two pieces of info that I claim must have mutual integrity to ensure security. Not other stuff. The reason I am concerned is because I believe the lack of synchronization between these two particular pieces of data can be exploited by creating race conditions in a way that is not risky for other data.

To understand why, consider this thought experiment.

Alice and Bob are acquaintances who wish to carry on conversations with high security. They exchange phone numbers (endpoints) and they also agree on passwords (keys) that will authenticate one to the other. Important to the thought experiment, Alice and Bob might also interact in other ways besides official endpoints (e.g., Alice could meet Bob at a conference and hand him a love letter, encrypting it with her password). The security guarantees and sequencing in communication mediums other than phone calls are undefined -- sometimes they may be good and fast, other times not. These other communication mediums represent Manu's posited alternative methods for communicating endpoints. They are also implied by Joe A's comment that if we want to communicate an endpoint, we just give the other party a VC. They are one of the ways, besides phone calls, that this VC could be shared. We don't know anything about them or their security and timing properties except that they exist and are not the same as the endpoints. (Such channels will always exist; it's impossible to design a system that prevents them.)

So Alice wants to change her phone number from A.endpoint[1] to A.endpoint[2]. Great. She calls Bob at B.endpoint[1] and gives him her existing password, A.pass[1]. Now that he knows it's her, she says, "I'd like you to call me on a new number, A.endpoint[2]". Everything is great. No ambiguity, no security problem. This represents the simple model that Manu and Joe are advocating. (BTW, notice that it doesn't use the alternative communication channel. That's why it's so clean and safe -- and so unsatisfying to me.)

The problem is, reality is messier than that. What if Alice has multiple devices, and so does Bob? What if each has multiple endpoints and multiple keys (as most orgs do)? And what if Alice and Bob are software, not human beings, and they're carrying on multiple conversations in parallel, at a mixture of machine and human speeds, at the same time?

Now you can have race conditions:

Conversation 1, step 6

Alice and Bob are in the middle of negotiating a mortgage that began when Bob reached out to A.endpoint[1]. The negotiation is driven over http endpoints by software, at machine speed. Bob is waiting at B.endpoint[1] for Alice's message where she commits to pay him back $1M over the course of 30 years.

Conversation 2, step 1

Alice emails Bob to switch to A.endpoint[2], signing the message with A.key[1].

Conversation 3, step 1

Alice changes her DID doc such that A.key[1] is replaced by A.key[2]. Suppose she writes this change to a ledger that typically has 10-60 seconds of global latency.

Conversation 1, step 7

Alice sends to Bob a nonrepudiable commitment, signed by A.key[1], that she will pay the mortgage back.

Can you see the problem? Bob could choose to believe the mortgage commitment is valid (imputing an order where Conversation 1, step 7 precedes Conversation 3, step 1). Depending on latency of the ledger and Bob's ledger cache, this might be quite rational. If he's worried about the sequencing, he could contact Alice to confirm -- but does he do that at A.endpoint[1] or A.endpoint[2]? That depends on the relative order he imagines for Conversation 1, step 7 and Conversation 2, step 1 (and maybe the relative order between 2.1 and 3.1, too). And note how easy it is to make this problem worse if Bob and Alice each have multiple endpoints and multiple keys rather than an assumed single state apiece. And don't even get me started on N-party conversations... (Don't tell me that all Bobs and Alices will centralize so they've reconciled all internal views of their own and everybody else's state. I'm not opposed to someone doing that, but I'm opposed to an imagined universe that requires that centralization. We have to allow the decentralization or what's the point of DIDs?)
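To make the ambiguity concrete, here is a small sketch that enumerates every relative order Bob could impute to the three events and records what he would conclude in each. The decision rules in `bobs_view` are a deliberate simplification and an assumption of this sketch: the commitment counts as valid only if it precedes the key rotation, and Bob confirms at the new endpoint only if the switch precedes the commitment.

```python
from itertools import permutations

EVENTS = (
    "C1.7 commitment signed with A.key[1]",
    "C2.1 switch to A.endpoint[2]",
    "C3.1 rotate A.key[1] -> A.key[2]",
)


def bobs_view(order):
    """What Bob concludes if he imputes this relative order to the events."""
    pos = {event[:4]: i for i, event in enumerate(order)}
    # Assumed rule: the signature counts only if made before the key rotated.
    commitment_valid = pos["C1.7"] < pos["C3.1"]
    # Assumed rule: confirm at the new endpoint only if the switch came first.
    confirm_at = "A.endpoint[2]" if pos["C2.1"] < pos["C1.7"] else "A.endpoint[1]"
    return commitment_valid, confirm_at


views = {bobs_view(order) for order in permutations(EVENTS)}
for valid, endpoint in sorted(views):
    print(f"commitment valid: {valid}, confirm at {endpoint}")
```

Under these assumptions, all four combinations of "valid or not" and "endpoint[1] or endpoint[2]" are reachable, so whoever can influence the order Bob perceives can steer him to any of them.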

Now, you can say, "It doesn't matter. Such decisions are out of scope for the spec. Bob will make whatever decisions he wants to make, and either decide he's satisfied that the mortgage is valid, or it's not. Nobody else cares, and the spec shouldn't, either." But I disagree. Can you see how a malicious attacker that can't see or alter the plaintext of any of these messages can still influence Bob's interpretation of reality by delaying or dropping some messages, or by monkeying with Bob's cache timing, and how Alice could deny a reality Bob believes in? (Alice to the judge: "No, judge. I rotated my key precisely because I was worried that a hacker who had co-opted my key [and, optionally, take your pick: endpoint[1] or endpoint[2]] would agree to that mortgage. And I told Bob so by updating my DID doc on the global ledger.") The attacker can't forge a signature, but (s)he can certainly cause messages to be seen in a different order. At the very least, this can be used for denial of service or faked misbehavior, and depending on message content, the stakes could be higher ("launch missile A", "launch missile B", "belay that order"... WHICH order?). For that matter, Alice herself could be malicious and influence Bob's interpretation. A mortgage is something that needs to be litigatable in a court of law, and if Bob's basis for accepting Alice's commitment is indeterminate, we've built a system on a foundation of shifting sand.

As long as we keep explaining our use cases with simplistic assumptions about a single conversation between two humans, at human speed, with no nefarious actors, we will continue to come to the erroneous conclusion that communicating service endpoints out of band to key changes is fine. But the only way I know to resolve this problem when we face the true complexity is to force ambiguity out of the relative order of changes between service endpoints and keys. This doesn't drive all ambiguity out of the system -- message order is still a bit unpredictable -- but it's no longer possible to play games based on different lines of control for the endpoint and the keys. (Those different lines of control are the essence of the problem: we use keys to control the state of a DID doc, but if we use something else to communicate about endpoints -- even if we sign our communications with keys -- we've created a split brain scenario that's exploitable.)

Look at how the interpretation of the parallel conversations changes if service endpoints are in the DID doc. Conversation 2 (changed endpoint) and Conversation 3 (changed key) are no longer independent, because they share the same source of truth. A malicious attacker (or decentralized chaos) can still wreak havoc with order of delivery between the mortgage conversation and the DID doc update. But now a well designed protocol can require Alice to sign not just a commitment to pay the mortgage, but a hash of the DID doc at the time of the signing. That hash now includes a service endpoint. And Bob can contact Alice at that service endpoint to get a confirmation, and Alice can't deny that she agreed to pay the mortgage back. There's no split brained Alice.
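A minimal sketch of that binding, assuming a sorted-key JSON encoding as the canonical form and using HMAC purely as a standard-library stand-in for a real signature (a real protocol would sign with the asymmetric key listed in the DID doc, and would pick a proper canonicalization). The function names, the `didDocHash` field, and the example documents are all hypothetical.

```python
import hashlib
import hmac
import json


def did_doc_hash(did_doc: dict) -> str:
    """Hash a canonical (sorted-key, compact) JSON encoding of the DID doc."""
    canonical = json.dumps(did_doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def sign_commitment(secret: bytes, terms: str, did_doc: dict) -> dict:
    """Bind the terms to the DID doc state (keys AND endpoint) seen at signing."""
    payload = {"terms": terms, "didDocHash": did_doc_hash(did_doc)}
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    # HMAC is only a stand-in here; it is NOT non-repudiable like a signature.
    payload["proof"] = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return payload


def matches_doc(commitment: dict, did_doc: dict) -> bool:
    """Bob's check: was this commitment made against this DID doc state?"""
    return hmac.compare_digest(commitment["didDocHash"], did_doc_hash(did_doc))


doc_v1 = {
    "id": "did:example:alice",
    "verificationMethod": [{"id": "#key-1"}],
    "service": [{"id": "#endpoint", "type": "ExampleService",
                 "serviceEndpoint": "https://a1.example/"}],
}
# Same keys, but the endpoint has changed.
doc_v2 = {**doc_v1, "service": [{"id": "#endpoint", "type": "ExampleService",
                                 "serviceEndpoint": "https://a2.example/"}]}

commitment = sign_commitment(b"demo-secret", "repay $1M over 30 years", doc_v1)
print(matches_doc(commitment, doc_v1))  # True
print(matches_doc(commitment, doc_v2))  # False
```

Because the hash covers the service endpoint as well as the keys, a commitment made against one DID doc state cannot later be presented as if it had been made against another.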

dhh1128 commented 3 years ago

@agropper said:

@dlongley and @jandrieu - I would hope to avoid using VCs at this level, it just seems too heavy as compared to zCap or GNAP but I could be wrong.

I believe this is similar to my other basic concern, which is that communicating a service endpoint by VC basically makes DIDs dependent on VCs, which in turn depend on DIDs. It's a circular dependency.

@dlongley replied by saying that VCs don't depend on DIDs. This is technically true but practically false. VCs allow any kind of URI as the identifier for a DID subject and the identifier for an issuer -- but I don't believe we have an abundance of production stacks where these identifiers are anything other than DIDs.

I assert that DIDs are a (much) lower level construct than VCs. Communicating the one piece of metadata about DIDs that is likely to be ubiquitous -- how to talk to the controller -- using VCs doesn't make DID docs totally useless on their own. But it means that any meaningful impl of DIDs must also have support for VC validation. That's like putting hostnames rather than naked IP addresses in an IP packet header, making IP depend on DNS. It's a BAD idea from a software architecture perspective.

I actually agree with the general sentiment behind Joe's and Manu's and Dave's comment -- that DID docs should be as simple as possible. I am fine taking out lots of things. Indeed, this is why I have never believed in JSON-LD-style extensibility for them. However, if taking out service endpoints introduces race conditions or obnoxious dependencies, we've gone too far.

agropper commented 3 years ago

I find @dhh1128's arguments for having essential service endpoints in the DID Doc convincing.

I'm reminded of how we write contracts in general, not just mortgages. The contract binds together the non-repudiable identities of the parties (typically through a notary), the points of notification, the terms, and the jurisdiction where the contract will be interpreted in case of dispute. All four of these components are necessary.

In our DID case:

Alice is identified by a DID and the associated non-repudiable signature
the jurisdiction is in the method and resolution
the point of notification to Alice is a service endpoint
the terms are delegated by Alice to an authorization server via a service endpoint

The question then becomes: Can we collapse Alice's notification endpoint and her authorization endpoint into a single endpoint and if so, is that a good idea?

csuwildcat commented 3 years ago

I have found none of the comments arguing for removal of service endpoints compelling or sufficient to address the valid use cases that require them. While the conversation is interesting, what, at this point, is the intended outcome of this Issue in the absence of anything approaching consensus for removal of this feature?

talltree commented 3 years ago

I completely agree with @csuwildcat. I was amazed to see this thread had grown so long I had to invoke a search to find the comment I posted 21 days ago (https://github.com/w3c/did-core/issues/382#issuecomment-685283132).

I was even more surprised to find that, after 3 more weeks of discussion, every point I made in that comment remains: a) true (it has six thumbs up), and b) unaddressed by any subsequent discussion.

Folks, we have a spec to finish. At the very start of the special topic call on service endpoints this Thursday (noon ET), I am going to make the following proposal (originally made by @OR13 in https://github.com/w3c/did-core/issues/382#issuecomment-684905119):

PROPOSAL: In DID Core we shall define an abstract data model for service endpoints the same way we have for verification methods. In that section we shall include a special warning about privacy considerations. In the Privacy Considerations section we shall include a more extensive warning. Lastly, in the Implementation Guide we shall also cover this topic in depth.

agropper commented 3 years ago

The scope of the abstract data model for a verification method allows a controller to:

authenticate somewhere
sign something
rotate or recover cryptographic materials
assert a service endpoint(s)

What will be the scope for the abstract data model we’re defining for a service endpoint?

msporny commented 3 years ago

PROPOSAL: In DID Core we shall define an abstract data model for service endpoints the same way we have for verification methods. In that section we shall include a special warning about privacy considerations. In the Privacy Considerations section we shall include a more extensive warning. Lastly, in the Implementation Guide we shall also cover this topic in depth.

You don't need to make that proposal... we already have that proposal agreed to via the initial resolutions we've made:

RESOLVED: Discuss in a non-normative appendix how one might model Service Endpoints that preserve privacy.

RESOLVED: Define an abstract data model for serviceEndpoints in normative text, like we have done with verification methods.

RESOLVED: Define how you do service endpoint extensions using the DID Spec Registry.

It's a no-op, and it's not the point of contention.

I'll also point out that the following sorts of statements are not helpful for those arguing for service endpoints in DID Documents:

@csuwildcat wrote:

I have found none of the comments arguing for removal of service endpoints compelling or sufficient to address the valid use cases that require them. While the conversation is interesting, what, at this point, is the intended outcome of this Issue in the absence of anything approaching consensus for removal of this feature?

To be crystal clear -- the point of contention is if service endpoints should be in DID Documents at all... if we lack consensus around that, the default is to remove the feature... not keep it.

Let me repeat that, because it seems like folks are missing the point wrt. consensus: If we can't agree on a feature being a good idea -- it gets removed. This is true for all features in the specification. Yes, formal objections can be overruled, but we really don't want to go down that road.

Also, to be clear, I'm not arguing for the removal of service endpoints (I'm pointing out that others might)... I'd just rather we specify at least one that we think would work well for GDPR and CCPA.

For example (I'm using bad but descriptive names on purpose -- we can bikeshed those later):

  "service": {
    "id": "#seeAlsoService",
    "type": "PrivacyProtectingCredentialService",
    "serviceEndpoint": "https://example.com/"
  }

The protocol for the PrivacyProtectingCredentialService allows one to tack the DID on to the serviceEndpoint and get more information related to the DID, expressed as Verifiable Credentials. So, doing this:

GET https://example.com/dids/did:example:123abc

will give you all the public VCs associated with did:example:123abc (including self-issued ones). The endpoint will only allow POSTs from did:example:123abc (authz'd via DIDAuth, or similar). The endpoint has to be only a domain (to ensure that PII isn't written to the ledger). DID Methods may enforce further restrictions (for example, a particular DID Method might allow only a handful of PrivacyProtectingCredentialService domains known to pass privacy tests).
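A minimal sketch of the client side of that lookup, assuming the `/dids/` path segment from the GET example above. The function names `is_domain_only` and `credential_lookup_url` are hypothetical and exist only for illustration; nothing here is defined by DID Core.

```python
from urllib.parse import urlsplit


def is_domain_only(endpoint: str) -> bool:
    """Return True when the endpoint is just https + host, with no path, query, or fragment."""
    parts = urlsplit(endpoint)
    return (
        parts.scheme == "https"
        and bool(parts.hostname)
        and parts.path in ("", "/")      # a bare trailing slash is still "only a domain"
        and not parts.query
        and not parts.fragment
        and parts.username is None       # no user info smuggled into the authority
    )


def credential_lookup_url(service: dict, did: str) -> str:
    """Tack the DID onto the service endpoint, as in the GET example in this comment."""
    if service.get("type") != "PrivacyProtectingCredentialService":
        raise ValueError("unexpected service type")
    endpoint = service["serviceEndpoint"]
    if not is_domain_only(endpoint):
        raise ValueError("serviceEndpoint must be a bare domain")
    return endpoint.rstrip("/") + "/dids/" + did


service = {
    "id": "#seeAlsoService",
    "type": "PrivacyProtectingCredentialService",
    "serviceEndpoint": "https://example.com/",
}
url = credential_lookup_url(service, "did:example:123abc")
```

The domain-only check is the part a DID Method could enforce at write time, so that a path such as `/users/alice` never reaches the ledger.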

Defining this would enable a mechanism compliant with GDPR/CCPA today and provide a concrete example of the type of design that would pass muster from a privacy perspective. If you don't like this PrivacyProtectingCredentialService, you don't have to use it... you can use something else that you feel is better.

So, the counter-proposal that could allow us to keep service and close this issue could be:

PROPOSAL: Define a privacy-protecting service endpoint in the Service section of the DID Core specification.

That would allow us to start work on a PR that would address this issue and work out the details in that PR.

OR13 commented 3 years ago

@msporny are you proposing we provide documentation for this term https://github.com/w3c/did-spec-registries/blob/master/contexts/did-v1.jsonld#L19

is the documentation essentially?:

"Use one or more proxies which strip HTTP headers, timing data, any fingerprinting vectors and provide denial of service mitigation in front of any service exposed via the did document".

"Recommend having a single service endpoint which is used to grant access to additional services in an automatic manner"

"Recommend not adding lots of services to a did document, similar to browser extensions, multiple services will be used to correlate users and attack their privacy".

"Recommend not exposing any services that do not have DNS privacy protection, or that expose an IP address in proximity to the DID subject".

"Recommend TOR....".... etc

msporny commented 3 years ago

@msporny are you proposing we provide documentation for this term https://github.com/w3c/did-spec-registries/blob/master/contexts/did-v1.jsonld#L19

Yes, or something like it. That term is a placeholder for this discussion.

is the documentation essentially? ...

Yes, that is one set of rules we could suggest. We'd work out which ones to include in the PR, probably starting with a small set and seeing how restrictive we can get. Just my $0.02 below:

"Use one or more proxies which strip HTTP headers, timing data, any fingerprinting vectors and provide denial of service mitigation in front of any service exposed via the did document".

Yep... as a SHOULD/RECOMMENDED.

"Recommend having a single service endpoint which is used to grant access to additional services in an automatic manner"

I'm not quite sure what you mean by this one... need more information before I can express an opinion on it.

"Recommend not adding lots of services to a did document, similar to browser extensions, multiple services will be used to correlate users and attack their privacy".

Yep.

"Recommend not exposing any services that do not have DNS privacy protection, or that expose an IP address in proximity to the DID subject".

Yep.

"Recommend TOR....".... etc

Yep.

Again, the smaller the SHOULD set we start with, the easier it'll be to get to something better than what we have today (which is effectively no example on what a good, privacy-preserving service endpoint looks like).
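To make a small SHOULD set concrete, here is a rough sketch of a lint pass over a DID Document that flags two of the recommendations above: many services, and endpoints that expose a raw IP address. The threshold `MAX_SERVICES` and the function name `lint_services` are assumptions made for illustration, not anything the group has agreed on.

```python
import ipaddress
from urllib.parse import urlsplit

MAX_SERVICES = 1  # hypothetical threshold; the thread only says "not lots"


def lint_services(did_document: dict) -> list:
    """Return human-readable warnings for service entries that look privacy-hostile."""
    services = did_document.get("service", [])
    if isinstance(services, dict):       # tolerate a single entry written as an object
        services = [services]

    warnings = []
    if len(services) > MAX_SERVICES:
        warnings.append(
            f"{len(services)} services listed; multiple services can be used for correlation"
        )

    for entry in services:
        endpoint = entry.get("serviceEndpoint")
        if not isinstance(endpoint, str):
            continue                     # this sketch only inspects plain string endpoints
        host = urlsplit(endpoint).hostname
        if host is None:
            continue
        try:
            ipaddress.ip_address(host)   # raises ValueError for ordinary DNS names
        except ValueError:
            continue
        warnings.append(f"{entry.get('id')}: endpoint exposes a raw IP address")
    return warnings
```

Checks such as proxying, DNS privacy, or Tor usage cannot be verified from the document alone, so they are left out of this sketch.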

I'll also note that much of this doesn't apply to peer-wise DID Methods or other non-DLT methods.

csuwildcat commented 3 years ago

If we can't agree on a feature being a good idea -- it gets removed.

If you're asserting a feature that was added via consensus years ago, is used by implementers (in production-level implementations), and is relied on by other specs, gets removed by default because a subset of folks take issue with how it could be used in a subset of cases literally years afterward, I just don't agree with that on any level.

I was also under the impression we had decided on a past call that this wasn't a question of whether SEs were in or out, but how we message certain privacy considerations. It's ridiculous to me that the assertion now appears to be that those were all basically tangential proposals with no bearing or implication on the retention of SEs.

msporny commented 3 years ago

I just don't agree with that on any level.

Then you disagree with the W3C Process, which is fine, but I can't do anything about that.

Please focus on putting forward concrete proposals that can achieve consensus. I've put one forward in https://github.com/w3c/did-core/issues/382#issuecomment-697362108 -- if you'd be ok with that, we can resolve this issue and move on.

csuwildcat commented 3 years ago

I cannot support the addition of ANY specific Service Endpoints in the spec that cast legal/political shadows on others by inference. We should not codify ANY techno-legal formulations of specific features and their applied external uses/impacts. This is a very bad idea that, I believe, falls outside the bounds of what we should be doing as a technical specification body.

msporny commented 3 years ago

I cannot support the addition of ANY specific Service Endpoints in the spec that cast legal/political shadows on others by inference.

Please put forward a proposal that will achieve consensus.

csuwildcat commented 3 years ago

Proposal: Along with codification of the general data model/format of Service Endpoints, add to an appendix or usage guide the considerations one should take into account when constructing Service Endpoint entries that must account for varying levels of privacy, as illustrated through examples.

^ To clarify what I mean in the last part of the text above: some uses of Service Endpoints do not involve the same privacy considerations that others do. An example would be a company with a DID that wants to note the Web domain origins it also controls. In other cases, where a DID owner is trying to remain as anon as possible, you may face an entirely different set of considerations, for which we can provide examples of what to do and what not to do.

msporny commented 3 years ago

Proposal: Along with codification of the general data model/format of Service Endpoints, add to an appendix or usage guide the considerations one should take into account when constructing Service Endpoint entries that must account for varying levels of privacy, as illustrated through examples.

To break that proposal down:

Along with codification of the general data model/format of Service Endpoints

We resolved to do this a few weeks ago -- https://github.com/w3c/did-core/issues/382#issue-688206221

add to an appendix or usage guide the considerations one should take into account when constructing Service Endpoint entries that must account for varying levels of privacy,

We resolved to do this a few weeks ago -- https://github.com/w3c/did-core/issues/382#issue-688206221

as illustrated through examples.

It's arguable that we've resolved to already do this as well, but if we assume we didn't yet agree to do that: I'll take the only remaining part of your proposal that we haven't already agreed to as "PROPOSAL: Add concrete examples to demonstrate how service endpoints might account for varying levels of privacy requirements."

Is that aligned with your mental model?

agropper commented 3 years ago

It would be best if an abstract data model for service endpoints did not dilute the privacy features of DIDs.

So, I ask the experts again https://github.com/w3c/did-core/issues/382#issuecomment-697283182.

dlongley commented 3 years ago

@dhh1128,

Can you see the problem? Bob could choose to believe the mortgage commitment is valid (imputing an order where Conversation 1, step 7 precedes Conversation 3, step 1).

...

Can you see how a malicious attacker that can't see or alter the plaintext of any of these messages can still influence Bob's interpretation of reality by delaying or dropping some messages, or by monkeying with Bob's cache timing, and how Alice could deny a reality Bob believes in? (Alice to the judge: "No, judge. I rotated my key precisely because I was worried that a hacker who had co-opted my key [and, optionally, take your pick: endpoint[1] or endpoint[2]] would agree to that mortgage. And I told Bob so by updating my DID doc on the global ledger.") The attacker can't forge a signature, but (s)he can certainly cause messages to be seen in a different order. At the very least, this can be used for denial of service or faked misbehavior, and depending on message content, the stakes could be higher ("launch missile A", "launch missile B", "belay that order"... WHICH order?). For that matter, Alice herself could be malicious and influence Bob's interpretation. A mortgage is something that needs to be litigatable in a court of law, and if Bob's basis for accepting Alice's commitment is indeterminate, we've built a system on a foundation of shifting sand.

...

A malicious attacker (or decentralized chaos) can still wreak havoc with order of delivery between the mortgage conversation and the DID doc update. But now a well designed protocol can require Alice to sign not just a commitment to pay the mortgage, but a hash of the DID doc at the time of the signing. That hash now includes a service endpoint. And Bob can contact Alice at that service endpoint to get a confirmation, and Alice can't deny that she agreed to pay the mortgage back. There's no split brained Alice.

I think there are multiple threats being conflated above. One is from an attacker that can only prevent, delay, or reorder messages. Another is from an attacker that can sign with Alice's keys. These are very different threats. It's hard to analyze the entire scenario as written because it doesn't get down to the primitives and keep the threats clear. Either Alice signed that mortgage commitment with A.key[1] or she didn't -- either Bob has to consider it could have been Carol instead or not.

TL;DR: In the given scenario, someone signed that mortgage commitment and Bob always has to consider that it could have been Carol, within some risk profile, not Alice. If Alice signed it, then she intended to make the commitment and there's no problem.

There should be no way for an attacker that is limited to preventing, delaying, or reordering messages to cause Bob to think Alice agreed to a contract that she didn't agree to. I, of course, agree with this. Here we are assuming that Alice's keys have not been compromised but that she can also change those keys at will. The idea that the attacker here could make Alice agree to a contract she didn't want is a problem with the semantics of the messages she signs, not their order. It doesn't matter where the messages come from -- as the trust is in the key used to sign and that it is under Alice's control -- which it is here, by assumption.

If Bob trusts the key in the DID Document, then he trusts whatever it signs as authentically coming from Alice. If signing an endpoint is important, include that in what gets signed. It doesn't have to be in the DID Document to do that. This trust is rooted in the use of the key, not in whatever additional data is present in the DID Document. If it were any different, then it would be as I said above, any X that Bob wants to trust as being bound to a key must be put into the DID Document. If Alice can simply say "Well, that endpoint wasn't in my DID Document at the same time as the key I signed with" then what about the other things she signed? The contract said she agreed to a 30 year term but that wasn't in her DID Document at the time, it was only in the signed contract, so it doesn't count?
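A rough sketch of "include that in what gets signed": the endpoint travels inside the signed payload instead of inside the DID Document. HMAC-SHA256 is used here only as a stand-in so the example runs with the standard library; a real deployment would use the asymmetric key from the DID Document. The field names `commitment` and `replyEndpoint` are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict, key: bytes) -> str:
    # STAND-IN: HMAC instead of a real asymmetric signature (assumption for this sketch).
    return hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()


def verify(payload: dict, signature: str, key: bytes) -> bool:
    return hmac.compare_digest(sign(payload, key), signature)


alice_key = b"stand-in for A.key[1]"
payload = {
    "commitment": "Alice agrees to a 30 year mortgage term",
    "replyEndpoint": "https://example.com/",   # the endpoint is bound by the signature
}
signature = sign(payload, alice_key)

# Swapping the endpoint after the fact invalidates the signature.
tampered = dict(payload, replyEndpoint="https://attacker.example/")
```

Under this arrangement the binding between key and endpoint comes from the signature over the payload, which is the point being argued here; whether that is sufficient for the ordering concerns raised earlier in the thread is a separate question.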

No, you may say -- this doesn't apply because it isn't about the communication channel. I think that's an arbitrary distinction. It's getting inserted in here because we're considering an attacker that can only prevent, delay, or reorder her messages -- but a protocol should not allow that to change the semantics of her messages. Either Alice signed that mortgage commitment or she didn't. This is a protocol and data modeling problem. It is not solved by binding an endpoint to the DID Document. Remember, we're assuming Alice's keys are not compromised in this case. We can introduce that next so that Bob may not know whether or not Alice is Alice.

So the other threat here is from actual key compromise. No amount of binding will solve the problem here. Because this problem cuts at the core assumption for trust in the system (the security/control of the key). Everything rests on that, so throwing out the assumption that the key is not compromised creates an unsolvable problem.

Suppose Alice's keys and endpoint were compromised before the "DID Document hash" was signed. Carol is actually the one that signed that hash and responded on the same service endpoint in the affirmative. This all happens before Alice notices. Carol completes everything with Bob, fully impersonating her. Only then Alice updates her DID Document with a new key and service endpoint because she suspects there could have been a breach. Alice will later deny that she agreed to pay the mortgage back. Whether or not the service endpoint was in the DID Document is irrelevant. There are other variants here as well. You can just keep shuffling these events around and creating problems not because the service endpoint is or isn't in the DID Document, or any other X for that matter, but because the core guarantee of the integrity of the system got its knees knocked out -- the key was compromised. Alice can claim that at any time -- a judgement would need to be made on other factors to determine the legitimacy of her claim.

There's no way to ensure a scenario in which Alice cannot possibly say her key was stolen -- there are no protections against it, precisely because it is an assumption in the system. Alice isn't her key -- and there's no brand of atomicity you can add to the system to make her the same as her key. Carol can use it too if she gets access to it.

So, I don't believe these sorts of problems can be avoided by simply having a "single state apiece". That approach also seems to ignore what would happen when the endpoint itself is a privacy-preserving router/proxy/negotiation mechanism for arriving at some other endpoint for communications. This stuff is all asynchronous and other measures should be taken to avoid confusing the semantics of certain messages to address the first threat.

In short, I don't buy this line of argumentation for requiring service endpoints in DID Documents.

talltree commented 3 years ago

It's arguable that we've resolved to already do this as well, but if we assume we didn't yet agree to do that: I'll take the only remaining part of your proposal that we haven't already agreed to as "PROPOSAL: Add concrete examples to demonstrate how service endpoints might account for varying levels of privacy requirements."

@msporny Thanks for the clarifications above. I support this proposal. If we can see if there is consensus on this proposal at the start of tomorrow's special call on this topic, then we can spend the majority of the call agreeing on what a PR (or PRs) need to include—and who is going to do them. How's that sound?

dhh1128 commented 3 years ago

Okay, Dave. This is fascinating. I think that another assumption might be getting in our way. Let me pare back all the cruft and see if I can expose it. And I apologize for adding another tome to this incredibly long thread, but this is important stuff to understand, even if we retain diverged perspectives when it's done.

In my worldview:

An endpoint isn't guaranteed to provide duplex (two-way) communication.

This is quite different from the mental model in classic web services. You call an endpoint, and the server communicates back to you over the same socket. Messages/payloads/data can flow either direction (either as a request from the caller, or as a response from the server).

I want to support endpoints that are one-way. Imagine an endpoint that's a sink for a message queue, for example: amqp://192.168.4.25:2948. Such an endpoint is a sink where listening occurs, but it's not a way to talk back. Talking and listening are two different activities, sometimes but not always combined. I think endpoints MUST be a place to listen but only MAY be a place to talk back. (Simplex support is required for highly asynchronous HTTP, for partial tunnels and mixed-transport usage, for some onion routing, for passive modes, and for lots of transports that aren't HTTP.)

Anyway, the reason this is relevant is that it seems to me (please correct me if I'm wrong) that commenters on this thread may be assuming different things about the relative capabilities of participants in the system. You, Joe, Manu, and maybe others seem to be assuming that if Alice can say something to Bob, then she can also receive Bob's responses, and vice versa.

On the other hand, I'm assuming that if Alice can say something to Bob, that doesn't imply anything about whether Bob can say something back. The thing that turns a simplex channel into a duplex channel is having the service endpoint of the other party, not the fact that the active endpoint's URI begins with "https". Alice can say something to Bob over Bob.endpoint; Bob is only guaranteed to be able to say something back to Alice if he knows Alice.endpoint.
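A toy model of that assumption, purely illustrative: a party can send to another only if it knows that party's endpoint, and a channel is duplex only when both directions are known. The `Party` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Party:
    name: str
    endpoint: str                                   # where this party listens
    known_endpoints: dict = field(default_factory=dict)

    def learn_endpoint(self, other: "Party") -> None:
        """Record where the other party listens (e.g. after resolving their DID doc)."""
        self.known_endpoints[other.name] = other.endpoint

    def can_send_to(self, other: "Party") -> bool:
        return other.name in self.known_endpoints


def is_duplex(a: Party, b: Party) -> bool:
    """A channel is two-way only if each side knows where the other listens."""
    return a.can_send_to(b) and b.can_send_to(a)


alice = Party("Alice", "amqp://192.168.4.25:2948")
bob = Party("Bob", "https://example.com/")

alice.learn_endpoint(bob)            # Alice can now talk to Bob...
simplex_only = not is_duplex(alice, bob)

bob.learn_endpoint(alice)            # ...and Bob can answer once he knows Alice.endpoint
```

Nothing about the URI scheme of `bob.endpoint` enters into `is_duplex`; only knowledge of the other side's endpoint does.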

In such a model, control over keys is only helpful when saying things (outbound). It doesn't help when listening (inbound). Yet a DID controller needs to exercise control over how they listen IN ADDITION to how they speak. And -- THIS IS THE CRUCIAL POINT -- it's really problematic to exercise control over how you listen by controlling how you speak, if communication isn't guaranteed to be duplex.

Think about this in the physical world. You control how you speak by impulses to your vocal cords. How do you control what you hear? Well, you could also do that via vocal cords: "Alexa, play 'Kind of Blue' by Miles Davis." This is the analog to publishing a VC (speaking) to change your endpoint (listening). But this doesn't always work in the physical world. If the environment is noisy, or you don't speak the right language, or you have an uncooperative guitar soloist in the next room, what do you do? You exert control in a different way, by clapping your hands over your ears, getting headphones, unplugging the amp, or taking a walk.

It doesn't always work in the virtual world, either. How do you deliver a VC announcing your new endpoint, when the only channel that exists is incoming simplex? Answer: use another channel that's simplex the other way. But...

Often, it's desirable to coordinate the timing or sequencing of talking and listening. Hard to do when there's no mutexing mechanism between the two channels. My extended scenario above tried to explain why NOT coordinating could have security consequences in the context of DID docs. You are right that I'm combining multiple threats, but I'm not conflating them. Most exploits take advantage of how multiple weaknesses come together to expose a composite vulnerability. Just because they're not individually dangerous does not mean they're harmless in the aggregate.

You say that the entire system we've built rests on the assumption that keys are not compromised. Which of the following do you mean?

The entire system rests on the assumption that keys are not compromised BEFORE THEY ARE ROTATED.

Or the more ambitious variant:

The entire system rests on the assumption that keys are not compromised EVEN AFTER THEY ARE ROTATED.

I am assuming the former, because why would we ever rotate keys otherwise? But it seems you might be claiming the latter, because that's the only model that justifies a claim that whether Alice's key signed -- not when it signed -- is the only relevant question. If you buy the first perspective, then relative timing matters. (Yes, there's always the possibility that disputes can arise, even if you can prove relative timing. Alice can claim her child grabbed her phone and signed the mortgage. But it's now possible to constrain a signer's accountability to a time range. This is quite valuable -- again, otherwise why would we rotate keys?)
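The "constrain a signer's accountability to a time range" idea can be sketched as a simple window check. This assumes a trustworthy `signed_at` timestamp is available, which is exactly the hard part under discussion; the `KeyRecord` type and `accountable` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class KeyRecord:
    key_id: str
    valid_from: datetime
    rotated_at: Optional[datetime] = None   # None means the key is still current


def accountable(key: KeyRecord, signed_at: datetime) -> bool:
    """True if the signature falls inside the window in which the key was in force."""
    if signed_at < key.valid_from:
        return False
    return key.rotated_at is None or signed_at < key.rotated_at


key1 = KeyRecord(
    key_id="A.key[1]",
    valid_from=datetime(2020, 1, 1, tzinfo=timezone.utc),
    rotated_at=datetime(2020, 6, 1, tzinfo=timezone.utc),
)
before_rotation = accountable(key1, datetime(2020, 3, 1, tzinfo=timezone.utc))
after_rotation = accountable(key1, datetime(2020, 7, 1, tzinfo=timezone.utc))
```

The check itself is trivial; the disagreement in this thread is about how Bob obtains an ordering of events that he can rely on as input to it.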

And there's the nub. How do you announce either rotation or compromise? By emitting a signed statement of some kind. In other words, you talk (vocal cords, outbound comms). How do you confirm the order of a signed statement relative to some interesting event, when inevitable complications arrive in a decentralized world? By responding to a question -- which requires you to listen. But you can't respond to a question in a determinate sequence if saying something and exercising control over how you listen are bifurcated. In a duplex channel, you get a natural ordering. But that's the very thing that gets lost if endpoints are removed from a DID doc (given my claim that you can't assume duplex for arbitrary channels that might be used).

agropper commented 3 years ago

The simplex - duplex perspective is useful. As a DID-controller, I control if and how my DID Doc will be discovered. I might send my DID to Bob or I might post my DID in a public directory along with metadata that, for instance might allow me to get targeted offers based on my Zip-code. Neither of these is a privacy problem in themselves because the intention is clear.

The next thing that happens is either Bob resolves my DID document or Bob looks up my DID in some well-known place, like Google, the way we Google for UPS Tracking Numbers rather than bothering to prepend http://ups.com/. Alice can't intentionally control this step without creating a new DID at every opportunity, because Alice has absolutely no control over what metadata anyone out there attaches to any DID.

This means that the only thing Alice can do intentionally is to control a service endpoint in the DID document. This does not solve the problem of Bob or anyone else posting her DID along with some metadata, but at least it decreases the incentive for people to Google DIDs because it's easier than resolving them. In many cases, I prefer to Google an identifier because I hope to find out negative or unintended associations that contribute to the reputation of the subject of that identifier.

Either way, once Bob has either resolved a DID or Googled it, Bob has an endpoint to use. Bob now has to formulate a message or a request. Alice's service endpoint bears the cost of dealing with the spam.

When the spam is too much, Alice decides to deal with her endpoint for incoming messages the way she would deal with a compromised private key - rotate it. How does Alice convince Google and the other thousand data brokers to rotate or forget the service endpoint they have on file?

My point is that it's not enough for Alice to control her DID document. She must also be able to run a very powerful spam filter, one that imposes significant costs on Bob to produce credentials, a data scope, and a purpose for any incoming communication to any endpoint that Alice might be listening to.

The default service endpoint in a DID document needs to be an authorization server or a mediator because those are the only two types that actually give Alice a prayer for filtering the spam.
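A rough sketch of the kind of gatekeeping such a default endpoint might apply before anything reaches Alice: the sender has to present credentials, a data scope, and a purpose. The `IncomingRequest` shape and the `admit` function are assumptions for illustration and do not describe any particular authorization server or mediator.

```python
from dataclasses import dataclass, field


@dataclass
class IncomingRequest:
    sender_did: str
    credentials: list = field(default_factory=list)  # e.g. VCs presented by the sender
    scope: str = ""                                   # what data or action is requested
    purpose: str = ""                                 # why the sender is asking


def admit(request: IncomingRequest, accepted_scopes: set) -> bool:
    """Admit only requests that carry credentials, a scope Alice accepts, and a stated purpose."""
    return (
        bool(request.credentials)
        and request.scope in accepted_scopes
        and bool(request.purpose.strip())
    )


accepted = {"offers:zip-code"}
spam = IncomingRequest(sender_did="did:example:bob")
legit = IncomingRequest(
    sender_did="did:example:bob",
    credentials=["<presented VC>"],
    scope="offers:zip-code",
    purpose="send a targeted offer",
)
```

Credential verification is omitted here; the sketch only shows that the cost of forming an admissible request falls on Bob, not on Alice.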

msporny commented 3 years ago

Resolutions from https://www.w3.org/2019/did-wg/Meetings/Minutes/2020-09-24-did-topic

Resolution 1: The ability for a controller to optionally state at least one service endpoint in the DID Document increases their control and agency

Resolution 2: Add concrete examples to the Privacy Considerations section to demonstrate how service endpoints might account for varying levels of privacy requirements.

Resolution 3: Add privacy guidance that establishes that there is a privacy spectrum and publication strategies along that privacy spectrum of how service endpoints might be published.

Resolution 4: Add privacy guidance that discourages services from being expressed in DID Documents that are published to Verifiable Data Registries unless a DID Method specification have given specific guidance about how privacy concerns are addressed.

These resolutions help us get to closure on this issue by:

Establishing that DID Core will specify the service property.
Establishing that we will document the arguments for and against publishing service endpoints in DID Documents on VDR registries, as well as their privacy implications in DID Core.

msporny commented 3 years ago

This issue can be resolved by writing a PR that addresses the resolutions raised by the group. This issue is waiting for an editorial PR to be written, thus is a low priority to get done before CR.

tahpot commented 3 years ago

Wow, what a thread and subsequent meeting on 09/24. I learned a lot, so thanks to everyone for their great contributions and thoughtful discussion.

At risk of opening a can of worms that has seemingly been shut, I see value in introducing an optional hiddenService (or unpublishedService?) core property that defines a single service endpoint for external users to request access to a set of hidden serviceEndpoints that must be requested at a point in time.

This hiddenService endpoint would only support a sub-set of common protocols (HTTP...) and auth methods (TBD). The endpoint would be expected to return a signed DID document listing all the service endpoints visible to the requester.

This allows the spec to:

be explicit about distinguishing between visible / hidden service endpoints
provide a means to "discover" hidden service endpoints if you only have a user's DID. If the endpoint is hit with no authorization, a list of public endpoints can be returned, but they are never published, so the controller can remove them at any time
enable dynamic endpoints to exist, whereby a requestor's credentials could determine a subset of private service endpoints to be returned (i.e., I'm okay with letting Google contact me via Twitter, but not via my PornHub account). As discussed in this thread, I could alternatively provide many VCs to Google. However, imagine if I have 50 different accounts: that's a lot of data to send (you can't embed all that in an onboarding URL!), plus any time that information changes I need to resend every serviceEndpoint to Google (and others). I'm better off providing Google with an auth token to access my hidden service endpoints, dynamically controlling which endpoints are returned for Google's auth token (or equivalent).
avoid breaking the existing services property, so the spec can clearly state that services is a list of public endpoints and should be used with extreme caution.
state that any serviceEndpoints returned from the hiddenService should not be published / indexed and doing so could put the publisher in breach of various laws due to PII information.
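A minimal sketch of the filtering step such a hiddenService endpoint might perform, assuming each stored entry carries a hypothetical `allowedRequesters` field (absent means public). Authentication of the requester and signing of the returned document are both omitted.

```python
from typing import Optional


def visible_services(services: list, requester: Optional[str]) -> list:
    """Return the service entries the given requester is allowed to see.

    `requester` is None for an unauthenticated request, which only sees public entries.
    """
    visible = []
    for entry in services:
        allowed = entry.get("allowedRequesters")      # hypothetical field; None means public
        if allowed is None or (requester is not None and requester in allowed):
            # Strip the access-control field before returning the entry.
            visible.append({k: v for k, v in entry.items() if k != "allowedRequesters"})
    return visible


stored = [
    {"id": "#public", "type": "Website", "serviceEndpoint": "https://example.com/"},
    {
        "id": "#social",
        "type": "SocialAccount",
        "serviceEndpoint": "https://social.example/alice",
        "allowedRequesters": ["did:example:google"],
    },
    {
        "id": "#private",
        "type": "SocialAccount",
        "serviceEndpoint": "https://private.example/alice",
        "allowedRequesters": [],
    },
]

anonymous_view = visible_services(stored, None)
google_view = visible_services(stored, "did:example:google")
```

Because the list is computed per request, the controller can change or withdraw an entry without resending anything to parties that were previously granted access.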

agropper commented 3 years ago

How would hiddenService be different from a GNAP-protected resource?

tahpot commented 3 years ago

As far as I understand GNAP, hiddenService could well be a GNAP-protected resource with the spec defining the structure of the returned resource.

OR13 commented 3 years ago

I don't see the value of hiddenService as a separate field... but I do see the value of a service type which accomplishes the same functionality. Also related to hidden services: https://github.com/BlockchainCommons/did-method-onion

TelegramSam commented 3 years ago

DIDComm supports the disclosure of supported protocols at the discretion of the DID owner for all the reasons stated above, without a separate hiddenService field. Given that disclosure is likely to be interactive in some way, specifying the method to do so in the spec seems limiting to the development of new and improved ways to accomplish that disclosure.

OR13 commented 3 years ago

@TelegramSam agree, I would personally like to see a service type of DIDComm or something similar... so that you can start interrogating the service directly...

jandrieu commented 3 years ago

@OR13 I thought DIDComm was transport agnostic. How would you know what transport to use? Or is there a standard http API that provides a DIDComm binding for URLs?

OR13 commented 3 years ago

@jandrieu assuming the serviceEndpoint=https://example.com/... you would know it's HTTP ready. AFAIK the DID Core spec has no examples other than HTTP, but I assume there might be service definitions that express type=DIDComm transport=bluetooth... DID Core would be responsible for defining services sufficiently to support transport agnosticism; IMO it's not doing a great job of that today.

jandrieu commented 3 years ago

@OR13 Ok. That matches my expectation. The addition of a transport property might do the trick, but I'll leave that to DIDComm folks.

dhh1128 commented 3 years ago

assuming the serviceEndpoint=https://example.com/... you would know it's HTTP ready

Yes.

Or you could have...

type=DIDComm and endpoint=mailto:inboxes@agentsrus.com OR
type=DIDComm and endpoint=kafka:kafka.agentsrus.com/DIDComm OR
type=DIDComm and endpoint=bluetooth:mydeviceid OR
type=DIDComm and endpoint=post:123+main+street+Anywhere+USA+12345 OR
type=DIDComm and endpoint=s3:bucketid OR
type=DIDComm and endpoint=tor:foo.onion/xyz ...etc

In all cases, the encryption/packaging/security guarantees are identical. The logical bytes of the messages are also identical, although they may be MIME-encoded for email or use transfer chunk encoding with HTTP POST. This is what is meant when we say that DIDComm is transport agnostic.
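
A small Python sketch of that idea, under stated assumptions: the message is packed once, and only the outer framing is chosen per endpoint scheme. The endpoints are copied from the list above; `frame_for_transport` and the placeholder bytes are hypothetical stand-ins, not part of any real DIDComm library.

```python
from urllib.parse import urlsplit

# Endpoints taken from the list above.
ENDPOINTS = [
    "mailto:inboxes@agentsrus.com",
    "kafka:kafka.agentsrus.com/DIDComm",
    "bluetooth:mydeviceid",
    "tor:foo.onion/xyz",
]


def frame_for_transport(endpoint: str, packed: bytes) -> tuple[str, bytes]:
    """Return (scheme, payload) for one endpoint.

    Transport-specific wrapping (MIME for email, chunked encoding for
    HTTP POST, and so on) would be applied by whatever sends the payload.
    The packed message itself is passed through unchanged.
    """
    scheme = urlsplit(endpoint).scheme
    return scheme, packed


# Placeholder for an already encrypted and packaged message.
packed_message = b"<packed DIDComm message>"
framed = [frame_for_transport(e, packed_message) for e in ENDPOINTS]
```

Every entry in `framed` carries the same bytes; only the scheme used to select a sender differs.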

DIDComm runs arbitrary protocols, so you never need more than one endpoint. One of the protocols you can run is a feature discovery protocol that lets you discover what other protocols the other party supports/is willing to engage in. A hidden service endpoint is thus unnecessary; any agent gets to decide what services it wants to expose to each party that contacts it there.
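
To sketch what per-party disclosure could look like (this is an illustration, not the actual DIDComm feature discovery protocol), the following Python snippet filters the protocols an agent supports by who is asking. The protocol names, DIDs, policy table, and `discover_features` are all hypothetical.

```python
# Everything this agent can do. Names are illustrative only.
ALL_PROTOCOLS = {"trust-ping", "issue-credential", "present-proof"}

# Hypothetical policy: which protocols to reveal to which party.
POLICY = {
    "did:example:alice": {"trust-ping", "issue-credential"},
}

# What an unknown party gets to see.
DEFAULT_DISCLOSURE = {"trust-ping"}


def discover_features(requester_did: str) -> list[str]:
    """Return only the protocols this agent chooses to reveal to the requester."""
    allowed = POLICY.get(requester_did, DEFAULT_DISCLOSURE)
    return sorted(ALL_PROTOCOLS & allowed)
```

A known party sees more than a stranger does, and nothing about the undisclosed protocols needs to appear in the DID document.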

tahpot commented 3 years ago

A hidden service endpoint is thus unnecessary; any agent gets to decide what services it wants to expose to each party that contacts it there.

That makes sense, so the "feature discovery protocol" could be used via DIDComm to expose additional services.

However, the definition of those services would differ from the serviceEndpoint spec within DID Core. Is that inconsistency acceptable? Is the dependency on DIDComm for discovery of these additional services acceptable?

dhh1128 commented 3 years ago

However, the definition of those services would differ from the serviceEndpoint spec within DID Core. Is that inconsistency acceptable?

I'm not sure. Could you say more about what inconsistency you're noticing?

Is the reliance on DIDComm for discovery of these additional services acceptable?

Probably not for everyone. I wasn't arguing that everyone should adopt DIDComm; I was just explaining why, if you assume DIDComm, you don't need a solution for this additional challenge.

tahpot commented 3 years ago

Probably not for everyone. I wasn't arguing that everyone should adopt DIDComm; I was just explaining why, if you assume DIDComm, you don't need a solution for this additional challenge.

I think that's the heart of what I'm trying to say.

While it's technically possible to use DIDComm (or another type), we can't assume everyone is going to use DIDComm to communicate a "hidden serviceEndpoint".

As it seems very useful to support the concept of a hidden serviceEndpoint (or similar), I would prefer to see such capability explicitly defined in the spec.

agropper commented 3 years ago

It’s a ‘chicken and egg’ type of problem for interoperability.

Does the transport come first (to an inbox, for an authorization request)? or
Does the publication come first (to reveal a resource somewhere that you may or may not be authorized to access)?

The chicken arrow goes in one direction and the egg arrow goes in the other direction relative to the DID controller subject. That’s the “hub” of the problem.

msporny commented 3 years ago

I have authored PR #511 to address the resolutions the WG made here: https://github.com/w3c/did-core/issues/382#issuecomment-703119593

This issue can be closed once PR #511 is merged.

agropper commented 3 years ago

Made suggested changes to https://github.com/w3c/did-core/pull/511#pullrequestreview-556071256 that I believe are consistent with the four resolutions.

dhh1128 commented 3 years ago

I turned Adrian's latest suggestions into PR #515, which is an alternative embodiment of the resolutions made here. If accepted, this would supersede PR #511.

msporny commented 3 years ago

This issue will be closed once PR #515 is merged.

msporny commented 3 years ago

PR #515 has been merged, closing.