w3c / did-core

W3C Decentralized Identifier Specification v1.0
https://www.w3.org/TR/did-core/

Service Endpoints in the DID Doc might be an anti-pattern #382

Closed msporny closed 3 years ago

msporny commented 3 years ago

TL;DR: We don't need service endpoints in the DID Document... it's an overly-complicated anti-pattern that has a lot of downsides when we already have patterns that are implemented today that would work for all use cases.

It has been asserted that Service Endpoints in the DID Document might be an anti-pattern because, at worst, they can be used to express PII in DID Documents, and in every use case that we know of to date, they can be discovered through other means that are already employed today.

Ultimately, the problem is that developers need to be educated about the dangers of placing PII in service endpoints... many won't read the spec in detail... we have over 70 DID Methods now and the number is only increasing.

What are the chances that a non-trivial subset of them implement unwisely? My guess is the chances are pretty high, and that weakens the ecosystem.

We do have an option to not give developers foot guns... and we should try very hard not to do that. I'm afraid that non-normative documentation is better than nothing, but not good enough.

Here's what the group resolved yesterday (pending 7 days for objections to the resolutions):

RESOLVED: Discuss in a non-normative appendix how one might model Service Endpoints that preserve privacy.

RESOLVED: Define an abstract data model for serviceEndpoints in normative text, like we have done with verification methods.

RESOLVED: Define how you do service endpoint extensions using the DID Spec Registry.
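For readers who want something concrete, here is a minimal sketch of what an abstract service entry could look like once separated from any one serialization. It assumes three properties (`id`, `type`, `serviceEndpoint`); the class name, the helper, and all example values are hypothetical and not taken from the resolutions.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Service:
    """One abstract service entry: an identifier, a type, and where to reach it."""
    id: str
    type: str
    serviceEndpoint: str


def to_json(service: Service) -> str:
    """Serialize the abstract entry into one concrete representation (JSON)."""
    return json.dumps(asdict(service), sort_keys=True)


# Hypothetical values; note the opaque path segment rather than a username.
example = Service(
    id="did:example:123#datastore",
    type="ExampleDataStore",
    serviceEndpoint="https://datastore.example/8f2c1a",
)
```

Keeping the model abstract means the same entry could be carried in a DID Document, a VC, or some other container without redefining it.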

I wish we would do more than that... there are alternatives that the group should consider in order to discover service endpoints:

Both of those solutions allow us to 1) Use what we already have today, and 2) address all of the use cases that we know of.

csuwildcat commented 3 years ago

The URI of your personal datastore, if PII (e.g., https://mydatastore.com/csuwildcat), cannot be erased from the immutable ledger. What it exposes can be changed -- but if the ledger stores that URI and supports a historical view, there is no way to delete that identifier for you.

First off, I never said the endpoints need to include identity data or human-friendly info, so I regard the first part about csuwildcat in the URL as a strawman that I will leave to whomever was arguing for that - because it certainly wasn't me. No personal info goes on the ledger, we all agree on this, and I really hope we don't have to constantly rehash this every time we talk about the 99% of the DID use cases that decentralized apps and services represent.

The notion that somehow a GUID appearing in a substrate === it is still active and represents you is one I simply reject. You can deactivate it, and it is now completely insecure and unreliable to assert that it represents someone. To illustrate, let's use the example provided:

It is like carving your phone number in granite; sure, what you say on that phone line can change, but the number itself never can.

A phone number != a Subject, a phone number is an ID that links you, the caller, to some entity who is supposedly the owner of a device it routes to. You can carve that number into stones on as many mountains as you'd like, but if I deactivate my cellphone and stop using that medium of communication, I don't know what the number connects to, but it sure as heck isn't me. Furthermore, in this DID case, I am also issuing an enforced directive to the phone network (DID Method) that ensures anyone who resolves that number is explicitly informed that it is no longer connected to me, and you'd be crazy to assume it still represents me.

mwklein commented 3 years ago

The phone number == a subject when the element of time is included (it is not possible for a phone number to be allocated to more than one person at a single time). For the period of 2018-2019, the phone did represent you and this is something that cannot be undone. While in 2020 the phone number may no longer represent you, it does not change the fact that the phone number DID represent you and cannot be removed from the granite stone.

csuwildcat commented 3 years ago

The phone number == a subject when the element of time is included (it is not possible for a phone number to be allocated to more than one person at a single time). For the period of 2018-2019, the phone did represent you and this is something that cannot be undone. While in 2020 the phone number may no longer represent you, it does not change the fact that the phone number DID represent you and cannot be removed from the granite stone.

The Subject the 'phone number' (DID) represents at any point is something that cannot be determined by the ledger/DID Method, it's a claim that another party makes about an interaction they took part in. This is a complete goalpost move of what this entire thread is about, because the fact remains that the DID Document, service endpoints or not, is not what will contain identity data and real world associations.

jonnycrunch commented 3 years ago

@msporny as you point out, I did not formally object to the resolutions. I voted neutral (0) and simply emphasized that more discussion is needed. Given the amount of discussion documented in this thread, I don't seem to be far off base. I defer to the Chairs regarding how resolutions in special topic calls are handled and brought to the larger group for discussion. My understanding was that these "resolutions" are non-binding and simply a straw man to document a broad view of consensus on the special topic call, and that the discussion would be summarized and conclusions brought back to the larger group for broader discussion. NOT that it served as a binding agreement that interested parties were required to object to under a specific time window. In this sense I think you overstepped your authority in making such an ultimatum. Quoting @iherman: "We should remember that any resolution taken on this call is not binding, it's not the WG call … the WG call must decide"

bumblefudge commented 3 years ago

The DIF Glossary Group would like to report out on some directly-relevant recent work. As the endpoint olympics were just warming up, the WG sent out a survey via various community channels about the nature of endpoints, their relative value, and their standardization. These are the results, in the form of a google spreadsheet open for comments.

The responses fell into 3 classes, based on how many (0, 1, 5+) endpoints they seemed to want standardized. We found different conclusions could be drawn from the divergence:

  1. “Notification” endpoints deemed important by many. It would seem the community needs more clarity on DIDCommv2 notification regime; there is also a persistent confusion around DIDComm v1 versus v2, routing and redirection, the scope of Aries, etc.
  2. We were not being clear about what protocols a given service can be assumed to speak or refuse to speak, what transports would be allowed.
  3. The authorization, mediator, and secure data store endpoints are seen as generally desirable.
  4. The No Endpoint option is confusing and led to a nice discussion here <-- (you are here)

Potential next steps:

Thanks, and we hope this perspective is helpful!

dlongley commented 3 years ago

@dhh1128,

The URI of your personal datastore, if PII (e.g., https://mydatastore.com/csuwildcat), cannot be erased from the immutable ledger. What it exposes can be changed -- but if the ledger stores that URI and supports a historical view, there is no way to delete that identifier for you. It is like carving your phone number in granite; sure, what you say on that phone line can change, but the number itself never can. And all data that you once published for it can be linked to the current version, even if that data is no longer visible. You have no "please delete this" recourse.

Yes, thank you. This.

@csuwildcat,

First off, I never said the endpoints need to include identity data or human-friendly info, so I regard the first part about csuwildcat in the URL as a strawman that I will leave to whomever was arguing for that - because it certainly wasn't me. No personal info goes on the ledger, we all agree on this, and I really hope we don't have to constantly rehash this every time we talk about the 99% of the DID use cases that decentralized apps and services represent.

I tried to make the problem clear above. This isn't about whether you're arguing for a human-meaningful URL -- that's great that you're not. It's about whether or not the immutable VDR can tell the difference between a human-meaningful URL and one that is not. Because the VDR needs to prevent the human-meaningful ones from being recorded. Please re-read that sentence -- it's the crux here.

The Subject the 'phone number' (DID) represents at any point is something that cannot be determined by the ledger/DID Method, it's a claim that another party makes about an interaction they took part in.

I agree that a random identifier that can only be made meaningful by connecting it to other pieces of information is different from one that implicitly contains human-meaningful data, such as your full legal name. One important difference is in who is in control of those other pieces of information, and whether the burden could potentially shift to them. A GDPR related ruling found that an IP address recorded by a German government website can be considered PII because the German government had the ability to demand an ISP reveal the person behind it. DID Methods generally don't have this sort of authority or capability, and this changes the calculus for certain identifiers, IMO.

Consider a DID method's immutable VDR that represents the canonical registry of DIDs for that method, and that only allows you to store DIDs and cryptographic material -- and no human-meaningful identifiers. Here, all other potential linkage happens externally from that system. This may be considered acceptable under certain legal regimes and, given a public interest in the need for the persistence of such identifiers, the fact that the only way to ensure this is to prevent their deletion may allow yet another avenue for a "right to be forgotten" exception.

However, these arguments are hard to make for identifiers that are not merely meaningful because of "linkage" and that do not originate from the DID method themselves, but are identifiers that themselves contain PII. Without some other line of argumentation justifying their presence, these should stay off of immutable VDRs.

Again, this means that an immutable VDR needs to either be able to determine the difference between a human-meaningful service endpoint URL or disallow all service endpoints. This is the problem I'm trying to highlight to you and you seem to be ignoring/overlooking/dismissing it.
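To make the two options concrete: telling a human-meaningful URL from an opaque one is not something a registry can do reliably, but the "disallow all service endpoints" branch is easy to enforce mechanically. A minimal sketch, assuming a hypothetical VDR that only admits DIDs and cryptographic material; the allowlist contents and the function name are illustrative assumptions, not any DID method's actual rules.

```python
# Hypothetical allowlist: identifiers and key material only, no "service".
ALLOWED_KEYS = {
    "@context",
    "id",
    "controller",
    "verificationMethod",
    "authentication",
    "assertionMethod",
}


def vdr_accepts(did_document: dict) -> bool:
    """Accept a document only if every top-level key is on the allowlist."""
    return set(did_document) <= ALLOWED_KEYS


keys_only = {
    "id": "did:example:123",
    "authentication": ["did:example:123#key-1"],
}
with_service = {
    **keys_only,
    "service": [{
        "id": "did:example:123#datastore",
        "type": "ExampleDataStore",
        "serviceEndpoint": "https://mydatastore.example/some-user",
    }],
}
```

Here `vdr_accepts(keys_only)` passes and `vdr_accepts(with_service)` is rejected without the registry ever having to judge what the URL means. The check only looks at top-level keys; it does not inspect what is inside the verification methods.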

One solution to this problem is to turn to a different decentralized registry (or protocol) that does not have the same immutability requirements. Such a registry could be a common place to look for VCs from DID controllers that express service endpoints. Such a registry could also support multiple DID methods (or potentially even be DID method agnostic). I know you have stated that you don't want "yet another registry" -- but I keep trying to highlight that the requirements are different for the different registries. That's why we may need "yet another (decentralized) registry/protocol" if the crawling use case is important. I think the non-crawling use cases can be handled in other ways as previously mentioned.

csuwildcat commented 3 years ago

One solution to this problem is to turn to a different decentralized registry (or protocol) that does not have the same immutability requirements.

The use cases themselves demand the convergence of what you may view as competing requirements (I don't), thus moving the endpoints doesn't change anything for me. There are also a host of game theoretical issues with doing this, but I don't think it will profit us to go into them on this thread.

this means that an immutable VDR needs to either be able to determine the difference between a human-meaningful service endpoint URL or disallow all service endpoints. This is the problem I'm trying to highlight to you and you seem to be ignoring/overlooking/dismissing it.

No, it means I reject the assertion that this is what a DID Method must do and I reject the assertion that what is being prescribed here is the right course. The tech should remain open, flexible, and generative - we can haggle with other parties about how it is used in the meatspace venues where those debates belong.

I am fine if we just agree to disagree. Some folks here can use features others choose not to, and that's perfectly fine.

OR13 commented 3 years ago

There will always be another registry... the question is, how much information in a decentralized public registry should be usable to correlate with private ones....

The more stuff we put in a VDR that is not public key material and deterministic transformations (hashes) of it.... the easier it is to create de-anonymized registries... which have value proportional to the use of the "pseudonymous" ones... consider that DHS would never have funded the monero tracing work if monero had no value.

I'm not opposed to people having bitcoin based dids with service endpoints, and I'm not opposed to people tweeting their home router IP address... I would not recommend it.... but maybe you are running a honeypot and want to catch the baddies that crawl the public registry....

end of the day, we can't prevent people from having facebook accounts, tweeting private keys, or connecting to their dark market web server without TOR and VPN... sometimes we like when people make security mistakes, it makes them easier to hunt :)

I don't consider a data model a security mistake... describing what a private key is, does not create a vulnerability... in fact, security through obscurity (generally considered to be a bad thing) relies on NOT telling the attacker exactly how the system works...

However, there is a difference between security through obscurity and least privilege... if someone doesn't need to know a service endpoint, they shouldn't.... and if the whole world needs to know a service endpoint, that's ok too, but that's probably never true / lazy security engineering.... just make sure you know how to secure it, or pay a big cloud provider who understands security, enough money to make sure it is secured and available.

If you can't secure firearms, you cannot have them.... you would be a danger to yourself and everyone around you. The same applies to a did document with a service endpoint on public immutable VDRs.

my conclusions from this thread:

  1. we should define an abstract data model for services like we have for verification methods
  2. we should warn about them in the privacy and security considerations
  3. we should warn about them in an implementation guide (if one ever gets created... if not... good thing we are committed to 2).

jandrieu commented 3 years ago

I've taken some time to read through this. I was hoping there might be some movement towards a consensus--and I don't think I'm in the larger consensus on what I'm about to say--but I don't see one emerging. So, apologies for taking this even further afield...

First, I agree with Manu that service endpoints in the DID Doc are an anti-pattern. I have raised my privacy concerns before, with limited result.

Service endpoints impact privacy not just because people will inadvertently, or unwillingly be forced to, put PII in service endpoints. They will, and they will be.

They are a problem because they correlate the Subject with specific, concrete services. Even a single service endpoint will inevitably lead to unintended discoveries. However, multiple service endpoints are even worse: we are not only providing a correlation between the Subject and these services, we are correlating these previously unrelated service endpoints with each other when we place them in the same DID Document.
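A rough way to see why multiple endpoints are worse than one: anyone who can resolve the document can enumerate every pair of endpoints it lists, and each pair is a link that did not exist publicly before. A minimal sketch, assuming the document carries a `service` list whose entries have a string `serviceEndpoint`; the function name and example URLs are hypothetical.

```python
from itertools import combinations


def linked_endpoint_pairs(did_document: dict) -> set:
    """Return every pair of endpoints that the document ties together."""
    endpoints = sorted(
        entry["serviceEndpoint"] for entry in did_document.get("service", [])
    )
    return set(combinations(endpoints, 2))


one_endpoint = {
    "id": "did:example:123",
    "service": [{"serviceEndpoint": "https://a.example"}],
}
three_endpoints = {
    "id": "did:example:456",
    "service": [
        {"serviceEndpoint": "https://a.example"},
        {"serviceEndpoint": "https://b.example"},
        {"serviceEndpoint": "https://c.example"},
    ],
}
```

With n endpoints in one document there are n(n-1)/2 such pairs, so the number of new cross-service links grows quadratically while a single endpoint adds none.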

While this is fine in a context where the user has explicit management over access authority, it is NOT fine when the architecture itself is designed to be publicly accessible. DIDs are useless if you can't resolve the DID, so there is a natural demand for DID Documents to be publicly resolvable. However, encouraging the publication of service endpoints through DID Documents is, by definition, encouraging systematic publication of correlatable information.

More importantly, it is a trivial matter to separate the identifier proofing mechanism of DIDs from the service discovery. To be clear, I totally get the value of the directory that handles service discovery. There is a reason that Google has a higher market cap than the entire DNS infrastructure industry combined. That value proposition--the directory--is dangerous both because it will encourage rent-seeking DID Method operators to force all of their DID Documents to support directory functionality out of the box and because it is an unsolved problem to do decentralized discovery. So, I get the perceived value in shoehorning directory capabilities into the DID Document. But it is 100% possible to separate the layer of DIDs, DID Documents, & the resulting provable control of an identifier without reliance on a trusted third party, from the directory capability that is so desired.

The most privacy-preserving approach would be to create a twice-wise unique DID for every accessor to every service endpoint. You want my phone #, I mint a DID that will let you reach me at a particular endpoint, which I can later disable at any time. You want my email address, I mint a different DID that lets you reach me at a different endpoint. There is no reason that you need to use the same DID that is on your driver's license to share your physical address.
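As a minimal sketch of that idea, the bookkeeping could look like the following: one freshly minted identifier per (accessor, service) pair, each of which can be disabled on its own. The `PairwiseDids` class, the `did:example:` prefix, and the endpoint URLs are all hypothetical, and no real DID method or key material is involved.

```python
import secrets


class PairwiseDids:
    """Toy registry that mints one DID per (accessor, service) pair."""

    def __init__(self):
        self._by_pair = {}  # (accessor, service) -> did
        self._records = {}  # did -> endpoint

    def mint(self, accessor: str, service: str, endpoint: str) -> str:
        """Return the DID for this pair, creating a random one if needed."""
        pair = (accessor, service)
        if pair not in self._by_pair:
            did = "did:example:" + secrets.token_hex(16)
            self._by_pair[pair] = did
            self._records[did] = endpoint
        return self._by_pair[pair]

    def resolve(self, did: str):
        """Return the endpoint for an active DID, or None if unknown or disabled."""
        return self._records.get(did)

    def disable(self, did: str) -> None:
        """Cut off one relationship without touching any other."""
        self._records.pop(did, None)
        self._by_pair = {p: d for p, d in self._by_pair.items() if d != did}


wallet = PairwiseDids()
phone_did = wallet.mint("alice", "phone", "https://relay.example/phone")
email_did = wallet.mint("alice", "email", "https://relay.example/email")
```

Because the two identifiers are generated independently, nothing in either one links the phone relationship to the email relationship, and calling `wallet.disable(phone_did)` leaves `email_did` resolvable.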

If we don't build this architecture out to not only encourage that level of privacy, but to engineer the appropriate UX and technical safeguards, then DID Method operators and DID users are going to take the easy way out, jam everything into the same DID Document and completely undermine the privacy that could have been possible.

The only use case that has been able to withstand this scrutiny is that of the portable data hierarchy. That is, the ability to have a DID that acts as the root of a hierarchical set of resources which can be moved from service endpoint to service endpoint without breaking URLs based on those DIDs.

THAT use case demonstrates a compelling argument for a single service endpoint which enables dereferencing to a portable, yet arbitrarily complex hierarchy of resources. But it does not support the requirement for MULTIPLE service endpoints. The only use cases that seem to support that have been directories or directory-like services. (Please correct me if I'm wrong).

So, I can support a compromise of a single service endpoint which itself serves as the point of authorization and consent for accessing actual end resources. That would allow this sort of portable resource hierarchy without undue burden.

DIDs can work without directories. They can also work with directories. There is no need to bake the directories into DIDs. Since this can be separated, I argue that they should be.

I'll go even more afield from consensus and point out that many of the privacy problems we have with DIDs are a result of our languaging and mental model that DIDs refer to a Subject.

Yes. The mistake we are making is framing DIDs as referring to a Subject.

DIDs are symbols. Labels. Identifiers. As such, they can be used to label anything and the meaning of those labels can change over time, intentionally and unintentionally. This is the nature of language itself, as explored exhaustively by Shannon, Goedel, and Chomsky. You can see this in modern discourse today where certain names and labels are "dog whistles" that a subset of the audience interprets in very specific ways--but which appear to have benign meaning to outsiders.

As such, it is fundamentally unknowable by an outsider what a DID actually refers to.

What you can do is use DIDs for a proof-of-control ceremony that mathematically proves that a party has access to presumably secret key material. This proof-of-control is the hook on which DID-based verifications, authentications, and authorizations are based. NOT on the proof that the candidate-in-question is a particular Subject, but rather whether or not they have access to intentionally secret information.

THIS proof-of-control ceremony is absolutely required for DIDs to fulfill the vision we have for them. Directories are not. I don't need to be in a directory to prove control over a cryptographic identifier. That's the whole point. Centralizing directories should be options at a different layer, just like DNS is a different layer than IP. Encouraging DID Methods to set themselves up as directories is the real anti-pattern, one that conflates layers in the architecture that should remain separate.

So, yeah, Service Endpoints in the DID Doc is an anti-pattern. If we have to have them, we should restrict it to a single service that MUST be capable of handling access in such a manner as to support privacy, both regulatorily and ethically. And I will support any work people are doing to solve that problem of decentralized, privacy-respecting discovery.

I'd rather there be NO service endpoints in a DID Document, but I can live with just one. I oppose encouraging multiple service endpoints and, in particular, will continue to advocate against that practice on privacy grounds, both within and outside the working group.

On a more pragmatic level, I don't believe we can resolve the privacy issues of service endpoints in the timeframe we have to publish a DID spec. MAYBE what some of this group wants could be accomplished through means I have yet to discover, without additional privacy burden. MAYBE. But we can provide a cryptographically secure way to prove control over an identifier without reliance on a third party in the time available. I'd much rather we focus our attention on the work that is actually tractable given our timeframe.

Perhaps we should have a topical call where we work through those particular use cases where people believe service endpoints are required to be in the DID Document. I have yet to see anything that is actually required, but have seen much that makes certain business models convenient. Let's work through it--like we did with the portable hierarchy use case--and see if the actual desired value, in fact, depends on service endpoints in a way that can't be realized by a separate directory layer.

dhh1128 commented 3 years ago

@jandrieu That is a really helpful comment. Thank you. It has stimulated some neurons in my brain, I think.

I have been arguing that correctly implemented service endpoints are necessary because they allow cryptographic control of the DID to extend to control over metadata about that DID. Essentially, I want it to be possible to perform the rough equivalent of a database transaction, where keys and metadata are changed together, atomically, or are not changed at all -- or at least I want a strong guarantee of ordering between key updates and metadata updates, such that I always know with 100% certainty which keys are in control when metadata is changed (thus allowing those keys to reliably authorize the metadata update). It is intolerable, IMO, to have a system where a company's public DID keys are guarded in a vault behind 9 layers of protection, but the webpage that announces how you can talk to the company using that DID can be hijacked by DNS, a CDN operator, or the admin of your load balancer or web server.
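The ordering guarantee described here can be sketched with a single totally ordered log that holds both key rotations and metadata updates, so that "which key was in control at this point" always has one answer. This is a minimal illustration only: the event shapes, the `signed_by` label (standing in for an actually verified signature), and the function names are assumptions, not any DID method's format.

```python
def controlling_key_at(log: list, index: int):
    """Return the key in control just before log[index] is applied."""
    key = None
    for event in log[:index]:
        if event["op"] == "rotate":
            key = event["new_key"]
    return key


def metadata_update_is_authorized(log: list, index: int) -> bool:
    """A metadata update counts only if signed by the key in control at that point."""
    event = log[index]
    return (
        event["op"] == "set_metadata"
        and event["signed_by"] == controlling_key_at(log, index)
    )


# Hypothetical history: key-1 is installed, sets an endpoint, then is rotated out.
log = [
    {"op": "rotate", "new_key": "key-1"},
    {"op": "set_metadata", "signed_by": "key-1", "endpoint": "https://agent.example/a"},
    {"op": "rotate", "new_key": "key-2"},
    {"op": "set_metadata", "signed_by": "key-1", "endpoint": "https://evil.example/b"},
]
```

In this history the update at index 1 is authorized, while the one at index 3 is not, because key-1 had already been rotated out. If keys and metadata lived in separately ordered systems, that second case could not be decided with certainty.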

It is this feeling that made me reject @msporny 's assertion that there are perfectly good ways to communicate endpoints already. However, what you (Joe) said about a single endpoint is resonating partly for me. This is exactly how DIDComm works: it is a single endpoint that takes its security from DID key control and that allows you to discover all other services and run all other protocols. Privately.

Now, maybe DIDComm is too politically encumbered to be the single endpoint you said you could tolerate. That's not the thrust of my comment; I'm just mentioning that what you described is pretty close to something with a concrete impl today. And that got me thinking...

What if the real problem here is that we need to split the construct we currently call a "DID doc" into two pieces: a "DID control doc" (pure control key data) and a "DID descriptor doc" (metadata), and what if we stipulated that:

  1. DID control docs are exclusively for the keys used to prove control of the DID (authentication). No other verification methods are allowed.
  2. All other verification methods, service endpoints, and whatever else someone wants needs to go into the DID descriptor doc.
  3. Changes to the descriptor doc need to be justified (authorized) by a specific version of the state in the DID control doc.

Because of item 3, it now becomes possible to publish DID descriptor docs anywhere, anyhow -- as long as the published data is accompanied by a signature. Ledgers can provide this feature, but so can websites, private chat channels, etc. Also, descriptor docs can be crawled, if people feel like putting them in a public place. Discovery is never a use case for DID control docs, however.
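A minimal sketch of how item 3 might work: the descriptor doc names the exact control doc state it relies on (here by digest) and is signed by a key listed in that state. Everything below is an assumption made for illustration: the field names (`controlDocDigest`, `keyId`, `body`), the JSON canonicalization, and especially the signer, which uses HMAC from the standard library as a symmetric stand-in for a real public-key signature.

```python
import hashlib
import hmac
import json


def canonical(obj) -> bytes:
    """Deterministic JSON bytes so digests and signatures are reproducible."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def digest(control_doc: dict) -> str:
    """Identify one specific version of the control doc state."""
    return hashlib.sha256(canonical(control_doc)).hexdigest()


def make_descriptor(control_doc: dict, body: dict, key_id: str, sign) -> dict:
    """Build a descriptor doc bound to the given control doc version."""
    payload = {"controlDocDigest": digest(control_doc), "body": body, "keyId": key_id}
    return {**payload, "signature": sign(canonical(payload))}


def descriptor_is_authorized(control_doc: dict, descriptor: dict, verify) -> bool:
    """Check the version binding, the key listing, and the signature."""
    payload = {k: descriptor[k] for k in ("controlDocDigest", "body", "keyId")}
    return (
        descriptor["controlDocDigest"] == digest(control_doc)
        and descriptor["keyId"] in control_doc["authentication"]
        and verify(descriptor["keyId"], canonical(payload), descriptor["signature"])
    )


# Stand-in signer: HMAC is symmetric and NOT a substitute for a real signature.
_DEMO_SECRET = b"demo-only-secret"


def sign(data: bytes) -> str:
    return hmac.new(_DEMO_SECRET, data, hashlib.sha256).hexdigest()


def verify(key_id: str, data: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign(data), signature)


control_v1 = {"id": "did:example:123", "authentication": ["did:example:123#key-1"]}
control_v2 = {"id": "did:example:123", "authentication": ["did:example:123#key-2"]}
descriptor = make_descriptor(
    control_v1,
    {"service": [{"type": "ExampleAgent", "serviceEndpoint": "https://agent.example/x"}]},
    "did:example:123#key-1",
    sign,
)
```

Since the check needs only the descriptor and the referenced control doc state, the descriptor can travel over a website or a chat channel as easily as a ledger. In this sketch the descriptor verifies against `control_v1` but not `control_v2`; whether a descriptor should stay valid after the control doc changes is a policy question the thread leaves open.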

I think if we gave metadata some reliable place to go, rather than just declaring it an antipattern, the resistance to eliminating service endpoints from a DID doc would evaporate. At least, that's true for me. But I haven't thought this through deeply, so I'm not advocating this answer strongly--just thinking out loud. What other ideas does that spark among readers of the thread?

dhh1128 commented 3 years ago

tagging @SmithSamuelM and @kdenhartog and @peacekeeper re. the comment above. ^^ You can see how this relates to KERI principles and to some stuff we just discussed about network-of-networks interop.

agropper commented 3 years ago

I really appreciate @jandrieu's review and maybe I understand @dhh1128. I would be happy to converge on a single, optional serviceEndpoint that points to a policy decision point (PDP). That way, when I use a DID to authenticate using my private keys I am also able to provide a place for the requesting party (RqP) to try and continue the "first contact".

In no way should this one service endpoint be a directory. The directory, if one is involved is how the RqP discovered my DID in the first place. The RqP can present their credentials, desires, and protocols if they want based on whatever public information I choose to post or not at the serviceEndpoint.

I don't see how I can get this ability to keep my private policies as secret and as reliably bound to my private keys any other way.

In addition, we might normatively allow the one service endpoint to be a Mediator where I can decide if the First Contact looks like a communication, an authorization, or maybe a data store. I don't see the harm in allowing "turtles all the way down" but it could lead to the kind of privacy problems that @jandrieu is describing to us so I'm happy to say that the only service endpoint is a PDP.

SmithSamuelM commented 3 years ago

For me the privacy question comes down to how you are disclosing PII. If I am using KERI and I want to disclose PII in a private way then I would never use a public mechanism to do so. If a DID is resolvable through a public DID resolver and that provides a DID Doc then I am hosed. I already made my private information unprivate. It's too late.

If on the other hand I am using public infrastructure (DID Ledgers, DID resolvers) to disclose public information that later by other means becomes correlated to PII, then all I am doing by restricting what I get to put in the public infrastructure is attenuating the correlation coefficient. I don't prevent it, I just slow it down. And once it becomes correlated I want to de-correlate it. So if I want to de-correlate it then I have to be able to erase either the correlated PII data or the correlating data in my public infrastructure. If the correlating data is on an immutable ledger that cannot be erased then I have to erase the PII data, which I can't because, well, it's PII and erasing it means erasing me. So ultimately we have to be able to erase correlating data, which, absent some more clarity from the Data Rights Privacy regulators, means we shouldn't be using ledgers because, well, they are immutable. So this is the cognitive dissonance of ledgers. They are inherently un-de-correlatable PII privacy violations. It's not if but when. So I understand the tension. Anyone using a ledger for their DID is faced with the problem of attenuating correlating data as much as possible. But it's not a solution, it's just a half-measure. Eventually all data on an immutable ledger will be correlated. It's just the correlation time constant that you are arguing about.

Because KERI consists of un-intertwined hash chained data structures, (i.e. not ledgers in the conventional sense of the word) the correlating information can be erased. You can de-correlate.

To @dhh1128's point, there is no point in having DIDs if we cannot use them to securely bootstrap the exchange of information. Given that the internet security model (DNS/CA) is broken, we will only ever be able to fix it by replacing it with a better security model. So saying service endpoints can be had by other means is a dissimulation. It presupposes that service endpoints can be had by other means that are equivalently secure to the DID mechanism. It's punting the problem without resolving it. It's saying we pretend we can get there by some other means, so let's not address the issue head on. So we either need to provide a secure layer that provides communication parameters or we have just given up on the problem. It's like we have been saying all these years: DIDs are more secure, decentralized, private, etc. But then we go nope. The internet is all we really needed all along. This is the same type of reasoning that resulted in multiple DID:methods. Punting the hard problem so we can focus on the easy one.

dlongley commented 3 years ago

@dhh1128,

I have been arguing that correctly implemented service endpoints are necessary because they allow cryptographic control of the DID to extend to control over metadata about that DID. Essentially, I want it to be possible to perform the rough equivalent of a database transaction, where keys and metadata are changed together, atomically, or are not changed at all -- or at least I want a strong guarantee of ordering between key updates and metadata updates, such that I always know with 100% certainty which keys are in control when metadata is changed (thus allowing those keys to reliably authorize the metadata update). It is intolerable, IMO, to have a system where a company's public DID keys are guarded in a vault behind 9 layers of protection, but the webpage that announces how you can talk to the company using that DID can be hijacked by DNS, a CDN operator, or the admin of your load balancer or web server.

It is this feeling that made me reject @msporny 's assertion that there are perfectly good ways to communicate endpoints already.

If this wasn't clear before, the assertion was to use VCs to express any information (such as service endpoints) beyond the DID itself, verification methods, and verification relationships. So we already have a mechanism to do what you discuss thereafter -- and we don't have to change anything, I don't think. All we need to do is encourage people to express service endpoints using VCs -- and proofs on those VCs can be checked against assertionMethod verification methods from the DID document. The abstract data model for service endpoints allows for them to be expressed in VCs in a supported syntax.
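A minimal sketch of that check, under stated assumptions: the VC is a plain dictionary whose `issuer` is the DID, whose `credentialSubject` carries the service entry, and whose `proof.verificationMethod` names the signing key. The credential type, the example values, and the function names are hypothetical, and this only checks that the named key is listed under `assertionMethod`; it does not verify the signature itself.

```python
def assertion_method_ids(did_document: dict) -> set:
    """Collect verification method ids listed under assertionMethod."""
    ids = set()
    for entry in did_document.get("assertionMethod", []):
        # Entries may be references (strings) or embedded method objects.
        ids.add(entry if isinstance(entry, str) else entry["id"])
    return ids


def proof_method_is_authorized(vc: dict, did_document: dict) -> bool:
    """True if the VC's issuer is this DID and its proof key is an assertion key."""
    return (
        vc["issuer"] == did_document["id"]
        and vc["proof"]["verificationMethod"] in assertion_method_ids(did_document)
    )


did_document = {
    "id": "did:example:123",
    "assertionMethod": ["did:example:123#key-1"],
}
service_vc = {
    "type": ["VerifiableCredential", "ExampleServiceEndpointCredential"],
    "issuer": "did:example:123",
    "credentialSubject": {
        "id": "did:example:123",
        "service": {"type": "ExampleAgent", "serviceEndpoint": "https://agent.example/x"},
    },
    "proof": {"verificationMethod": "did:example:123#key-1"},
}
```

With this shape the DID Document on the registry holds only keys, while the endpoint lives in a credential that can be handed over directly or published somewhere that allows deletion.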

dhh1128 commented 3 years ago

That does not magically resolve my concern, Dave. The issue is that I don't believe individuals as issuers is a good idea from a privacy perspective with any of the solutions we have today. So we have a loss of functionality.

dlongley commented 3 years ago

@dhh1128,

The issue is that I don't believe individuals as issuers is a good idea from a privacy perspective with any of the solutions we have today.

I don't understand how using a VC that states the DID as the issuer and includes only the service endpoint is meaningfully different from publishing a service endpoint in a DID document directly -- from a privacy perspective. Other than that the VC doesn't have to be on the immutable VDR, of course. What are the privacy concerns?

dhh1128 commented 3 years ago

I might be able to get around my concern if every single interaction where an individual wants to give such a VC is a new VC, rather than reused. But I think in some ways that's exactly the same privacy temptation that disclosing PII on a ledger constitutes: I think people are likely to do it wrong.

Setting aside the privacy question, I feel funny about creating a technical dependency where it is reciprocal. DIDs depend on VC's, and VC's depend on DIDs...

dlongley commented 3 years ago

@dhh1128,

VC's depend on DIDs...

VCs do not depend on DIDs.

I might be able to get around my concern if every single interaction where an individual wants to give such a VC is a new VC, rather than reused.

I would expect this to be common in a lot of cases, actually, to turn over a brand new, short-lived VC through a communication channel that has been established to transfer the DID itself as well. If the VC is going to be longer lived, then that's the sort of VC that would live on these decoupled registries that have been discussed in this thread anyway. It would carry the same sort of properties that putting a service endpoint directly into a DID document would, except that it could expire, and be deleted, and so on.

jandrieu commented 3 years ago

@SmithSamuelM I'm not sure I understand this assertion:

So saying service endpoints can be had by other means is a dissimulation.

If I can use the DID Document to verify the authenticity of an affirmative statement about endpoints (e.g., through signature or encryption), wouldn't that make the security independent of the transport? And hence it is perfectly reasonable to assert that "service endpoints can be had by other means"?

I share that goal of securely bootstrapping communications. But that doesn't mean my service endpoints need to be in the DID Document. It just means that the ability to verify the authenticity of an initial assertion about service endpoints must be possible with information in the DID Document.

In fact, when we encourage service endpoints to be in the DID Document--and that DID Document is unsigned per the spec--then we are implicitly deferring trust in the authenticity of that document to the DID Method. So let's be clear: any DID Method could insert a service endpoint without the controller's intention. So... putting endpoints in the document expands the authority question rather than removes it. Which could be removed if we added signatures to the DID Document, but that seems like a bigger change than we can muster at this stage.

I'd suggest that we minimize the dependency we place on the DID Method, and rely on them only for the MINIMUM amount of data required to securely bootstrap communications, which to my mind is the ability to cryptographically verify a single piece of communication. HOW that piece of communication gets communicated is a protocol for a different spec. I could write it on a wall. I could put it on a web site. I could use smoke signals. HOW you find my service endpoint, whether I give it to you or someone else does, is a different problem from verifying that the service endpoint is, in fact, approved/intended/open for use for that DID.

Which raises one potentially interesting value point for some DID Methods: they provide a way to know that at a given definition of NOW, what the authoritative state is for a given DID. You could timestamp a given assertion of endpoints (such as by putting a hash on a ledger), but that would only state a point in time after which we can know that the assertion existed. We cannot know that the assertion has been superseded by a subsequent assertion, which is the equivalent of not knowing if there is a newer DID Document.

So, for ledger-based DIDs, there might be value in providing in a DID Document a hash of a VC containing "current endpoint specifications". Then, if you get ahold of that VC, you can verify that it is current.
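As a rough sketch of that idea, assuming the DID Document carries a digest of the endpoint VC: the property name `endpointCredentialHash` is hypothetical (it is not defined anywhere in this thread or the spec), and sorted-key JSON serialization stands in for whatever canonicalization a real deployment would need.

```python
import hashlib
import json


def credential_digest(credential: dict) -> str:
    """SHA-256 hex digest over a deterministic JSON serialization.

    Sorted keys and compact separators stand in for a real
    canonicalization scheme, which an actual deployment would need.
    """
    canonical = json.dumps(credential, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_current(did_document: dict, credential: dict) -> bool:
    """True if the VC matches the digest published in the DID document."""
    # "endpointCredentialHash" is a hypothetical property name.
    return did_document.get("endpointCredentialHash") == credential_digest(credential)


current_vc = {"issuer": "did:example:123", "serviceEndpoint": "https://new.example.com/"}
stale_vc = {"issuer": "did:example:123", "serviceEndpoint": "https://old.example.com/"}

did_document = {
    "id": "did:example:123",
    "endpointCredentialHash": credential_digest(current_vc),
}

print(is_current(did_document, current_vc))
print(is_current(did_document, stale_vc))
```

The digest only tells a verifier which VC is current; the VC's own proof still has to be verified separately.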

Anyway, I'm curious, Sam, if I'm understanding what you meant about getting service endpoints from somewhere other than the DID Document.

dhh1128 commented 3 years ago

@dlongley :

VCs do not depend on DIDs.

I agree that this is technically true, as the VC spec allows identifiers for issuers (required) and holders (optional) to be URIs or identifiers having the same properties as DIDs. But AFAIK, that distinction only exists in theory, not in practice. How many members of this group have built VC handling stacks that are DID-free? My concern is practical.

DIDs are low-level plumbing, only one step above cryptography. VCs are a higher-level construct with (IMO) a much stronger affinity for JSON-LD, extensible schemas, rich semantics, and so forth. If I need to be able to issue and verify credentials to talk to someone who has a DID, it feels to me like we have a complexity and dependency inversion problem.

agropper commented 3 years ago

I'm lost.

What does Bob do with my DID?

wyc commented 3 years ago

Drawing some examples to further this line of reasoning from @dhh1128

What if the real problem here is that we need to split the construct we currently call a "DID doc" into two pieces: a "DID control doc" (pure control key data) and a "DID descriptor doc" (metadata)...

For discussion purposes only, not a recommendation from me at this time.

DID Control Document

{
  "@context": "https://www.w3.org/ns/did/v1",
  "id": "did:publicchain:0x89205A3A3b2A69De6Dbf7f01ED13B2108B2c43e9",
  "authentication": [ ... ],
  "srv": "didsrv:gdprcompliantchain:0x89205A3A3b2A69De6Dbf7f01ED13B2108B2c43e9" // optional
}

DID Service Document

{
  "@context": "https://www.w3.org/ns/didsrv/v1",
  "id": "didsrv:gdprcompliantchain:0x89205A3A3b2A69De6Dbf7f01ED13B2108B2c43e9",
  "service": [
    {
      "id": "didsrv:gdprcompliantchain:0x89205A3A3b2A69De6Dbf7f01ED13B2108B2c43e9#openid",
      "type": "OpenIdConnectVersion1.0Service",
      "serviceEndpoint": "https://openid.example.com/"
    }
  ],
  "proof": {...} // OR whole thing as JWT
}

Pros

Cons

Other thoughts

dhh1128 commented 3 years ago

I don't understand how using a VC that states the DID as the issuer and includes only the service endpoint is meaningfully different from publishing a service endpoint in a DID document directly -- from a privacy perspective. Other than that the VC doesn't have to be on the immutable VDR, of course. What are the privacy concerns?

One is repudiability. VCs are not repudiable (unless you use ZKPs, and even then, it's only an option, not the default). Unsigned DID docs (e.g., as shared with peer DIDs in DIDComm) are. What this means, in practical terms, is that when Alice receives Bob's DID doc, she can't prove to Carol with the DID doc itself that it's truly Bob's DID doc; unauthorized sharing loses assurance unless Alice archives a transcript of her interaction with Bob and plays it back to Carol. That transcript might include safeguarding terms and conditions, for example. But when she receives a VC from Bob about his endpoint, she can prove to Carol, without Bob's permission, that it's Bob's endpoint. She just has to display the VC in isolation.

Another issue is revocation. When a key on a DID is rotated, it does not retroactively invalidate all the signatures that the key created. Thus, rotating the key doesn't invalidate VCs that claimed the service endpoint was X. To me, that suggests that we have a feature gap that, AFAIK, can only be plugged with real VC revocation. I'd like to believe that we don't need revocation of VCs that convey service endpoints -- but if we do, then we are asking individuals who want DIDs to manage revocation lists, or to expose a phone home service. Either has privacy implications (not to mention logistical problems).

I don't know if either of these matter a lot. I'm not claiming to have done a deep analysis of this problem. I am just challenging the easy claim that VCs-with-endpoints are a direct equivalent of endpoints-in-did-docs. I think privacy is likely to be a dimension where differences surface. Maybe I'll do some more thinking about this, and come to different conclusions.

jandrieu commented 3 years ago

@dhh1128 you say

When a key on a DID is rotated, it does not retroactively invalidate all the signatures that the key created. Thus, rotating the key doesn't invalidate VCs that claimed the service endpoint was X.

Why do you say this? As in, is there spec text I missed that actually states or implies this?

To my understanding if I get a VC created by an Issuer using a DID, signed by something that does not have a matching entry in that Issuer's CURRENT DID Document, it will fail verification.

My expectation is exactly the opposite of what you said. Did I misunderstand what you meant?

talltree commented 3 years ago

Whew! What a thread. I just read the whole thing because on this morning's DID WG call this issue was pointed to as "the outcome of the special topic call on service endpoints". I had no idea it had grown to this length.

I will keep this short. I hate to say it, but so far the content in this thread almost completely misses the two main reasons for keeping service endpoints in the spec (provided, of course, that we include all appropriate privacy flags and warnings).

Reason Number 1: Public DID documents for public entities (like corporations, governments, NGOs, universities, churches, websites) who want to publicly advertise not just their DID and keys but their service endpoints. For these entities:

  1. We want to make it really easy.
  2. We want to make it one-stop/one-hop to post the original DID document and make updates.
  3. There are no GDPR or other privacy concerns.

Reason Number 2: Innovation. Who are we to say that we know all the right and wrong ways to safely use a service endpoint? DIDs are just entering the world. DIDComm is still an infant at the crawling stage. Yes, we should provide all the privacy warnings and guidance that we can. But to suggest that we remove the feature or artificially restrict a DID document to a single instance of a service endpoint seems a little like TBL predicting all the things URLs will be used for back in 1994.

Unless someone can decry these two motivations for service endpoints—which is why they have existed in the spec since the first draft four years ago—I suggest we move forward with Orie's suggestion:

  1. we should define an abstract data model for services like we have for verification methods
  2. we should warn about them in the privacy and security considerations
  3. we should warn about them in an implementation guide (if one ever gets created... if not... good thing we are committed to 2).

csuwildcat commented 3 years ago

@talltree:

[image: ezgif-7-09a6b9b3daf1]

SmithSamuelM commented 3 years ago

@jandrieu

So saying service endpoints can be had by other means is a dissimulation.

If I can use the DID Document to verify the authenticity of an affirmative statement about endpoints (e.g., through signature or encryption), wouldn't that make the security independent of the transport? And hence it is perfectly reasonable to assert that "service endpoints can be had by other means"?

But those other means must be defined as part of some standard for interop's sake, not merely hypothetically referenced. That’s my point about punting.

I share that goal of securely bootstrapping communications. But that doesn't mean my service endpoints need to be in the DID Document. It just means that the ability to verify the authenticity of an initial assertion about service endpoints must be possible with information in the DID Document.

I agree, but the DID spec is useless for DIDs if we don’t do that someplace. Indeed, in other places I have suggested that the tension with DID Docs is that instead of a one-place-fits-all model we should be using a layered approach.

Layer 0: Control Establishment (authoritative signing keys for DID)
Layer 1: Authorizations of communication parameters (routing and encryption) and service endpoints
Layer 2: Other stuff aka DID Doc or Ersatz verifiable credential. (Because layer 0 and 1 have bootstrapped us)

So I would wholeheartedly support a layered approach. But if we are just punting the problem then we are not doing anybody any favors. It’s an abrogation of our responsibility to the concept of DIDs to punt communication parameters and service endpoints.

In fact, when we encourage service endpoints to be in the DID Document--and that DID Document is unsigned per the spec--then we are implicitly deferring trust in the authenticity of that document to the DID Method. So let's be clear: any DID Method could insert a service endpoint without the controller's intention. So... putting endpoints in the document expands the authority question rather than removes it. Which could be removed if we added signatures to the DID Document, but that seems like a bigger change than we can muster at this stage.

Whoa! DID documents are unsigned. When did that happen? If so then they would be useless.

SmithSamuelM commented 3 years ago

@jandrieu @dhh1128

My expectation is exactly the opposite of what you said. Did I misunderstand what you meant?

This is a very revealing comment. If the DID Doc spec is so ambiguous that there is not a clear understanding of how authoritative statements work w.r.t. rotating keys then we have messed up big time.

One of the reasons I wrote KERI was to precisely define control establishment in a rigorous way. Control establishment must include key rotation and what that means in terms of control establishment. A reasonable rule which KERI employs is: 1) a signed statement using the current authoritative set of keys at the time of the signature is valid until revoked or rescinded. This means that merely rotating keys does not revoke or rescind the validity of prior signed statements. Otherwise, every time you rotate keys you would have to reaffirm (reissue) every prior statement signed with the now obsolete keys. This is an impractical rule. So in general rule 1) is the most reasonable rule, but certainly not the only possible rule.

The alternative is as follows: 2) All statements issued/signed with a given set of keys are automatically revoked when the authoritative keys are rotated.

As a side note I had a discussion with a security practitioner who asserted that one could not use rotatable keys to issue verifiable credentials because then one would have to reissue every verifiable credential issued with the old keys. This person was assuming rule 2)

This view (rule 2) of mandatory revocation (reissuance) is a common rule in token based security approaches where all tokens issued under a given set of keys are automatically revoked when you rotate keys.

In order for Rule 1) to be practical one needs to maintain a log of statements signed with a given set of keys or at least a cryptographic commitment to the hashes of the log of statements (merkle tree or hash chained data structure) so that one can verify that a statement was signed with the then authoritative set of keys.

So if one is not using a log (ledger, etc) of signed statements then Rule 1) is unworkable and Rule 2) is the reasonable one.

A hybrid would be:

Rule 3) Only logged signed statements use rule 1) and all other signed statements use rule 2) Any presentation of a signed statement includes a reference to its location in the log to determine the authoritative keys at the time (location) in the log. If log reference is absent then one checks the current authoritative keys and if they differ then the signed statement is stale (invalid).

Clearly the issuer-holder-verifier model of VCs is problematic with rule 2), especially for at-scale issuance of large numbers of credentials, and especially for credentials that expire over time, because you have coupled your key management (rotation, recovery) to the expiration rules for your VCs.

Thus a rubric for a DID method would include which rule, 1), 2) or 3), is to be used to verify signed statements associated with the keys for that DID, including the DID:Doc.

But in order to support 1) or 3) a verifiable log is required.

Each DID method should explicitly define which rule, 1), 2) or 3), is to be used when verifying signed statements. As far as I know no DID method explicitly does this. It's implied.
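To make the three rules concrete, here is a minimal sketch under simplifying assumptions: the log is an in-memory list of rotation and statement events, keys are plain identifiers, signature verification and explicit revocation are left out, and every class and function name is hypothetical rather than taken from KERI or any DID method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statement:
    digest: str                          # digest of the signed statement
    signing_key: str                     # id of the key that produced the signature
    log_position: Optional[int] = None   # where the statement was logged, if at all


@dataclass
class Log:
    # Each event is either ("rotate", key_id) or ("statement", digest).
    events: list

    def key_at(self, position: int) -> str:
        """Key that was authoritative at the given log position."""
        for kind, value in reversed(self.events[: position + 1]):
            if kind == "rotate":
                return value
        raise ValueError("no key established at this position")

    def current_key(self) -> str:
        return self.key_at(len(self.events) - 1)


def valid_rule_1(log: Log, stmt: Statement) -> bool:
    """Rule 1: valid if logged and signed by the key authoritative at that point."""
    pos = stmt.log_position
    if pos is None or not 0 <= pos < len(log.events):
        return False
    if log.events[pos] != ("statement", stmt.digest):
        return False
    return log.key_at(pos) == stmt.signing_key


def valid_rule_2(log: Log, stmt: Statement) -> bool:
    """Rule 2: valid only while the signing key is still the current key."""
    return stmt.signing_key == log.current_key()


def valid_rule_3(log: Log, stmt: Statement) -> bool:
    """Rule 3: logged statements use rule 1, all others use rule 2."""
    if stmt.log_position is not None:
        return valid_rule_1(log, stmt)
    return valid_rule_2(log, stmt)


log = Log([("rotate", "key-1"), ("statement", "abc"), ("rotate", "key-2")])
logged = Statement("abc", "key-1", log_position=1)
unlogged_old = Statement("def", "key-1")

print(valid_rule_1(log, logged), valid_rule_2(log, logged))
print(valid_rule_3(log, logged), valid_rule_3(log, unlogged_old))
```

In this toy model the logged statement survives the rotation to key-2 under rules 1) and 3), while the unlogged statement signed with key-1 goes stale as soon as the rotation happens.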

SmithSamuelM commented 3 years ago

@dlongley :

A DID on its own does not necessarily identify a person. This depends on its use outside of the VDR. However, a URL that includes a person's full name identifies a person, all on its own.

@dhh1128 A DID that has as its subject a person is PII, according to legal experts who've studied PII+GDPR+SSI carefully. (Or perhaps more precisely, experts I've talked to say that they believe legal rulings will eventually formalize this legal conclusion.) The fact that some DIDs have subjects that aren't individuals is irrelevant. Putting a DID that identifies a person onto a public ledger is putting PII onto that ledger, even if it is not obvious to an outside observer that the DID in question has an individual as its subject. Obviousness is not a definitional criterion of PII, and does not eliminate the right-to-be-forgotten requirement.

+1. GDPR considers cryptographic hashes and identifiers PII when they are correlated to PII. This is a severe problem for all ledgers that use DIDs. To arbitrarily consider service endpoints as potential PII but ignore the fact that the DIDs themselves are potential PII is a problematic view.

SmithSamuelM commented 3 years ago

@dhh1128 Key rotation versus signed statement revocation. The authority of a signed statement is imbued to it by its signature and the keys used to create the signature. Is a signed statement authoritative/authorized after the keys used to sign it have been rotated? If not, then the statement is effectively revoked as no longer being an authoritative/authorized statement. If the statement is still authoritative/authorized after the keys used to sign it have been rotated, then it is not effectively revoked by the rotation itself but requires a separate signed revocation statement that rescinds/revokes its authoritative/authorized status. This revocation statement is signed by the current set of authoritative keys, which may be different from the keys used to sign the statement being revoked.

Authorization tokens, which are a form of signed statement, often employ rule 2): when the keys used to sign the token have been rotated, the token’s authorization is implicitly revoked. Effectively the token is always verified by the current set of signing keys, so it will fail verification after rotation. Whereas in rule 1) the verification is w.r.t. the set of signing keys used to create the signature at the time the statement was issued and signed. This means the verifier has to have a way of determining what the history or lineage of control authority was, via a log or ledger, to know that a statement was signed with the authoritative set of keys at the time. This means that the log or ledger must log not only the lineage of keys (key rotation history) but also the statements signed by those keys (a digest of each statement is sufficient). Otherwise a compromise of the current signing keys (which rotation protects from) would allow an exploit to create verifiable supposedly authorized statements after the keys have been rotated. So it either must be rule 1 or 2 or 3. And non-automatic revocation of signed statements requires a log of both the key rotation history and signed statement history.

Obviously if keys are not rotatable, then a signed statement cannot be revoked by merely rotating keys; instead a revocation registry may be used to determine whether a signed statement has been revoked explicitly by a revocation statement. So non-rotatable keys may use a modified rule 4), where there is no key rotation history log or signed statement log but merely a revoked statement log. Typically, though, non-rotatable keys are used for ephemeral identifiers, in which case a revocation log is not used. Instead of rotating keys for ephemeral identifiers you just rotate the identifier (make a new one with a new set of keys) and abandon the old identifier and all its signed statements.

dhh1128 commented 3 years ago

For anyone reading this thread who becomes interested in the side-topic of rotation vs. revocation (what I, Joe, and Sam mentioned), I created a separate issue to move the discussion into its own context: https://github.com/w3c/did-core/issues/386

dlongley commented 3 years ago

@SmithSamuelM,

...to arbitrarily consider service endpoints as potential PII but ignore the fact that the DIDs themselves are potential PII is a problematic view.

Emphasis mine. I agree with your statement here, but the distinction written about above is not arbitrary.

peacekeeper commented 3 years ago

I created an issue in did-spec-registries arguing that there should not be a centralized registry of service types. I think it is loosely related to this issue: https://github.com/w3c/did-spec-registries/issues/125

jandrieu commented 3 years ago

@SmithSamuelM said:

GDPR considers cryptographic hashes and identifiers PII when they are correlated to PII.

This is exactly why we should NOT discuss DIDs as identifying particular subjects. Every framing that does so risks compliance complications, because DIDs do their magic without anyone needing to know the actual subject in any other context (we don't need to know the physical or legal person it refers to). When we think about and advocate DID uses that persistently and permanently refer to an individual, as in this note in section 3.1 https://www.w3.org/TR/did-core/#did-syntax :

That is, a DID is bound exclusively and permanently to its one and only subject. Even after a DID is deactivated, it is intended that it never be repurposed.

When we frame it this way, we are begging for DIDs to be treated as PII.

We would do well to avoid that mistake.

DIDs enable demonstrable proof-of-control over an identifier without reliance on a third party. What that identifier is ABOUT is entirely a construct of the statements made about that DID and how those statements are interpreted by recipients. This is fundamental to language. A DID is a signal, which only has meaning in so far as the signaller intends AND the receiver understands. Note that this is NOT about the Controller: the Controller also doesn't get to decide what a DID is about. People using the DID do.

Consider this hypothetical. Consider did:joe:SuperThing, which starts out referring to a weekend project. That grows and becomes a business, a sole proprietorship. I later add a partner and did:joe:SuperThing now refers to a partnership. Later we turn the partnership into an LLC. Then, as we lay the groundwork for an IPO, it becomes a C corporation. In each of these stages, did:joe:SuperThing is a different, legally distinct entity, even though there is a sense in which the meaning of did:joe:SuperThing is consistent across that lifecycle: it's this Super Thing I created. But if you were to incorrectly assume that did:joe:SuperThing referred to any one of those specific legal entities, you wouldn't necessarily be wrong: it did refer to those specific legal entities at different times. You just have to use additional cues to figure out that did:joe:SuperThing is NOT actually the specific legal entity, but rather the conceptual notion of a project with a life of its own. THEN you have to apply that knowledge to interpret statements that may be made about that DID in different stages.

Just as HODL doesn't mean what its coiner intended.

Just as MTV no longer means what it used to.

Just as the People's House no longer means what it used to.

Just as "Karen" no longer means what it used to. Or Lincoln. Or Christ. Or ANY identifier that has any temporal staying power.

Semantic shift is a fact of human language. So, while some of us deeply crave the illusory certainty that a DID in fact refers to a specific Subject, the fact is that signals refer to whatever the signaller meant them to refer to, and then only when the recipient shares some notion of that same meaning.

What DIDs do allow is demonstration of proof of control, which can be used, in the context of true "secrets", as a form of identity assurance that the entity performing the proof-of-control is the same entity that it was the last time proof-of-control happened.

The nice thing is that this still works with did:joe:SuperThing. The semantic shift doesn't happen when you accept that all proof-of-control means is that the current party controls the proof secrets of did:joe:SuperThing. Which implies the current party is acting as did:joe:SuperThing, but that's it. It can't even be assumed that what was asserted as true of did:joe:SuperThing at some point in the past (perhaps embodied as a VC) applies to the current did:joe:SuperThing. Consider the business permit issued to the sole proprietorship using that DID. That permit does NOT apply to the partnership nor the LLC nor the corporation.

I'm boggled by why so many people are literally advocating for features that will make DIDs effectively unusable for individuals due to privacy concerns. Treating DIDs as identifying particular individuals and using the DID Document as a correlation point for information about individuals are problems we should be working to avoid, not "features" to defend.

dlongley commented 3 years ago

@jandrieu -- Could you put together a PR with some concrete text changes to the section of the spec you quoted to address the above issue? I think it would be helpful for the group to debate concrete changes to the particular problem you raised in a PR as you have raised good points. It would be good for us to try and separate that concern off from the rest of what is happening here, we may be able to reach consensus on it more quickly than the issue here, and it may help create a foundation for finding consensus here.

agropper commented 3 years ago

(Trying out the anti-statement strategy that seems to be working in the SDS authorization discussion).

PROPOSED: A DID is just for authentication and related control and security issues. A DID SHOULD not raise privacy issues.

The SHOULD means that if there is any way to avoid Service Endpoints in the DID Document we should do that. We know from the Glossary group work that there are at least a few service endpoints of interest, notification and authorization among them.

If it makes sense to decouple notification and authorization from the DID Document resolution process then maybe we should. That would mean that a DID controller would authenticate to some service provider, (e.g. a secure data store or the car rental company in https://github.com/w3c/did-use-cases/issues/101) and CRUD their notification and/or authorization service endpoint without changing the DID Document.

Is this reasonable? Will it solve our privacy issues? Will it help adoption of SSI?

msporny commented 3 years ago

If it makes sense to decouple notification and authorization from the DID Document resolution process then maybe we should. That would mean that a DID controller would authenticate to some service provider, (e.g. a secure data store or the car rental company in w3c/did-use-cases#101) and CRUD their notification and/or authorization service endpoint without changing the DID Document. Is this reasonable? Will it solve our privacy issues? Will it help adoption of SSI?

Yes, this is what some in the thread are arguing -- it is reasonable and will solve a variety of privacy issues and will help adoption of SSI.

agropper commented 3 years ago

I'm warming up to the idea.

Based on my proposed Alice Rents a Car use-case, Alice's agent might have a DID of its own. Her service providers (in this context, defined as anyone that has agreed to let her authenticate with a DID) would then post publicly what agent protocol they support (e.g. OAuth3 / GNAP) and allow Alice to either: a) register her agent service endpoint itself, or b) register the DID for her service endpoint. Either way, Alice would assume that any service provider (DMV, insurer, bank) asserting Gold Button would be compatible.

In case of a) the service provider would need to verify the capability that was being presented (by the rent-a-car service) was delegated by Alice's authentication DID. How do they do that?

In case b), the service provider needs to verify the association of Alice's authentication DID with Alice's agent DID in order to verify the capability being presented by the rent-a-car service. How do they do that?

agropper commented 3 years ago

Here's a sequence diagram that separates authentication from authorization. Alice creates a did:key for authentication with a bank service provider. She then registers a semi-autonomous payment agent that supports a mutually acceptable authorization protocol. Alice's agent has been pre-programmed with policies that say that any company in the Fortune 500 can be paid $200 or less automatically because Alice is sure she can get her money back in case of dispute.

Later, Alice registers with a rent-a-car service provider using a different did:key. Alice also registers her agent, the same one she registered with the bank. Alice's agent issues a capability to the rent-a-car company that results in payment by the bank and Alice receives a capability to access the car.

There are some privacy issues with this simple sequence. The bank gets to know that Alice is renting a car and Alice's agent endpoint is a correlation risk. However, Alice deems these to be acceptable under the circumstances. The bank's tracking could be mitigated if Alice's agent has access to (digital) cash. The agent correlation risk can be mitigated if Alice uses a mediator to hide her agent endpoint from the rent-a-car.

In the general case, the bank and the rent-a-car are just somebody's secure data stores.

Can we fit a (standard) authorization protocol to complement DID as a pure authentication method?

dhh1128 commented 3 years ago

@agropper and @msporny :

The SHOULD means that if there is any way to avoid Service Endpoints in the DID Document we should do that.

Here is where I diverge. I believe this statement of Adrian's makes a logical leap that is unwarranted. Yes, we should avoid privacy problems with DIDs. But it does not therefore follow that we should take Service Endpoints out of the DID document. Rather, it follows that we should: A) describe service endpoints in a way that preserves privacy, OR B) we should take them out. You are short-circuiting by ignoring the first (A) branch of the ORed statement.

Peer DIDs with service endpoints do not have a privacy problem. They take branch A.

Any DIDs with herd privacy endpoints do not have a privacy problem. They take branch A.

Manu seems to be arguing that there are equally good ways to communicate service endpoints outside a DID doc. I disagree. As far as I can tell, all the ways Manu has proposed so far lose the characteristic that I want, which is the ability to strongly associate a service endpoint value with a particular key state, updating them together or not at all, with a DB-transaction-like atomicity. I claim that without this, hackers can drive a truck through system security.
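
To make the atomicity property concrete, here is a minimal sketch, assuming a DID document version is modeled as one immutable value. `DidDocState` and `rotate` are hypothetical names, and the https-only rule is an assumption chosen just to give the update something to reject.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple


@dataclass(frozen=True)
class DidDocState:
    """One immutable version of a DID document (hypothetical, simplified)."""
    version: int
    verification_keys: Tuple[str, ...]
    service_endpoints: Tuple[str, ...]


def rotate(state: DidDocState,
           new_keys: Iterable[str],
           new_endpoints: Iterable[str]) -> DidDocState:
    """Return the next state with keys and endpoints replaced together.

    If either input is invalid this raises and the caller still holds the
    old state, so no version ever pairs new keys with old endpoints.
    """
    keys = tuple(new_keys)
    endpoints = tuple(new_endpoints)
    if not keys:
        raise ValueError("at least one verification key is required")
    # Assumption for illustration: only https endpoints are accepted.
    if any(not e.startswith("https://") for e in endpoints):
        raise ValueError("endpoints must use https")
    return replace(
        state,
        version=state.version + 1,
        verification_keys=keys,
        service_endpoints=endpoints,
    )
```

Because each state is frozen, a reader sees either the old key and endpoint pair or the new one, never a mixture of the two.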

OR13 commented 3 years ago

I agree with @dhh1128 ... I think that if we don't describe branch A, DID Method Authors will just make up their own way of doing it, which will not be standard, and might not address the security concerns raised by the group.

But I also agree with the SHOULD.... if you don't need an [insert arbitrary property] in a DID document... it SHOULD NOT be there.

agropper commented 3 years ago

@dhh1128 I totally agree with you. That's why I suggested that proponents of removing them "should" fill out the sequence diagram for how we associate a DID without a service endpoint, like did:key with either an authorization, notification, or mediation service.

dlongley commented 3 years ago

@dhh1128,

As far as I can tell, all the ways Manu has proposed so far lose the characteristic that I want, which is the ability to strongly associate a service endpoint value with a particular key state, updating them together or not at all, with a DB-transaction-like atomicity. I claim that without this, hackers can drive a truck through system security.

You seem to be suggesting that if some information X is not atomically bound with a particular key state via the DID Document then there is an insurmountable system security problem.

I've intentionally called this information X here instead of "service endpoint" to highlight that what you're arguing is that all Xs must be in the DID Document. Forget about using VCs, for example -- unless you stuff them into the DID Document. So, I disagree -- and I think, instead, that we've got fairly useless technology if everything we care about from a security perspective has to live in the DID Document. We have to support partitioning or this system will collapse under its own weight.

dlongley commented 3 years ago

It's also worth noting that the identity behind a particular DID is inexorably partitioned from the DID itself already.

dlongley commented 3 years ago

@agropper,

...how we associate a DID without a service endpoint, like did:key with either an authorization, notification, or mediation service.

For example, send someone this VC:

{
  "@context": ["https://www.w3.org/2018/credentials/v1", "some-context-that-defines-service-endpoint-terms"],
  "id": "urn:vc:12321345",
  "type": ["VerifiableCredential", "ServiceEndpointCredential"],
  "issuer": "did:key:z6mczx79123...4234",
  "issuanceDate": "2010-01-01T19:23:24Z",
  "expirationDate": "2010-02-01T19:23:24Z",
  "credentialSubject": {
    "id": "did:key:z6mczx79123...4234",
    "service": [{
      "id": "did:key:z6mczx79123...4234#service-x",
      "type": "NotificationService",
      "serviceEndpoint": "https://example.com/something"
    }]
  },
  "proof": {
    "proofPurpose": "assertionMethod",
    "verificationMethod": "did:key:z6mczx79123...4234#z6mczx79123...4234",
    "..."
  }
}

You can also send them a zCap to access the service endpoint at the same time:

{
  "@context": "https://w3id.org/security/v2",
  "id": "urn:uuid:14931-24982-23342-423-234342",
  "parentCapability": "https://example.com/zcaps/something",
  "invocationTarget": "https://example.com/something",
  "invoker": "did:key:zm239823432...35423523",
  "allowedAction": ["read"],
  "expires": "2010-01-02T19:23:24Z",
  "proof": {
    "proofPurpose": "capabilityDelegation",
    "verificationMethod": "did:key:z6mczx79123...4234#z6mczx79123...4234",
    "..."
  }
}
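
On the receiving side, a recipient of a self-issued endpoint credential might run structural checks like the sketch below before using the endpoint. This assumes the credential has already been parsed into a dictionary and that its proof is verified elsewhere; `parse_utc` and `usable_endpoints` are hypothetical helpers, not part of any VC library.

```python
from datetime import datetime, timezone


def parse_utc(value: str) -> datetime:
    """Parse a timestamp such as 2010-02-01T19:23:24Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def usable_endpoints(credential: dict, now: datetime) -> list:
    """Return serviceEndpoint values from a self-issued credential.

    Checks only structure and validity period. Proof verification is out
    of scope here and must succeed before the result is trusted.
    """
    subject = credential["credentialSubject"]
    if credential["issuer"] != subject["id"]:
        raise ValueError("credential is not self-issued by the subject DID")
    issued = parse_utc(credential["issuanceDate"])
    expires = parse_utc(credential["expirationDate"])
    if not issued <= now < expires:
        raise ValueError("credential is outside its validity period")
    endpoints = []
    for service in subject.get("service", []):
        # Each service id must be a fragment of the subject DID.
        if not service["id"].startswith(subject["id"] + "#"):
            raise ValueError("service id does not belong to the subject DID")
        endpoints.append(service["serviceEndpoint"])
    return endpoints
```

The parser is strict, so a timestamp with an out-of-range field is rejected with a `ValueError` instead of being silently accepted.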

OR13 commented 3 years ago

@dlongley that's nice, but I want to be able to crawl all the DIDs in the VDR and build a database correlating them to other websites, people, devices and data sets.... so your use of did:key and privacy-preserving approach to this problem gets in the way of my business model... can't we just mandate my ability to make money selling correlation data?

/s

dlongley commented 3 years ago

that's nice, but I want to be able to crawl all the DIDs in the VDR and build a database correlating them to other websites, people, devices and data sets.... so your use of did:key and privacy-preserving approach to this problem gets in the way of my business model... can't we just mandate my ability to make money selling correlation data?

Yes, this is a fun joke -- but I also don't want people to think we're being dismissive of their concerns (or lumping them into a group/use case where they don't belong or that they don't support). We're all listening here and trying to find the best way forward (including @OR13 who is an excellent collaborator).

jandrieu commented 3 years ago

@agropper wrote:

@dhh1128 I totally agree with you. That's why I suggested that proponents of removing them "should" fill out the sequence diagram for how we associate a DID without a service endpoint, like did:key with either an authorization, notification, or mediation service.

You simply sign a VC that states the endpoints and make that VC available to those you wish to use those endpoints.

There is no sequence diagram necessary.

From what I can tell, it seems like some of us have a hidden requirement to automatically be able to do all sorts of magic with DIDs, whether that's a directory service or a resource delivery mechanism. These efforts tend to violate the layered architecture that gives DIDs their privacy-enabling features.

Once you have a viable root authority from a DID Document, it is trivial to secure (in-band) any communications you might have with those acting on behalf of the Subject. In particular, you can secure the content independent of the communications channel.

If you want to look up someone's communication channels, use a directory, with appropriate controls for compliance and privacy.

I, for one, don't want every DID that gets created to advertise service endpoints. The only thing that does, IMO, is give data aggregators and bad actors a means to scrape data without my permission.

Automated discovery is the problem, not a feature.

You want to know how to reach me, ask me.

SmithSamuelM commented 3 years ago

@jandrieu There is an assumption being made that any DID Doc is discoverable by default. It's only discoverable if the DID method makes it discoverable. A perfectly good way to "tell" someone when they ask about one's data is to deliver a DID Doc to them. DID resolvers have no way of discovering a DID Doc unless the controller of the DID Doc publishes it to the DID resolver, or the DID method pulls it from a public verifiable data registry or ledger. DID resolvers should not cache DID Docs unless they have signed consent from the controller of the DID Doc or the DID method makes that consent implicit.

SmithSamuelM commented 3 years ago

This is a case where the right to be forgotten would be enforceable against someone hosting a DID resolver. The verifiable controller of a DID Doc could request that a DID resolver that inadvertently or maliciously cached the DID Doc without consent delete it. The resolver would be liable under GDPR for not deleting it upon request.
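
A resolver-side cache that follows this behavior might look like the sketch below. It is a minimal in-memory illustration; `ConsentAwareCache` and its methods are hypothetical, and both verifying the controller's signed consent and authenticating a deletion request are assumed to happen before these methods are called.

```python
from typing import Optional


class ConsentAwareCache:
    """In-memory DID Doc cache that stores entries only with consent."""

    def __init__(self) -> None:
        self._docs = {}

    def put(self, did: str, doc: dict, consent_verified: bool) -> bool:
        """Cache the doc only if the caller has verified controller consent."""
        if not consent_verified:
            return False
        self._docs[did] = doc
        return True

    def get(self, did: str) -> Optional[dict]:
        """Return the cached doc, or None when nothing is stored."""
        return self._docs.get(did)

    def forget(self, did: str) -> bool:
        """Honor a deletion request; True if an entry was removed."""
        return self._docs.pop(did, None) is not None
```

Defaulting to "do not store" means a missing consent check fails closed: the resolver simply resolves again instead of retaining a document it should not hold.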