w3c / controller-document

Controller Documents
https://w3c.github.io/controller-document/

privacy review / questions #93

Open npdoty opened 3 weeks ago

npdoty commented 3 weeks ago

We can split this up into additional issues, or if the team just wants to answer questions here, that might help me understand whether there are more specific privacy issues for the spec to address.


The specification is quite abstract, and I think it would help readers and reviewers to have some particular examples about how Controller Documents are intended to be used. The very abstract nature (any kind of data related to any kind of entity) makes it challenging to reason about things like privacy properties. Or if this is intended just for cryptographic key communication, that would be a helpful narrowing of the scope and make implementation/interoperability and privacy/security protection much more straightforward.

Pairwise identifiers is a good, important privacy practice. We don't often use that exact terminology on the Web, where we might talk about the scope of identifiers or connection to the concept of origins. Would it be useful to talk about origin-specific keys or the origin model here?

https://w3c.github.io/controller-document/#keep-personal-data-private recommends that no personal data be included in a Controller Document, but it's not clear that this is a requirement that will be satisfied. Cryptographic keys used by or about a person are certainly personal data.

Also, not a privacy question, but a question I had in trying to understand the use of these documents: what is the difference between id and controller?

msporny commented 1 week ago

I'm responding in my capacity as an Editor and not on behalf of the VCWG. We will try to review this response during W3C TPAC to see if the WG has consensus wrt. the suggestions below.

@npdoty wrote:

> The specification is quite abstract, and I think it would help readers and reviewers to have some particular examples about how Controller Documents are intended to be used.

Yes, agreed. We can try to point to the DID Use Cases (as many/all of them apply), or reframe them for this specification. Fundamentally, the use cases are the same except that now any URL can be used in a Controller Document where previously a DID had to be used in several places.

> The very abstract nature (any kind of data related to any kind of entity) makes it challenging to reason about things like privacy properties. Or if this is intended just for cryptographic key communication, that would be a helpful narrowing of the scope and make implementation/interoperability and privacy/security protection much more straightforward.

It is largely meant for cryptographic key communication. I say "largely" because one can extend the document to store other information, though what those extensions do is (clearly) out of scope. The main reason this specification exists is that the VC JOSE COSE specification wanted to use a lot of the functionality specified via DID Documents and Data Integrity, but without using DIDs or Data Integrity. We created this specification so they could use these features without having to use DIDs.

> Pairwise identifiers is a good, important privacy practice. We don't often use that exact terminology on the Web, where we might talk about the scope of identifiers or connection to the concept of origins. Would it be useful to talk about origin-specific keys or the origin model here?

We talk about it a bit in this section:

https://w3c.github.io/controller-document/#identifier-correlation-risks

We could expand that section to include language on "origin-specific keys" and the origin model. Would that work for PING?

> https://w3c.github.io/controller-document/#keep-personal-data-private recommends that no personal data be included in a Controller Document, but it's not clear that this is a requirement that will be satisfied. Cryptographic keys used by or about a person are certainly personal data.

What we were trying to convey were things like full name, home address, phone number, etc. That said, your point is valid. Perhaps we could speak to not including information that could be used to easily correlate you, such as full name or phone number? Or limit it to the bare minimum to achieve your communication goals for that controller document?

> Also, not a privacy question, but a question I had in trying to understand the use of these documents: what is the difference between id and controller?

We have a few open issues, namely #33 and #75, where we're trying to get more crisp with that language. Fundamentally, the id specifies the subject of the controller document... that is, the entity that the controller document is about. The controller field can be used to specify other entities that have the right to modify/update the document (which is useful in decentralized systems like blockchains, or systems that allow entities other than the subject to modify/update the controller document).
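To make the distinction concrete, here is a minimal sketch in Python that follows the reading given above: `id` names the subject, and `controller` names who may update the document. It is illustrative only. The URLs, the key value, the `may_update` helper, and the fallback to the subject when `controller` is absent are all assumptions made for this example, not text from the specification.

```python
# Hypothetical controller document, written as a Python dict for illustration.
# All URLs and the key value are placeholders, not real identifiers.
controller_document = {
    # The subject: the entity this document is about.
    "id": "https://subject.example/profile",
    # Other entities allowed to modify/update the document.
    "controller": ["https://admin.example/profile"],
    "verificationMethod": [
        {
            "id": "https://subject.example/profile#key-1",
            "type": "Multikey",
            "controller": "https://subject.example/profile",
            "publicKeyMultibase": "z6Mk...placeholder",
        }
    ],
}


def may_update(document: dict, requester: str) -> bool:
    """Return True if `requester` is listed as a controller of the document.

    Assumption for this sketch: when `controller` is absent, the subject
    (`id`) is treated as the only party allowed to update the document.
    """
    controllers = document.get("controller", [document["id"]])
    if isinstance(controllers, str):
        # Accept a single URL as well as a list of URLs.
        controllers = [controllers]
    return requester in controllers
```

Under these assumptions, `may_update(controller_document, "https://admin.example/profile")` is true while the same call with the subject's own URL is false, which is exactly the case where the two properties carry different information.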

I believe the above largely constitute editorial changes. We will raise PRs for those before entering Candidate Recommendation. Please let us know if you believe that there are further issues that would prevent a transition of this specification to the Candidate Recommendation phase.

jandrieu commented 4 days ago

> Fundamentally, the id specifies the subject of the controller document... that is, the entity that the controller document is about.

I would disagree with this characterization. I would say the "id" is the unique token for referring to a common subject by different parties, such that other specs, like VCs, can use the ID to refer to an entity who is in presumptive control of the controller document.

IMO, the controller document is about the ID, not about the referent of the ID.

This is a break with the open world data model that many in the work advocate. But the inability to use JSON-LD semantics to make statements about the identifier makes this assertion unusable in practice.

iherman commented 2 days ago

The issue was discussed in a meeting on 2024-09-27

View the transcript

#### 3.1. privacy review / questions (issue controller-document#93)

_See github issue [controller-document#93](https://github.com/w3c/controller-document/issues/93)._

**Brent Zundel:** the issue that was raised is issue 93.
… this is a response from Nick Doty of the privacy group, going to have it on the screen for folks to read through.
… this is the privacy review, as you read, think about what specific issues need to be raised in order to address the concerns that were brought up.
… additionally, what are we going to do to address those issues.
… for those of you just joining us, we are looking at issue 93, the PING review of the controller document spec.
… a question for the group is, what issues should be raised to track the concerns here, and what are we going to do about them.

**Manu Sporny:** I worked on an editor's response to PING, in general PING had good feedback, at a high level there was a concern about the specification being fairly generic, they did not understand the need for something like this. They said it was so abstract it was hard to understand use cases. The use cases here are similar to DID use cases, just with URLs instead of DIDs.
… Pointing to the DID use case could help. The review mentioned that it would be a lot of work to profile this document, and it's kind of hard to reason about the privacy properties of the document. We did mention that the document is largely about publishing crypto keys, there is a new PR that adds service descriptions to the document, but we.

> *Wesley Smith:* mention that the main reason controller document exists is because vc-jose-cose spec wanted something that did what DID documents did but without DIDs.

**Manu Sporny:** we mention that DID core WG is planning on using this document as well.
… Largely the response was that the privacy concerns are fairly limited based on the limited set of things that the document is supposed to be doing, as it is largely around crypto key communication, extensions largely out of scope.
… they mentioned they were interested in pairwise identifiers, might be good to couple that with W3C language around the origin model of the web, I noted that we do talk about it but could expand the section around identifier-based correlation risks to talk about the Web's origin model. This has been a regular request from groups to talk about the Web's origin based security/privacy model.
… They also noted that we say you shouldn't store personal data, but PING suggested that crypto keys are personal data, so a clarification that we meant name, address, etc, rather than crypto material.
… Finally, there was a question around the difference between "id" and "controller", we have 2 issues open to discuss that.
… A lot of the commentary was editorial in nature, I think that we're going to wait to hear from Nick or PING on exactly what he would like to see done.
… We can raise issues on additional language, modification of existing language, etc. I didn't see a request for a fundamental design change in here, will want to clarify that. I did ask about these being largely editorial changes and for PING to let us know if that is not the case.

**Manu Sporny:** Nick and the PING will look at this, do another round of comments, and either raise issues or we will raise issues on their behalf.

**Ivan Herman:** I must admit, when we started to work on this document, I really had difficulty getting my head around the naming. "Controller Document" does not fit what is there, as the emphasis of this document is to store references to crypto key material and metadata. That's all we are doing. I wonder whether we should rename the document to make it clear what the document is talking about. I know there is a PR on the service. We will come back to this.

**Brent Zundel:** A couple more minutes on this topic then will move to another issue.

**Joe Andrieu:** Advocating taking the time for bike shedding, at one point this was called something else, it was framing the conversation around something that should be separated into VCs, we already have the idea that controller name is problematic, but we haven't taken the time to fix it.

**Manu Sporny:** we have talked about renaming it before and failed to find a better name, I suggested that this was a resource on the web, everyone hated that.
… it is expressing key information but could be expressing anything else.

**Ivan Herman:** let's not go there.

**Manu Sporny:** just saying we had that discussion, in the DID WG we discussed how it did not have service descriptions in it, based on a request by that WG I raised a PR to add service descriptions into controller document, no longer just a bag of keys, now more general tool to engage with the subject.

**Michael Jones:** yeah, we have tried to change the name before, I am repeating Manu, but nobody has come up with a better name, people know this name, it is a done deal.

**Brent Zundel:** moving to issue 94.

iherman commented 2 days ago

The issue was discussed in a meeting on 2024-09-27

View the transcript

#### 3.5. privacy review / questions (issue controller-document#93)

_See github issue [controller-document#93](https://github.com/w3c/controller-document/issues/93)._

**Manu Sporny:** I think these are largely editorial, the things that have specific things we could write are editorial, the other things we need PING to respond on, I didn't see any massive design changes we need to make. In some cases they have the same questions we do, e.g. difference between subject and controller.

**Brent Zundel:** my recommendation is to reframe them, instead of saying "is this right", say "here is our interpretation of what you said and how we are moving forward, let us know if that is incorrect".
… we are operating in good faith and making the best assumptions we can, where we absolutely need a response I can reach out as chair to PING.
… for the most part we have a decent idea of what we are looking for.
… anything else on 93?

**Michael Jones:** sounds like a plan.

**Brent Zundel:** next topic, how close are we to CR.