w3c / vc-data-model

W3C Verifiable Credentials v2.0 Specification
https://w3c.github.io/vc-data-model/

Multiple subjects in a single credential #55

Closed msporny closed 6 years ago

msporny commented 7 years ago

Technically, the data model supports a single credential making claims about multiple subjects:

{
  "@context": ["http://schema.org/", "https://w3id.org/identity/v1"],
  "id": "http://example.gov/credentials/3732",
  "type": ["Credential", "MultiSubjectCredential"],
  "issuer": "https://multi.example.org",
  "issued": "2017-06-05",
  "claim": [{
    "id": "did:v1:9f52df2e-7c96-42af-8dfd-1099980f8467",
    "ageOver": 18
  }, {
    "id": "did:v1:7d866a9f-c5a3-41f5-8ae1-0297b7849801",
    "ageOver": 21    
  }, {
    "id": "did:v1:c4e13c30-31dc-40df-867f-ed678d87ac54",
    "ageOver": 65   
  }],
  "signature": {
    "type": "LinkedDataSignature2015",
    "created": "2017-06-05T07:08:36Z",
    "creator": "https://example.com/jdoe/keys/1",
    "signatureValue": "lQ7VZUeAA...Z5Lk="
  }
}

While this may be a corner case, the data model allows for it. Should we change the definition of entity credential and entity profile from being about "a subject" to "usually about the same subject" or equivalent language?
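
To make the example concrete, here is a minimal sketch that lists the distinct identifiers found in the `claim` array of a credential shaped like the one above. The helper name `distinctClaimIds` is hypothetical and not part of any spec. It counts identifiers only; nothing in the data says whether two identifiers name the same entity.

```javascript
// Hypothetical helper: collect the distinct `id` values found in `claim`.
// `claim` may be a single object or an array, as in the example credential.
function distinctClaimIds(credential) {
  const claims = Array.isArray(credential.claim)
    ? credential.claim
    : [credential.claim];
  const ids = claims
    .filter((c) => c && typeof c.id === 'string')
    .map((c) => c.id);
  return [...new Set(ids)];
}

const multiSubjectExample = {
  type: ['Credential', 'MultiSubjectCredential'],
  claim: [
    { id: 'did:v1:9f52df2e-7c96-42af-8dfd-1099980f8467', ageOver: 18 },
    { id: 'did:v1:7d866a9f-c5a3-41f5-8ae1-0297b7849801', ageOver: 21 },
    { id: 'did:v1:c4e13c30-31dc-40df-867f-ed678d87ac54', ageOver: 65 },
  ],
};

// Three identifiers, which is not necessarily three subjects.
console.log(distinctClaimIds(multiSubjectExample).length); // 3
```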

jandrieu commented 7 years ago

Interesting.

I look at that JSON-LD and I'm wondering how anyone knows those DIDs refer to different subjects. They could all be the same person. And since the statements in each claim aren't actually about the DID, but about the entity referred to by the DID, I think that's more than a semantic distinction.

The only way we could presume that those DIDs referred to the same subject is if that was the rule, which seems like an unnecessary disclosure.

That suggests it would be a mistake to limit a credential to a single subject. However, using "usually" or "typically" is fine to anchor that most of the time it is.

David-Chadwick commented 7 years ago

The data model should definitely make a strong statement that "a credential must contain a set of claims about the same subject". Otherwise it is not a subject profile - it is a group profile. This does not mean that the IDs of the claims must be the same, although that is the simplest way of ensuring this. A subject could have multiple IDs, as in your example above, but then she must be able to prove to the verifier that she is the owner of all of the IDs.

RieksJ commented 7 years ago

I agree with @David-Chadwick. I would even say that a credential must contain a single (root-)subject-identifier that identifies the (single!) subject in the context of the issuer. If the need were to arise to use other subject-identifiers (which might identify the same, or other entities), then these can be linked to the root-subject(identifier) by means of a (semantically well-defined) predicate.

msporny commented 7 years ago

@David-Chadwick wrote:

The data model should definitely make a strong statement that "a credential must contain a set of claims about the same subject"

-1 ... seems like an arbitrary restriction. The verifiable credential data model doesn't make this restriction (it's capable of containing a set of claims about different subjects). So, why would we want to arbitrarily restrict the data model from doing so? Keep in mind that every time we restrict the data model in this way, we forever cut off a set of potentially useful corner use cases.

@David-Chadwick wrote:

subject profile / group profile

That concept doesn't exist in the spec and I'd argue against introducing the terminology (as it's a corner case and would most likely confuse more than clarify). What we have in the spec right now is "verifiable profile".

@RieksJ wrote:

I would even say that a credential must contain a single (root-)subject-identifier that identifies the (single!) subject in the context of the issuer.

-1, again - why are we arbitrarily drawing the line there? What technical problem is created where we need to limit the data model in this way?

David-Chadwick commented 7 years ago

Manu, I would like to ask you, why would you want a single VC to contain the claims of multiple subjects? Why isn't sending a set of VCs, in which each VC refers to a single subject, good enough for you? In what ways does the latter not solve what you want to solve? (Actually I don't know what problem you are trying to solve by having multiple subjects inside a single VC. One example might be a club publishing its membership list in a single VC. But I could argue that this isn't really a self-sovereign credential.)

The problem I see is that if the VC contains a set of different IDs and it is not mandated that all the IDs must belong to the same subject, then the IDs may obviously belong to different subjects. And then proof of possession, or right to assert, or to say that these claims are about me, becomes much more difficult to prove. And if the claims are about other people we then get a bunch of data protection issues as well. What right do I have to assert the claims of other people? Did they give me their consent? How can the inspector know that I have the subject's permission, etc.?

RieksJ commented 7 years ago

Here are some more arguments:

  1. The example that @msporny gives is ambiguous. I assume that "type": ["Credential", "MultiSubjectCredential"] means that the number of subjects is more than one. However, subjects may be identified with multiple DIDs. Therefore, it cannot be determined whether this credential is about two or three distinct subjects.

  2. But even if that were resolved, consider a credential with claims referring to different subjects. What would happen if the claim(s) pertaining to one subject had to be revoked. Would that revoke the claims of the other subject as well (as the signature is about ALL claims)? I can imagine situations where this has an adverse effect on the reputation of the subject whose claims remain valid but are nevertheless revoked.

  3. If holders can store claims in their wallets/agents, to which holder(s) would issuers provide such claims? Does that align with the privacy principles we want to have?

  4. How can an inspector/verifier that receives such a claim from a user's wallet determine which subject it should associate with which claim (which it needs to establish which of the claims it should use)?

  5. If the claims in a single credential are independent, then it is just as easy to create two credentials, one for each subject, thereby not bothering programmers/business people/users to think about the aforementioned issues. If the claims in a single credential are dependent, then this dependency should be specified by means of a predicate/property.
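
The alternative in point 5, one credential per subject, can be sketched as follows. This is a sketch under assumptions, not a defined procedure: the helper `splitByClaim`, the `#claim-N` id scheme, and the `did:example:` identifiers are all hypothetical. Because the original signature covers all claims (point 2), it is dropped, and each resulting credential would have to be signed again by the issuer.

```javascript
// Hypothetical helper: turn one multi-claim credential into one unsigned
// credential per claim. The original signature is deliberately omitted,
// since it was computed over ALL claims and no longer applies.
function splitByClaim(credential) {
  const { claim, signature, id, ...shared } = credential;
  const claims = Array.isArray(claim) ? claim : [claim];
  return claims.map((c, index) => ({
    ...shared,
    // Assumption: derive a new credential id by appending a fragment.
    id: `${id}#claim-${index + 1}`,
    claim: c,
  }));
}

const parts = splitByClaim({
  id: 'http://example.gov/credentials/3732',
  issuer: 'https://multi.example.org',
  claim: [
    { id: 'did:example:a', ageOver: 18 },
    { id: 'did:example:b', ageOver: 21 },
  ],
  signature: { type: 'LinkedDataSignature2015' },
});

console.log(parts.length); // 2
console.log('signature' in parts[0]); // false
```

With separate credentials, revoking one subject's claims no longer touches the other subject's credential, at the cost of one signature per subject.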

I would like to see that we specify that if a credential contains multiple claims (of which subject identifiers may differ), the signature over the credential should mean that

ChristopherA commented 7 years ago

Just off my head are a number of cases where multiple subjects might be useful.

— Christopher Allen

RieksJ commented 7 years ago

@ChristopherA, you are right to say that there are several (I would say: numerous) cases where a claim or credential would involve multiple subjects. However, every such case can be accommodated by selecting a single subject(id), and stuffing the relations that it has with the 'other subjects' in properties (and nodes) of the JSON-LD graph that the claim/credential can associate its (single) subject(id) with.

In your car-example, the issuer could then provide a claim/credential to the car, with the car(id) as subject, and a JSON-LD graph car->me->privilege(s), and/or car->privilege(s)->me. The issuer could also provide claims/credentials to me, with a JSON-LD graph: me->car(s)->privilege(s) and/or me->privilege(s)->car(s).

But in any case, the root of the JSON-LD graph, i.e. the subject(id) of the claim/credential should be a single entity.

msporny commented 7 years ago

every such case can be accommodated by selecting a single subject(id), and stuffing the relations that it has with the 'other subjects' in properties (and nodes) of the JSON-LD graph that the claim/credential can associate its (single) subject(id) with.

False, there are cases where there are no such relations between the subject and other subjects (or where creating one would only be done because you need to create a relationship to address the use case).

Example: These 4 devices are in a geolocated area at lat and long with a radius of 100 meters. There is no single subject identifier that links those 4 devices.

RieksJ commented 7 years ago

In that case I suggest the issuer create two credentials for the reasons I mentioned earlier.

msporny commented 7 years ago

In that case I suggest the issuer create two credentials for the reasons I mentioned earlier.

... and if there are 100 subjects? 100 different credentials? Why are we attempting to arbitrarily restrict the data model in this way? It could be a best practice, but even that I'm a bit concerned about.

dlongley commented 7 years ago

@RieksJ,

I don't think there's any harm in permitting multiple subjects in the data model and we can't possibly enumerate all possible uses of VCs. Therefore, it seems a simple solution would be to say that there's no restriction in the data model, but some systems may require a single root subject in the VCs they are willing to accept.

Similarly, a best practices document could indicate how to avoid using multiple subjects for some broad set of use cases.
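
The idea that a particular system may require a single root subject, while the data model itself stays unrestricted, could be sketched as a verifier-side check. The function name and the acceptance rule are assumptions for illustration; the data model defines neither.

```javascript
// Hypothetical verifier-side policy, not part of the data model:
// accept a credential only if every claim carries the same identifier.
function hasSingleRootSubject(credential) {
  const claims = [].concat(credential.claim || []);
  if (claims.length === 0) return false;
  if (!claims.every((c) => c && typeof c.id === 'string')) return false;
  return new Set(claims.map((c) => c.id)).size === 1;
}

console.log(hasSingleRootSubject({ claim: { id: 'did:example:a', ageOver: 18 } })); // true
console.log(
  hasSingleRootSubject({
    claim: [{ id: 'did:example:a' }, { id: 'did:example:b' }],
  })
); // false
```

A system with this policy rejects the multi-subject example from the start of the thread, while a system without it can still accept that credential.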

RieksJ commented 7 years ago

@msporny: for the 5 reasons I mentioned 2 days ago. But if you think they are invalid, it's OK with me to let it happen, and see how things work out.

msporny commented 7 years ago

The example that @msporny gives is ambiguous. I assume that "type": ["Credential", "MultiSubjectCredential"] means that the number of subjects is more than one. However, subjects may be identified with multiple DIDs. Therefore, it cannot be determined whether this credential is about two or three distinct subjects.

Yes, but what's the use case where it's important whether a credential is about two or three distinct subjects? To put it another way, the enforcement of that is at a higher policy level, not at the data model layer.

But even if that were resolved, consider a credential with claims referring to different subjects. What would happen if the claim(s) pertaining to one subject had to be revoked. Would that revoke the claims of the other subject as well (as the signature is about ALL claims)? I can imagine situations where this has an adverse effect on the reputation of the subject whose claims remain valid but are nevertheless revoked.

Yes, all claims would be revoked about all subjects because credentials are typically either valid or not... but again, this is a policy decision, not a data model decision.

If holders can store claims in their wallets/agents, to which holder(s) would issuers provide such claims? Does that align with the privacy principles we want to have?

Yes, issuers will provide claims to holders that are not about them. For example, the medication for my pet is about my pet, but I hold on to it.

How can an inspector/verifier that receives such a claim from a user's wallet determine which subject it should associate with which claim (which it needs to establish which of the claims it should use)?

This is a part of the protocol (proof of possession), which is higher level than the data model.

If the claims in a single credential are independent, then it is just as easy to create two credentials, one for each subject, thereby not bothering programmers/business people/users to think about the aforementioned issues. If the claims in a single credential are dependent, then this dependency should be specified by means of a predicate/property.

There are cases where a single-subject credential doesn't work (there is no relationship between A and B, but the issuer would like to bundle them anyway, e.g. the IoT use case above with 100s of geolocated devices).

msporny commented 7 years ago

if you think they are invalid, it's OK with me to let it happen, and see how things work out.

I'm not saying what you are asserting is invalid. I'm saying that those decisions should be made at a higher protocol or policy layer, not at the data model layer.

David-Chadwick commented 7 years ago

The best example of a multi-subject VC that I know is a marriage certificate. This has two equal subjects. So if we are to allow for multi-subject VCs, then I would suggest that the data model has a specific section on this topic, and there is a specific keyword to indicate this, which is set by the issuer. This field tells the verifier the difference between a multi-subject VC and a single subject VC where the subject has multiple IDs. I suggest we introduce the keyword "singlesubject" whose default value is 'true' and is missing from the majority of VCs. The keyword must be present in multi-subject VCs with the value set to 'false', e.g.

"type": "Credential",
"singlesubject": "false",
"issuer": "https://multi.example.org",

I don't think it is sufficient to rely on an optional qualification of the type field as in Manu's original example.
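
How a verifier might read the proposed keyword can be sketched as below. "singlesubject" is only a proposal in this thread and is not in the spec; the function name is hypothetical. The sketch assumes the default described above (absent means true) and treats both the string "false" and the boolean false as false, since the example writes the value as a string.

```javascript
// Sketch of reading the PROPOSED "singlesubject" keyword (not in the spec).
// Absent means single subject; only an explicit false marks multi-subject.
function isSingleSubject(credential) {
  if (!('singlesubject' in credential)) return true;
  const value = credential.singlesubject;
  return !(value === false || value === 'false');
}

console.log(isSingleSubject({ type: 'Credential' })); // true
console.log(isSingleSubject({ type: 'Credential', singlesubject: 'false' })); // false
```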

ChristopherA commented 7 years ago

I think @David-Chadwick's example of a married couple is the best example I've seen so far, especially as the "unordered" significance of the list of subjects is important. There are also web-of-trust assertions like this, for instance according to http://schema.org/Person the "knows" attribute should be bi-directional, so two subjects need to be in a single claim. Doing "knows" as two claims forces the higher level layer to find the other in order to validate the bi-directionality. (And yes, I know that "knows" commonly isn't used that way, but that was its original intent in the old FOAF days where this value was inherited from. "follows" is the unidirectional association.)

Still not sure that we need to require "multisubject=true" or "singlesubject=false" in such cases.

David-Chadwick commented 6 years ago

The latest specification (30 Jan 2018) is somewhat ambiguous about this topic, and ambiguity is not a good property of a standard as it leads to different interpretations and different implementations.

Some text supports a single subject and some text supports multiple subjects. Specifically, text about a credential supports a single subject, as in section 2 (credential - A set of one or more claims made by the same entity about a subject) and section 6.4 (The claim subject identifier must match expectations.) These are quite unambiguous. There can only be a single subject ID in a claim, not multiple IDs, and all subject IDs must be about the same subject (though the IDs do not need to be identical).

Profile text implies there can be multiple subjects, as in section 2 (profile - A set of one or more credentials typically related to the same subject.)

Conclusion. Remove ambiguity and delete the word "typically" from the definition of profile.

David-Chadwick commented 6 years ago

It is really important to nail this one down, and I don't think we have so far. Can I suggest this be added as a topic of discussion at one of the WG weekly telecons?

RieksJ commented 6 years ago

I agree with @David-Chadwick. See https://github.com/w3c/vc-data-model/issues/80#issuecomment-366884497.

talltree commented 6 years ago

What about a birth certificate? Doesn't it describe multiple subjects (baby, mother, father, doctor)?

I seem to be coming across many examples of credentials that have multiple subjects. Each credential has exactly one holder (of any particular instance of the credential), but some of them have more than one subject.

=Drummond

RieksJ commented 6 years ago

Yes it does. As does a marriage certificate.

There is this though: representing such a certificate in a credential requires one of its subjects to be selected as the root subject of the claim(tree) in the credential. If the baby-subject were selected, you would have

<credential>
  babyid 'has father' fatherid
  babyid 'has mother' motherid
</credential>

If the father were selected, you would have

<credential>
  fatherid 'is father of' babyid
                          babyid 'has mother' motherid
</credential>

Note that in the latter credential, the statement about the mother is not necessarily a separate claim, but can link from the babyid as specified in the statement 'is father of', in which case the credential still has a single root subject (fatherid).

This shows that it is possible to have multiple subjects in a claim, but that this in itself is an insufficient argument for requiring different root-subjects in a credential to refer to different real-world entities.
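
The father-rooted form above can be sketched as a nested object with one root node, plus a walk that collects every identifier reachable from that root. The ids and the property names `isFatherOf` and `hasMother` are placeholders, not terms from any vocabulary, and the walk assumes a plain tree without cycles.

```javascript
// Birth certificate rooted at the father; ids and property names are placeholders.
const fatherRooted = {
  id: 'urn:fatherId',
  isFatherOf: {
    id: 'urn:babyId',
    hasMother: { id: 'urn:motherId' },
  },
};

// Collect every `id` reachable from the root node, root first.
// Assumes a tree (no cycles), as in the credential sketches above.
function reachableIds(node, found = []) {
  if (node === null || typeof node !== 'object') return found;
  if (typeof node.id === 'string') found.push(node.id);
  for (const [key, value] of Object.entries(node)) {
    if (key !== 'id') reachableIds(value, found);
  }
  return found;
}

console.log(reachableIds(fatherRooted));
// [ 'urn:fatherId', 'urn:babyId', 'urn:motherId' ]
```

The credential has a single root (the first id returned) even though three entities appear in it.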

I would say that a strong argument for having different root-subjects in a credential can be made based on knowledge theory, but that a prerequisite for this is agreement about the ecosystem model. I've written down some ponderings on that in a separate document Ponderings on a SSIF.pdf, on which I solicit feedback.

David-Chadwick commented 6 years ago

@talltree. I disagree with you about a birth certificate. This is surely the credential of the baby, and this person will use it throughout their life to claim certain other privileges. The properties of this credential are name, date of birth, father, mother etc. There is nothing to stop this credential being held by different people: father, mother, child etc. The holder can be identified in the ID of the profile. But the credential subject remains the same in all.

talltree commented 6 years ago

David, while I agree that you can create a verifiable credential for a birth certificate that has a single subject, in the discussions we at Evernym have been having with the Illinois Blockchain Initiative about our pilot project to deliver digital birth certificates in the State of Illinois, they are preferring to issue a digital birth certificate where the child, the mother, and the father are all subjects.

In our discussions with them, they have also brought up marriage certificates, where by definition there is more than one subject (at least more than one subject who is a person).

The more we at Evernym work with verifiable claims, the more we are seeing multi-subject use cases, so I hope that will not be precluded by the WG.

=Drummond

David-Chadwick commented 6 years ago

Hi Drummond, I think it is more important that the data model clearly and unambiguously specifies which types of credentials and profiles it does support (i.e. single/multi subject credential, subject EQ or NE Holder, single/multi subject profile) and enables the verifier to be told what type it is receiving, rather than, as in the current document, being vague and/or ambiguous and/or contradictory about these issues. I think a lot of the current outstanding issues are more to do with reader A inferring one interpretation whilst reader B infers the opposite one. If the data model specification nails down clearly how each type is indicated, this will remove any ambiguity, and then people can more easily say whether we should remove this feature, or add this other feature.

Personally I am less concerned about whether a VC can contain multiple subjects or not, than whether the verifier is told by the issuer whether it does or not. The fact that the IDs in two claims can be different does not tell me that. So how is the verifier supposed to know? Especially if the holder is not (any of) the subject(s), then you just have a bunch of potentially anonymous IDs.

jandrieu commented 6 years ago

I mostly agree with David on this, although I split the difference about the issuer stating the subject explicitly in any manner other than they already are.

Rather, I think what we have been using for the definition of "subject" is flawed. The idea that somehow a group of claims needs to be assigned to one or more "subjects" is spurious. On a birth certificate, the child, the mother, and the father can all trivially be subjects of specific claims in the credential. As might be the attending physician and other sundry witnesses.

That said, the profile needs to allow the holder to specify the relationship between the holder and subjects in the credential. Given a profile and its embedded credentials, it should be immediately apparent to the verifier how to interpret what the holder is claiming wrt the credentials.

Fortunately, this is almost ALREADY supported by the data model. All the holder has to do is issue a credential claiming the relationship(s) and include that credential in the profile.

I believe this will address all of the challenging use cases David has brought up.

Consider that a birth certificate with three subjects can be presented by any of those subjects (as holder). By adding relationship claims, the holder can unambiguously assert a statement such as "I am the child in that certificate". This can be as simple as claiming the id of the profile is the same person as the child id in the certificate. Either parent could present analogous statements based on that same VC.

It also addresses the need to unambiguously state that the holder is the same person as the multiple distinct IDs in credentials from two different issuers. The holder simply includes a claim to that effect.

Finally, it also allows unambiguous statements relating ANY of the credentials IDs.

Consider an individual wishing to claim US citizenship. They have their birth certificate (they were born in Kenya) and their mother's US passport. These two, independent credentials are bundled together in a profile. What we desire is for the data model to allow an explicit statement that the mother in the birth certificate is the same individual as the citizen in the US passport. The verifier would then be able to understand the asserted relationship and would be able to inspect the related information in each credential to see if they match, for example, checking that the names on the two credentials are the same.

If we want to make this truly a sticky wicket: the woman was unwed at birth and married later, so that the passport has a different last name. Then a verifier would likely want to see a marriage certificate, which could provide evidence of the name change. All three of these Verifiable Credentials (birth cert, marriage cert, and passport) need to be tied together so the verifier knows what is being claimed by the holder. We can do that today, but we don't have a mechanism to state the relationship between them.

If we let go of the idea that a credential is about some sort of "subject" distinct from the subjects in the claims, then we can address David's desired rigor by making sure the holder can self-assert their own credential correlating identifiers and relationships.

I like this sticky wicket of claiming citizenship as it highlights the ephemeral nature of real-world identifiers alongside multiple inflections of subject != holder. It also makes it clear that the notion of a simple "relationship" property isn't going to be enough.

The only trick, which I think is solvable, is that we need the holder's claim to be able to refer to a specific identifier in a specific included claim. I don't think that the current data model allows us to say: ID "X" in VC "Y" in this profile, and I couldn't find any language requiring that the identifiers used in claims be globally unique. My understanding is that issuers can use any identifier they want.

@msporny or @dlongley, other than simply sticking all these VCs into a profile or adding a self-issued credential as I propose, is there something I'm missing from the current data model that would allow the holder to assert (a) they are the child on the birth cert (b) the mother on the birth cert is the US citizen in the passport, and (c) the mother is the bride in the marriage cert?

David-Chadwick commented 6 years ago

Joe, I really like your post :-)

Concerning the global uniqueness of IDs: if issuers can use any identifier they want, then how is an issuer identified globally? And how does a verifier disambiguate between two issuers who possibly have the same ID? Won't the distributed ledger ensure that no two clients can have the same ID?

Concerning the issuer making a statement about the subject: what I am proposing is that a parameter (call it 'multipleSubjects') must be added to any credential that contains different subject IDs that the issuer knows do not belong to the same subject. It stays absent otherwise, so credentials with multiple claims about the same subject would not change from the current data model. A marriage or birth certificate would contain this property if it were structured with multiple subjects. But if it were structured with a single subject and claims about the other ones (e.g. father is X, spouse is C), then multipleSubjects would not be present.

dlongley commented 6 years ago

@jandrieu,

Thumbs up to your post.

other than simply sticking all these VCs into a profile or adding a self-issued credential as I propose, is there something I'm missing from the current data model that would allow the holder to assert (a) they are the child on the birth cert (b) the mother on the birth cert is the US citizen in the passport, and (c) the mother is the bride in the marriage cert?

I think what you've proposed is a good approach but since you asked I will offer this as well so that everyone is aware of other possibilities:

The data model is designed in a composable way. Here this means that you could think of a Profile as containing the same kind of information that could appear as the RDF object of a claim in a VerifiableCredential. In other words, a Profile contains any set of relationships about an RDF subject -- which in this case, is the Holder. We expect, in protocol, the Holder to present the Profile to a Verifier as, effectively, their own claim. That claim (a Profile here) includes their own identifier, the verifiable credentials they claim to hold, whatever statements they want to make about themselves, and their own cryptographic proof on all of that.

// a Profile
{
  '@context': 'https://w3id.org/vc/v1',
  id: 'urn:holderId',
  // holder asserts they are the child in the birth certificate
  sameAs: 'urn:childId',
  mother: {
    // holder asserts their mother is the same as the one
    // in the birth certificate
    id: 'urn:motherId',
    // holder asserts the mother is the bride and the US citizen
    // in the marriage certificate and the US citizenship ID, respectively
    sameAs: ['urn:brideId', 'urn:citizenId']
  },
  verifiableCredential: [{
    // birth certificate
    id: 'urn:birthCertificateId',
    issuer: 'urn:issuerA',
    claim: {
      id: 'urn:childId',
      mother: 'urn:motherId',
      father: 'urn:fatherId',
      // ...
    }
  }, {
    // marriage certificate
    id: 'urn:marriageCertificateId',
    issuer: 'urn:issuerB',
    claim: {
      id: 'urn:fatherId',
      bride: 'urn:brideId'
      // ...
    }
  }, {
    // citizenship ID
    id: 'urn:citizenId',
    issuer: 'urn:issuerC',
    claim: {
      id: 'urn:citizenId',
      citizenOf: 'US'
    }
  }],
  proof: {
    type: 'Ed25519Signature2018',
    // ...
  }
}

Now, none of this says anything about why the Verifier would trust the Holder's assertions. And this would likely be done somewhat differently with attribute-based-credentials, where the identifiers are not revealed, but rather a zero-knowledge proof of possession of a correlation secret would be presented. So there's more to consider with that technology/privacy approach because this use case may be ill-suited for it. There's an implicit assumption, I believe, under that approach, that the holder and subject are more intricately linked in ways that may make sharing like this more complex.
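
One mechanical check a verifier could run on a Profile shaped like the example above is whether every identifier named in the holder's `sameAs` statements actually occurs in one of the embedded credentials. This is a sketch under assumptions: the function names are hypothetical, the Profile shape is taken from the example, and the check says nothing about whether the holder's assertions should be trusted.

```javascript
// Gather every string value found inside a claim (ids and id-valued properties).
function stringsIn(value, out = new Set()) {
  if (typeof value === 'string') out.add(value);
  else if (value && typeof value === 'object') {
    Object.values(value).forEach((v) => stringsIn(v, out));
  }
  return out;
}

// Hypothetical check: which identifiers named in the holder's `sameAs`
// statements do not occur in any embedded credential's claim?
function unresolvedSameAs(profile) {
  const known = new Set();
  for (const vc of profile.verifiableCredential || []) {
    stringsIn(vc.claim, known);
  }
  const referenced = [];
  const visit = (node) => {
    if (!node || typeof node !== 'object') return;
    for (const [key, value] of Object.entries(node)) {
      if (key === 'verifiableCredential' || key === 'proof') continue;
      if (key === 'sameAs') referenced.push(...[].concat(value));
      else visit(value);
    }
  };
  visit(profile);
  return referenced.filter((id) => !known.has(id));
}

// Reduced version of the Profile above with the citizenship credential left out.
const reducedProfile = {
  id: 'urn:holderId',
  sameAs: 'urn:childId',
  mother: { id: 'urn:motherId', sameAs: ['urn:brideId', 'urn:citizenId'] },
  verifiableCredential: [
    { claim: { id: 'urn:childId', mother: 'urn:motherId' } },
    { claim: { id: 'urn:fatherId', bride: 'urn:brideId' } },
  ],
};

console.log(unresolvedSameAs(reducedProfile)); // [ 'urn:citizenId' ]
```

An empty result only means the references line up; it is a consistency check on the document, not evidence for the claimed relationships.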

jandrieu commented 6 years ago

@David-Chadwick regarding identifiers, I agree that the id of an issuer must be globally unique, but I didn't see that stated in the model. Maybe I missed it.

However, I don't think there's been a strong assertion that other identifiers, such as for credentials and subjects, are independently globally unique. Again, maybe I missed it in the model, but I expect it is an undocumented expectation. It is possible that the identifiers need only be unique within the surrounding context, e.g., unique within the issuer's namespace, implicitly allowing "local" ids that are scoped by the containing context with unambiguous expansion to a globally unique id.
For terseness, I would favor a syntax that allows both locally scoped and globally unique identifiers, but I expect the JSON-LD folks already have some thoughts on this. @msporny @dlongley ?
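
As a rough sketch of what such an expansion could look like: an identifier that already has a URI scheme is left alone, and anything else is scoped by the issuer's identifier. The `#` fragment convention and the name `expandId` are assumptions made for illustration and are not defined by the data model; JSON-LD's own relative IRI resolution against a base IRI is the more likely mechanism in practice.

```javascript
// Assumed convention (NOT defined by the data model): an identifier with a
// URI scheme such as "did:" or "https:" is already global; anything else is
// local and is scoped by the issuer's identifier using a "#" fragment.
function expandId(issuerId, id) {
  const hasScheme = /^[A-Za-z][A-Za-z0-9+.-]*:/.test(id);
  return hasScheme ? id : `${issuerId}#${id}`;
}

console.log(expandId('https://multi.example.org', 'child-1'));
// https://multi.example.org#child-1
console.log(expandId('https://multi.example.org', 'did:v1:9f52df2e-7c96-42af-8dfd-1099980f8467'));
// did:v1:9f52df2e-7c96-42af-8dfd-1099980f8467
```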

FWIW, what I'm describing could also be implemented with Linked Local Names, a concept that I believe already has 3 submissions for the upcoming RWOT. @ChristopherA might comment on that.

RieksJ commented 6 years ago

While I can live with the notion of 'globally unique identifier' because of the practical use that it has, in theory there is no such thing. Every identifier, even a 'globally unique' one, comes with a scope within which this uniqueness is characteristic. After all, it is possible for me to pick any (existing) URI/URN and decide that for me, it will refer to the computer I'm typing this message on. Anyone that doesn't take scoping/contexts into account is then left with an ambiguous, non-unique identifier.

It would be good to address this explicitly in the data model. The profile example of @dlongley could be annotated to describe for each identifier what the scopes are within which it can be dereferenced. I expect to see the need of some 'common scope' (the 'global' scope?) that allows us to exchange identifiers between parties in such a way that each is capable of dereferencing it to the same entity. And as other scopes are being made explicit, I expect to see a clear description of the issues regarding identifier interpretation and ways to resolve them.

jandrieu commented 6 years ago

@dlongley In your birth certificate example, the id of the profile is the id of the holder:

{
  '@context': 'https://w3id.org/vc/v1',
  id: 'urn:holderId',
  // holder asserts they are the child in the birth certificate
  sameAs: 'urn:childId',

So... what is the ID of the profile? How do I refer to the profile canonically, instead of referring to the holder ID?

dlongley commented 6 years ago

@jandrieu,

The "profile" is just some statements about the holder. We expect profiles to be fairly ephemeral documents that just make use of the "default graph". But I suppose if someone wanted to reference a particular profile they'd use the "graph" position in an RDF quad, i.e. you'd say "this is what was said about the holder in a particular document/graph named profile id".

So you could do something like this in JSON-LD:

{
  "profile": {
    "id": "<profile ID>",
    "@graph": {
      "id": "<holder ID>",
      ...
    }
  }
}

But we don't have a use case for that nor have we defined such a "profile" predicate in our vocabularies. Maybe exploring this might help us finally come up with a better name for "Profile", like "HolderProfile" or simply "Holder". "Profile" has always been the best worst name we've come up with for the concept.
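
As a rough illustration of the "graph position in a quad" idea, the sketch below models statements as plain tuples rather than using an RDF library. The identifiers `urn:profileId` and `urn:otherId` are hypothetical, and `sameAs` is borrowed from the birth certificate example; nothing here is a defined vocabulary term.

```python
from typing import List, NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: Optional[str]  # None stands for the default graph


HOLDER = "urn:holderId"
PROFILE = "urn:profileId"  # hypothetical name for the profile's graph

quads = [
    # Said about the holder inside the named profile graph.
    Quad(HOLDER, "sameAs", "urn:childId", PROFILE),
    # Said about the holder in the default graph (hypothetical statement).
    Quad(HOLDER, "sameAs", "urn:otherId", None),
]


def statements_in(graph: Optional[str], all_quads: List[Quad]) -> List[Quad]:
    """Return only the statements made within the given graph."""
    return [q for q in all_quads if q.graph == graph]


profile_statements = statements_in(PROFILE, quads)
```

The holder keeps its own identifier as the subject of each statement, while the profile is referred to through the graph name, so the two identifiers no longer collide.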

stonematt commented 6 years ago

There is an audit argument where the verifier will need to return to the origination documents that were the basis of the verifications. The verifier won't think of the profile as ephemeral - it is the collection of legal documents used to grant or deny the benefit. Granting or denying access must be legally defensible if it is ever included in discovery for a subpoena.

jandrieu commented 6 years ago

Profiles are only going to be ephemeral on the holder's side. For GDPR and purpose binding and such, the verifier would do well to retain the profile as proof of the source of the information and their rights in usage.

This is an echo of the RDF triples v quads debate. The profile is itself a context for evaluating the statements made within. Consumers of those statements will need to refer to the profile in some consistent way.

talltree commented 6 years ago

Dave, I agree that "profile" is the weakest term in the entire VC vocabulary (issuer, holder, and verifier have all proved to be stellar terms IMHO).

I can see the temptation to use "holder", since the outer envelope really does represent the full package the holder presents to the verifier. But for that same reason, I think the term will be confused with the role of the actual holder.

Can I suggest a very utilitarian term like "credential envelope"? There are of course various ways to shorten it ("credelope"? ;-), but for now it seems like an accurate name for what it actually does.


RieksJ commented 6 years ago

Why would a profile need to be ephemeral - or better perhaps: what would its life-expectancy roughly be? Is it a matter of minutes, or can you have a profile lasting for years - perhaps the better part of your life?

Why would a profile only need to contain statements about the holder? Can it not, for example, contain statements about an entity for which the holder exercises guardianship? Or about an entity to which it is married?

Here's a figure that may help you understand my (quite conceptual) way of thinking: [image: integrationpatternrelations] https://user-images.githubusercontent.com/8522589/36716033-29552f2c-1b99-11e8-9e45-bd6385014701.png

The idea is to explicitly separate the (human) 'business' domain from the electronic domain. I roughly use the following terminology:

  • Party: an entity (= something that exists, a basic ontological notion) that is capable of (1) making up its own mind (reasoning) and (2) making decisions (outcome of a reasoning). The idea is that human beings, organizations, government(al agencie)s are parties. Software/hardware is not. And yes, there are some grey areas that need to be attended to at a later stage.
  • Knowledge: a set of (usually intangible) representations of what (entities) a Party knows to exist, as well as classifications of those entities, relations between such entities, and rules for knowing what must be true, may not be true etc. Knowledge is 'the mind' of a Party; when a Party makes up its mind, it is adding to or reorganizing (any part of) its Knowledge.
  • Profile: a set of data that represents Knowledge. Note that this does not require the Profile to be only associated with holders. We can (and will) associate it with other roles as well. See also the rules below.
  • Electronic Actor: a hardware and/or software component that is running on a computing device. Examples include apps (on a mobile phone), web applications (on web servers), 'things' etc.
  • Agent: an Electronic Actor that represents precisely one Party.

Rules that must hold:

  • All multiplicity rules as shown in the figure.
  • Every Agent uses one Profile that represents a subset of the Knowledge of the Party that the Agent represents. Note that this means that the profile used by an electronic agent that represents the issuing party can be (stored in) the identity-register application that this party uses. Similarly, the profile used by an electronic agent that represents a party in the verifier role may well be (stored in) web-server applications or other back-end systems.
  • Every agent that engages in an electronic business transaction with another agent must be capable of establishing the party that the other agent represents, to the extent determined by the party that it represents.

Having said all this, I realize that we (I) may be drifting off-topic. Should we start a new thread for this?

talltree commented 6 years ago

Rieks, I like this very much, it maps quite precisely to how agents operate in Hyperledger Indy and Sovrin architecture.

Drabiv commented 6 years ago

Drummond, I like the idea of changing "profile" to "envelope" (or maybe "bundle"?). The term "profile", used for a collection of credentials, is confusing to me.

RieksJ commented 6 years ago

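@talltree: does your liking this imply

  • that profiles can have multiple subjects?
  • that profiles are not necessarily ephemeral?
  • that profiles are not just there for holders, but also for issuers and verifiers?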

RieksJ commented 6 years ago

From where I stand, a profile holds credentials that hold statements that represent part of the holder's knowledge, which in turn can be represented as a populated (knowledge) graph. I like to see the profile as an annotated, populated knowledge graph. Many tools exist for working with such graphs, e.g. SPARQL or SHACL. One concept would thus be this knowledge graph.

A verifier wants to query the graph for a subset of its data, for the purpose of deciding whether or not to deliver some service. If I'm not mistaken, Attribute Based Credentials (ABCs) use the term 'presentation token' for the response that a holder's agent sends to the verifier. This response is not just a subset of the data, but may also contain derived data (e.g. one may derive an 18+ claim from one's birthdate and today's date). Another concept would thus be the presentation token (or envelope?).
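
The derived 18+ claim mentioned above can be sketched as follows. The function name `age_over` and the sample dates are assumptions for illustration only; this shows the arithmetic, not how ABC systems actually prove such a claim without revealing the birthdate.

```python
from datetime import date


def age_over(birthdate: date, threshold: int, today: date) -> bool:
    """Derive an 'age over N' claim from a birthdate and a reference date."""
    # Subtract one year if this year's birthday has not yet been reached.
    before_birthday = (today.month, today.day) < (birthdate.month, birthdate.day)
    age = today.year - birthdate.year - int(before_birthday)
    return age >= threshold


# Hypothetical holder born on 2000-06-05.
day_before_18th = age_over(date(2000, 6, 5), 18, date(2018, 6, 4))
on_18th = age_over(date(2000, 6, 5), 18, date(2018, 6, 5))
```

Only the boolean result would go into the response, which is what makes it derived data rather than a subset of the stored data.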

What then remains is that a holder may have multiple, possibly conflicting, realities (e.g. what is true in the context of one's friends may not be true in the context of one's job). A profile should contain a single, consistent and coherent reality of one holder. An inconsistent reality of that same holder should be in a different profile. This requires the holder, when requesting some service from the verifier, to preselect the profile that the verifier will be allowed to query. That shouldn't be too big an issue, as a profile would correspond with a context that the holder would actually know (having defined it himself).

msporny commented 6 years ago

We're way off topic for this issue. The original issue was about making it clear that a single Verifiable Credential can express information about multiple subjects. I don't think we will find consensus for restricting the core data model where a Verifiable Credential can only be about one subject.

Verifiable Profiles are a completely different topic. Let's please have the discussion on Verifiable Profiles / Envelopes there:

https://github.com/w3c/vc-data-model/issues/107

I'll suggest a PR for this specific issue in the next hour.

talltree commented 6 years ago

On Tue, Feb 27, 2018 at 1:27 AM, Rieks notifications@github.com wrote:

@talltree https://github.com/talltree: does your liking this imply

  • that profiles can have multiple subjects?
  • that profiles are not necessarily ephemeral?
  • that profiles are not just there for holders, but also for issuers and verifiers?

Rieks, yes, I think all of these. I just don't know how to deal with the issue of scope.


jandrieu commented 6 years ago

Based on https://github.com/w3c/vc-data-model/issues/55#issuecomment-368687003 I am creating a new issue #133 to address the problem of the ID of the profile being the identifier of the holder.

msporny commented 6 years ago

The original issue asked this question:

Should we change the definition of entity credential and entity profile from being about "a subject" to "usually about the same subject" or equivalent language.

I suggest we close this issue by creating a PR that states that "a credential is a set of attestations made about one or more subjects"... and "a profile contains one or more credentials".

This is technically accurate, and the discussion above raised only a single benefit, against many downsides, of limiting a credential or profile to a single subject. The only upside of that limit is that the model may be easier for newcomers to understand, but it comes at the expense of greatly limiting the expressibility of the data model.
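
A minimal sketch of what the proposed wording implies structurally, reusing the identifiers from the example at the top of this issue. The class and method names are hypothetical and are not taken from the specification:

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Set


@dataclass
class Credential:
    """A set of attestations made about one or more subjects."""
    id: str
    issuer: str
    claims: List[Dict[str, Any]]

    def __post_init__(self) -> None:
        if not self.claims:
            raise ValueError("a credential needs at least one claim")

    def subject_ids(self) -> Set[str]:
        # Distinct identifiers; they may or may not denote distinct entities.
        return {claim["id"] for claim in self.claims}


@dataclass
class Profile:
    """A container for one or more credentials."""
    credentials: List[Credential]

    def __post_init__(self) -> None:
        if not self.credentials:
            raise ValueError("a profile needs at least one credential")

    def subject_ids(self) -> Set[str]:
        return set().union(*(c.subject_ids() for c in self.credentials))


credential = Credential(
    id="http://example.gov/credentials/3732",
    issuer="https://multi.example.org",
    claims=[
        {"id": "did:v1:9f52df2e-7c96-42af-8dfd-1099980f8467", "ageOver": 18},
        {"id": "did:v1:7d866a9f-c5a3-41f5-8ae1-0297b7849801", "ageOver": 21},
        {"id": "did:v1:c4e13c30-31dc-40df-867f-ed678d87ac54", "ageOver": 65},
    ],
)
profile = Profile(credentials=[credential])
```

Neither class enforces a single subject, which mirrors the "one or more" wording; a stricter deployment could add that check on top without changing the core shape.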