suggestion - make `id` property optional in presentation_definition and input_descriptors

Sakurann commented 3 years ago

There are currently two id properties in PE

id - The Presentation Definition MUST contain an id property.
The Input Descriptor Object MUST contain an id property.

I am sure some use cases require these ids, but not all: neither id is required when there is only one presentation_definition or only one input_descriptor. Implementations should be able to decide when to include these ids, which has been the direction of the OIDC4VP draft discussion.

cc: @alenhorvat

csuwildcat commented 3 years ago

The top-level IDs are debatable, but features within the spec are targeted/assessed by-reference to input descriptors, so the input descriptor IDs can't be removed. Also, it's generally a bad practice to create properties that are transient, if it can be avoided. (more code branching = more bugs)

tplooker commented 3 years ago

While I generally agree that less optionality in a spec usually leads to more interoperable implementations, in this particular case it depends on the context in which a structure like an input descriptor is likely to be used and whether all applications would require an id property. With OpenID Connect there are applications where the input descriptor object (or an array of these) is likely to be embedded as a new request parameter. Since OpenID already has mechanisms to uniquely identify the request to the parties involved in the protocol, having to stipulate an id at the input descriptor level would be surplus to requirements and likely create confusion in that application of the spec.

I would suggest that relaxing the requirement on this field is a good idea for structures like input descriptors. However, language should be added that says if you are using other P.E structures that rely on being able to address an input descriptor, then they should make use of the id property.

As an example of the sort of model I think we are heading towards, JWTs chose to make all claims optional (https://datatracker.ietf.org/doc/html/rfc7519#section-4.1) because their application as a token format is vast. However, by at least having the claims registered and defined, they ensure there is an interoperable way in which common token functions can be performed, such as uniquely identifying tokens (https://datatracker.ietf.org/doc/html/rfc7519#section-4.1.7).

In general I expect application protocols of P.E such as OpenID or DIDComm to profile their usage (e.g. say things like the id property in input descriptors is required OR not used).

csuwildcat commented 3 years ago

@tplooker I'm frustrated this ticket has commingled two completely different types of IDs in the spec. I've taken time in my reply to distinguish between the two, so can we please do so going forward?

The top level IDs are certainly something that could be relaxed (I didn't push for those, just FYI), but the IDs in each Descriptor array member are not something we can remove, because the spec includes features that rely on referencing members (and would lead to needless hit test loops, vs O(1) direct evaluation). Can everyone here please acknowledge this disambiguation and that there's an understanding of what the per-member IDs are used for?

tplooker commented 3 years ago

@csuwildcat understood. I think we have agreement on the top-level ids being optional; it is the definition of the id property for input descriptors that we are discussing.

I understand that in certain contexts, when an input descriptor is used in conjunction with other P.E defined structures, having all input descriptors uniquely identifiable via an id is a necessity. However, I don't think that means the id property in an input descriptor has to always be required. For instance, would the following text suffice?

"When an input descriptor object is being used with a context that requires it to be uniquely identifiable the property of id SHOULD be used". Whereby the cases you speak of would meet that criteria.

csuwildcat commented 3 years ago

@tplooker on the top-level PE Definition ID, I think we can relax that. As for the PE Descriptor items, I am very wary of a change like the one you describe above because I personally feel it is an introduction of a line that basically says "You read the tea leaves and figure it out, and hopefully you include it when you need to", all for shaving a few bytes? Would it be so hard for the SIOP spec to just specify that those IDs be integer string values? I would like to hear what folks like @brentzundel @JaceHensley and @OR13 think about it.

tplooker commented 3 years ago

@tplooker on the top-level PE Definition ID, I think we can relax that.

Ok great.

As for the PE Descriptor items, I am very wary of a change like the one you describe above because I personally feel it is an introduction of a line that basically says "You read the tea leaves and figure it out, and hopefully you include it when you need to", all for shaving a few bytes

Ok, what about making the language I suggested stricter? For instance, when you are making use of the submission_requirements structure, you could say that all input descriptors MUST use the id property to uniquely identify themselves, which is the basis of how they are referenced.
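As a rough sketch of how that stricter rule might look as a conformance check: the helper name `checkDescriptorIds` is hypothetical and not defined by the spec or any library, and the rule it encodes (ids demanded only when `submission_requirements` is present, and unique whenever they appear) is just one reading of the wording proposed above.

```javascript
// Hypothetical helper, not part of the PE spec: returns a list of problems found.
function checkDescriptorIds(definition) {
  const descriptors = definition.input_descriptors || [];
  // Assumption: ids become mandatory only when submission_requirements is used.
  const idsRequired =
    Array.isArray(definition.submission_requirements) &&
    definition.submission_requirements.length > 0;
  const problems = [];
  const seen = new Set();
  descriptors.forEach((descriptor, index) => {
    const id = descriptor.id;
    if (id === undefined) {
      if (idsRequired) problems.push(`input_descriptors[${index}] is missing id`);
      return;
    }
    // Ids that are present must still be unique to be usable as references.
    if (seen.has(id)) problems.push(`duplicate id "${id}"`);
    seen.add(id);
  });
  return problems;
}

// No submission_requirements: a descriptor without an id is accepted.
console.log(checkDescriptorIds({ input_descriptors: [{ group: ['A'] }] }));

// With submission_requirements: the same descriptor is reported.
console.log(checkDescriptorIds({
  input_descriptors: [{ group: ['A'] }],
  submission_requirements: [{ rule: 'pick', count: 1, from: 'A' }],
}));
```

The branch on `idsRequired` is exactly the kind of conditional behaviour being debated in this thread, so the sketch shows the cost of the proposal as much as its shape.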

csuwildcat commented 3 years ago

@tplooker there are other feature properties within the Descriptors themselves that require referential relationships (e.g. same_subject, wherein two input descriptors may be linked via the requirement that the subject of each is the same), and I suspect there will be more. For this reason, I'd really like to avoid adding the subjectivity + a one-off Note that says: "And you can branch your code here to create an uneven experience across transports, but if you need these 3 features, don't, and it's all defined somewhere else what of this will be observed". Also, does SIOP really want to process these submissions in hit test loops that burn needless CPU cycles O(n), and/or not support features like same_subject?

tplooker commented 3 years ago

@csuwildcat, ok I understand that there are other consumption patterns than the example I just gave, however I think the language remains true.

Also, does SIOP really want to process these submissions in hit test loops that burn needless CPU cycles O(n), and/or not support features like same_subject?

I'm not saying that. I would say that when it is beneficial to application protocols, they should normatively define use of the property in a way that leads to interoperable implementations and performant processing. However, what I am arguing is that not all uses of P.E need or benefit from the id property in an input descriptor being mandatory.

Re the particular application of the same_subject property in the context of OIDC: again, I don't think there is an application for this particular property, because the protocol assumes that an OIDC request-response is always referring to one logical subject (the end-user), so the case will never arise to need to express this in a request (as there is no other option).

brentzundel commented 3 years ago

Relaxing the requirements for the input descriptor id property doesn't make sense to me, for all of the reasons already outlined by @csuwildcat. The most the presentation definition id property should be relaxed is from MUST to SHOULD, imo.

JaceHensley commented 3 years ago

I agree that input descriptors need to have their id. At the very least they are required for identifying which VC in the Presentation Submission corresponds to which Input Descriptor in the PD. I'm not sure how else you could map the VC in the PS to the Input Descriptor in the PD.

tplooker commented 3 years ago

Ok just to clarify, the ask was only to transition this property from a MUST to a SHOULD so that when using the input_descriptors object model in a context like OIDC, where there is zero need for the id property we wouldn't be having to generate a random id just to meet conformance to the spec. I'm not arguing the need for this field in other applications of P.E does not exist, just that the need is not universal.

David-Chadwick commented 3 years ago

@JaceHensley

At the very least they are required for identifying which VC in the Presentation Submission corresponds to which Input Descriptor in the PD. I'm not sure how else you could map the VC in the PS to the Input Descriptor in the PD

This mapping might be required in your policy matching algorithm, but we have our own way of matching the presented VCs against the RP's policy, and it does not require input descriptor IDs in order to do this. So the requirement is not universal.

JaceHensley commented 3 years ago

How does that matching happen?

csuwildcat commented 3 years ago

How does that matching happen?

Yes, I am also super interested in how the Verifying server is ingesting N submitted things and verifying them against, for example the PE Definition filters that determine credential content validity. Are you honestly just looping each candidate input O(n) and seeing if a green check shakes out somewhere? While I suppose this is possible (if you aren't using features in the spec that absolutely require explicit references), why would you want to create a system of needless looping and blind hit testing, @David-Chadwick?

David-Chadwick commented 3 years ago

I am not going to tell you our algorithm. Sorry. That is our IPR.

tplooker commented 3 years ago

In our application with OIDC, there just is simply no need to identify each input descriptor individually, hence the id property is surplus to requirements.

JaceHensley commented 3 years ago

@tplooker can you expand on that more? Why do you not need to identify the VCs provided to the verifier? Wouldn't the verifier need to know how to parse each of them to get the data needed from the VC?

tplooker commented 3 years ago

@tplooker can you expand on that more? Why do you not need to identify the VCs provided to the verifier? Wouldn't the verifier need to know how to parse each of them to get the data needed from the VC?

Sure, just to be clear, the context in which we want to use P.E is an OIDC request/response, whether that be an issuance or presentation flow. It loosely looks like this when drawn as ASCII art.

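[image: Screen Shot 2021-07-20 at 8 48 27 AM] https://user-images.githubusercontent.com/15972525/126225495-b041d6f8-f047-4b4f-9550-502e77c8d1b4.png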

Where the RP is either a holder or a verifier and the OP is either an issuer or a holder, depending on whether the request is an issuance or verification flow.

In either case, the relying party is basically saying "Hey, I'm requesting some credentials about the end user", which is communicated by an array of P.E input descriptor objects that features in the OIDC request as a new request parameter. There is no need for each input descriptor object to have an id, as the provider is simply going to respond with an array of credentials in the response or nothing (because it could not fulfil the request).

csuwildcat commented 3 years ago

I am not going to tell you our algorithm. Sorry. That is our IPR.

David, clearly you're misunderstanding something, and it likely confirms what I thought might be the case: PE Definitions are not just for the Holder to filter credentials on the client, they're also for the Verifier to evaluate the contents. I am not talking about any super cool 'algorithms' you have (whatever that means), I am talking about the Verifier using the very same PE Definition filters, which is completely standard and open. PE was designed a bit like a form validation lib, in that the JSON Schema and other filter mechanisms, which are completely standard and open, work on both the client (Holder) and the server (Verifier).

agropper commented 3 years ago

This diagram is oversimplified because it does not show the separate clients and authorization server. The End User in this flow is always the Resource Owner (RO) and Subject. The RP request could be per the OAuth RAR spec and the OP box could be separate AS and RS.

Adrian


David-Chadwick commented 3 years ago

@csuwildcat It's exactly as you say. We use the same algorithm in the holder and the verifier.

csuwildcat commented 3 years ago

@csuwildcat It's exactly as you say. We use the same algorithm in the holder and the verifier.

Surely you realize that PE makes evaluation standard and open source, utilizing JSON Schema, JSON Path, and explicit steps for processing that are part of the spec. Are you saying you are opposed to PE because it standardized much of the credential content evaluation, and you'd rather that stay proprietary to retain your perceived advantage from leaving evaluation subjective/userland?

JaceHensley commented 3 years ago

There is no need for each input descriptor object to have an id as the provider is simply going to respond with an array of credentials in the response or nothing (because it could not fulfil the request).

Okay I see what you're saying, but wouldn't the RP still want to know which VC is which, or are you saying the RP doesn't use or care about the individual VCs and just that the user had VCs to share? Even if the RP doesn't care about the individual VCs they would still care that the provided VCs do in fact satisfy the PD, that validation would have to be done on the RP's side.

csuwildcat commented 3 years ago

@tplooker are you honestly saying you just trust clientside input without running the submitted creds back against the PE Definition on the Verifier server?

tplooker commented 3 years ago

I'm unsure what you mean by the notion of trust here. Do you mean: does the provider trust the relying party to make a sound request?

tplooker commented 3 years ago

Okay I see what you're saying, but wouldn't the RP still want to know which VC is which, or are you saying the RP doesn't use or care about the individual VCs and just that the user had VCs to share? Even if the RP doesn't care about the individual VCs they would still care that the provided VCs do in fact satisfy the PD, that validation would have to be done on the RP's side.

I'm saying that since the response to the request is either all credentials requested or none, there is no real need for uniquely identifying each input descriptor and linking it to the credential that is received in response.

JaceHensley commented 3 years ago

How does the RP know that the Presentation Submission satisfies the Presentation Definition? It seems like the response from OP to RP could contain an array of arbitrary VCs that don't necessarily satisfy the PD, and the RP wouldn't be able to verify this themselves.

csuwildcat commented 3 years ago

@tplooker yes, as @JaceHensley said, Verifiers need to take the credentials submitted and rerun the PE Definition processing validation steps on them after they receive them, as you would any other submitted inputs from any other client/server system, else you're not actually filtering out bad submissions. Surely you know what we mean here, right?

David-Chadwick commented 3 years ago

The verifier only needs to know that the entire VP satisfies the RP's policy, or that it does not. There is no halfway house of almost satisfying the policy. It's a yes/no decision. @csuwildcat I am not opposed to PE because it is standardised. On the contrary, I strongly support standards and have co-authored many different ones. I am opposed to certain unnecessary features in PE which I think should either be removed (like the How instead of the What, which we have discussed many times) or made optional (like the id property).

csuwildcat commented 3 years ago

@David-Chadwick are you saying you oppose PE sending filter information that defines what content a credential must include (e.g. the prop account.balance must be > 10000), and if so, why? Pretty much everyone using PE is using those features to limit the dumb yolo'ing of destined-to-fail credential submissions (resulting in unnecessary data sharing). To evaluate those filters and limitations efficiently, the mapping IDs allow the Verifier to know which definitions they are testing a given input against.

It's reaching a point where two things need to happen: 1) do a call to yet again walk people through this and help correct misunderstandings, 2) make a decision as to whether the spec will be used. Remember, PE can be used in these things regardless, so if the ask is 0-compromise teardown of a spec/features people actively use, there's always that option.

David-Chadwick commented 3 years ago

@csuwildcat Not at all. I don't know how you deduced this from what I said. We have use cases in which the RP specifies not only the types of VCs that are needed, and the trusted issuers, but also the properties and values that they need to contain. These are all semantic features of what is required. Kind regards.

csuwildcat commented 3 years ago

@David-Chadwick so to be clear: you're not opposing the inclusion of things like the JSON Schema-based filter for ensuring a credential contains the correct values, but you are pushing to remove the IDs from the input descriptors and submission map that allow the Verifier to efficiently run the submitted credentials back against the filters? This seems at odds with itself.

David-Chadwick commented 3 years ago

@csuwildcat Actually there has already been a comment suggesting that VCs should be selected based on their type and not on their schema. So if you are proposing to use schema as the only way of selecting a VC then yes I am opposed to this. My opinion is that managers are more concerned about the type of a VC than its schema components e.g. the schema.org Address is used by many different types of VCs, and not all types will be acceptable. Once a VC has been received and validated for trustworthiness, then clearly it needs to be checked that it matches its defined schemas, so I am not opposed to this.

csuwildcat commented 3 years ago

@David-Chadwick "So if you are proposing to use schema as the only way of selecting a VC then yes I am opposed to this. My opinion is that managers are more concerned about the type of a VC than its schema components" - this response from you again seems to indicate that you have not read, or are not sufficiently familiar with, the Presentation Exchange spec:

^ We're not talking about JSON Schema-based identification of the whole credential; it's in relation to using JSON Schema syntax to test certain values inside the credential, regardless of the actual credential's schema, to ensure the values the credential contains are what the Verifier needs. It is referring to the fields portion of the spec, which tests fields in the target object to check if their values meet the requirement, a requirement test that is expressed in JSON Schema:

"fields": [
  {
    "path": ["$.credentialSubject.birth_date", "$.vc.credentialSubject.birth_date", "$.birth_date"],
    "filter": {
      "type": "string",
      "format": "date",
      "minimum": "1999-05-16"
    }
  }
]

^ This example is basically saying "Find the birth_date property in the credential submitted, then use this JSON Schema filter to test if the value of the property meets the requirement of the Verifier". This is where the IDs come in: how do I know which of the many submitted credentials to test if I don't have IDs for the descriptors they are submitted and tested against?

JaceHensley commented 3 years ago

Say a verifier has the following Presentation Definition:

{
  "id": "32f54163-7166-48f1-93d8-ff217bdb0653",
  "input_descriptors": [
    {
      "id": "email_input",
      "schema": [{ "uri": "..." }],
      "group": ["A"],
      "constraints": {
        "fields": [
          {
            "path": ["$.type"],
            "purpose": "Credential must be of type EmailCredential",
            "filter": {
              "type": "array",
              "items": { "type": "string" },
              "contains": { "const": "EmailCredential" }
            }
          }
        ]
      }
    },
    {
      "id": "phone_input",
      "schema": [{ "uri": "..." }],
      "group": ["A"],
      "constraints": {
        "fields": [
          {
            "path": ["$.type"],
            "purpose": "Credential must be of type PhoneCredential",
            "filter": {
              "type": "array",
              "items": { "type": "string" },
              "contains": { "const": "PhoneCredential" }
            }
          }
        ]
      }
    },
    {
      "id": "address_input",
      "schema": [{ "uri": "..." }],
      "group": ["A"],
      "constraints": {
        "fields": [
          {
            "path": ["$.type"],
            "purpose": "Credential must be of type AddressCredentialPersonV1",
            "filter": {
              "type": "array",
              "items": { "type": "string" },
              "contains": { "const": "AddressCredentialPersonV1" }
            }
          }
        ]
      }
    }
  ],
  "submission_requirements": [
    {
      "name": "Contact Information",
      "purpose": "We need you to provide a single piece of contact information.",
      "rule": "pick",
      "count": 1,
      "from": "A"
    }
  ]
}

Essentially asking the user for one piece of contact information but it could be email, phone, or address. The user then sends a Presentation Submission back to the verifier. It would look something like this (example is a VP but that doesn't really matter):

{
  "@context": ["https://www.w3.org/2018/credentials/v1", "https://identity.foundation/presentation-exchange/submission/v1"],
  "type": ["VerifiablePresentation", "PresentationSubmission"],
  "id": "...",
  "presentation_submission": {
    "id": "...",
    "definition_id": "32f54163-7166-48f1-93d8-ff217bdb0653",
    "descriptor_map": [
      {
        "id": "phone_input",
        "format": "ldp_vc",
        "path": "$.verifiableCredential[0]"
      }
    ]
  },
  "verifiableCredential": [
    {
      "@context": ["https://www.w3.org/2018/credentials/v1", "..."],
      "type": ["VerifiableCredential", "PhoneCredential"],
      "credentialSubject": {
        "telephone": "+1 555 555 1234"
      },
      "proof": {
        // ...
      }
    }
  ],
  "proof": {
    // ...
  }
}
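As a rough illustration of how the `pick` rule in the definition above could be checked against this `descriptor_map`, here is a minimal sketch. The function name `satisfiesPick` is hypothetical, it only handles a flat rule with `count` and `from`, and it treats `count` as an exact number, which is my reading of the spec rather than something stated in this thread.

```javascript
// Sketch only: handles a flat "pick" rule that has "count" and "from".
function satisfiesPick(definition, submission, requirement) {
  // Ids of the input descriptors that belong to the requested group.
  const groupIds = new Set(
    definition.input_descriptors
      .filter((d) => (d.group || []).includes(requirement.from))
      .map((d) => d.id)
  );
  // Distinct descriptor ids from that group that the submission claims to fill.
  const submitted = new Set(
    submission.descriptor_map.map((m) => m.id).filter((id) => groupIds.has(id))
  );
  return submitted.size === requirement.count;
}

const definition = {
  input_descriptors: [
    { id: 'email_input', group: ['A'] },
    { id: 'phone_input', group: ['A'] },
    { id: 'address_input', group: ['A'] },
  ],
};
const requirement = { rule: 'pick', count: 1, from: 'A' };
const submission = {
  descriptor_map: [{ id: 'phone_input', format: 'ldp_vc', path: '$.verifiableCredential[0]' }],
};

console.log(satisfiesPick(definition, submission, requirement)); // true
```

This only checks what the submission claims. Whether the credential at each `path` actually satisfies its descriptor's constraints is a separate step.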

The verifier can then run this PresentationSubmission, along with the Presentation Definition, through a validation function. That validation function would pull out the VCs from the descriptor map, map them back to their Input Descriptor, and validate that each one does satisfy the Input Descriptor (schema, constraints.fields, etc.). If the verifier didn't run the PresentationSubmission through validation on their end, then the user could submit something like this and the verifier wouldn't know:

{
  "@context": ["https://www.w3.org/2018/credentials/v1", "https://identity.foundation/presentation-exchange/submission/v1"],
  "type": ["VerifiablePresentation", "PresentationSubmission"],
  "id": "...",
  "presentation_submission": {
    "id": "...",
    "definition_id": "32f54163-7166-48f1-93d8-ff217bdb0653",
    "descriptor_map": [
      {
        "id": "phone_input",
        "format": "ldp_vc",
        "path": "$.verifiableCredential[0]"
      }
    ]
  },
  "verifiableCredential": [
    {
      "@context": ["https://www.w3.org/2018/credentials/v1", "..."],
      "type": ["VerifiableCredential", "IDDocumentCredential"],
      "credentialSubject": {
        "name": "John Smith"
      },
      "proof": {
        // ...
      }
    }
  ],
  "proof": {
    // ...
  }
}
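A rough sketch of such a validation function, applied to the two submissions above, might look like the following. Everything here is simplified and hypothetical: `resolvePath` only understands `$`, `.name` and `[index]`, `matchesFilter` only understands the `contains.const` filter used in this example, and proofs and `submission_requirements` are not checked at all.

```javascript
// Hypothetical, simplified JSONPath resolver: supports only "$", ".name" and "[index]".
function resolvePath(root, path) {
  const tokens = path.match(/[^.$\[\]]+/g) || [];
  return tokens.reduce(
    (node, token) => (node === undefined || node === null ? undefined : node[token]),
    root
  );
}

// Hypothetical filter check covering only { type: "array", contains: { const } }.
function matchesFilter(value, filter) {
  return Array.isArray(value) && value.includes(filter.contains.const);
}

function validateSubmission(definition, presentation) {
  return presentation.presentation_submission.descriptor_map.every((entry) => {
    // The id tells us which single Input Descriptor to test against.
    const descriptor = definition.input_descriptors.find((d) => d.id === entry.id);
    if (!descriptor) return false; // the entry references an unknown descriptor
    const vc = resolvePath(presentation, entry.path);
    return descriptor.constraints.fields.every((field) => {
      // Use the first path that resolves to a value, then apply the filter.
      const value = field.path.map((p) => resolvePath(vc, p)).find((v) => v !== undefined);
      return matchesFilter(value, field.filter);
    });
  });
}

const definition = {
  input_descriptors: [{
    id: 'phone_input',
    constraints: {
      fields: [{ path: ['$.type'], filter: { type: 'array', contains: { const: 'PhoneCredential' } } }],
    },
  }],
};
const makePresentation = (types) => ({
  presentation_submission: {
    descriptor_map: [{ id: 'phone_input', format: 'ldp_vc', path: '$.verifiableCredential[0]' }],
  },
  verifiableCredential: [{ type: types }],
});

console.log(validateSubmission(definition, makePresentation(['VerifiableCredential', 'PhoneCredential']))); // true
console.log(validateSubmission(definition, makePresentation(['VerifiableCredential', 'IDDocumentCredential']))); // false
```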

Similarly, if Input Descriptors did not have ids, then the verifier wouldn't be able to efficiently validate the PresentationSubmission:

{
  "@context": ["https://www.w3.org/2018/credentials/v1", "https://identity.foundation/presentation-exchange/submission/v1"],
  "type": ["VerifiablePresentation", "PresentationSubmission"],
  "id": "...",
  "presentation_submission": {
    "id": "...",
    "definition_id": "32f54163-7166-48f1-93d8-ff217bdb0653",
    "descriptor_map": [
      {
        "format": "ldp_vc",
        "path": "$.verifiableCredential[0]"
      }
    ]
  },
  "verifiableCredential": [
    {
      "@context": ["https://www.w3.org/2018/credentials/v1", "..."],
      "type": ["VerifiableCredential", "PhoneCredential"],
      "credentialSubject": {
        "telephone": "+1 555 555 1234"
      },
      "proof": {
        // ...
      }
    }
  ],
  "proof": {
    // ...
  }
}

All the verifier knows is that one VC exists at $.verifiableCredential[0], but they don't know which Input Descriptor it's supposed to be for. So that would make the verifier go through and test the VC against all the Input Descriptors until they find one that the VC satisfies.

If there is a way for the verifier to know which VC corresponds to which Input Descriptor I am all ears but I cannot think of a way to do that.

csuwildcat commented 3 years ago

Yes, as @JaceHensley correctly notes: performing the serverside validation becomes a strange O(n) looping hit test when you remove Descriptor/Submission IDs, vs direct tests against creds submitted, by ID reference, against their intended Descriptor. There is no reason to degrade this straightforward reference/evaluation mechanism; I assure you, any developer who looks at a set of steps that requires a random O(n) looping hit test would laugh and wonder "Why didn't these people just reference which Descriptors each credential is being submitted against so I can write code that is not overly complicated and doesn't burn needless CPU cycles?"

JaceHensley commented 3 years ago

I've actually found myself wanting an id on each submission_requirement object haha

csuwildcat commented 3 years ago

@JaceHensley yeah, you could reduce some looping if you did that, but that looping is less costly than doing all the constraint tests on the Descriptors, so that's why I was pragmatic and left off IDs for those during the design phase.

David-Chadwick commented 3 years ago

We have different mental models of how the verifier should work. In the model we use the verifier (manager) specifies the VC type(s) and trusted issuers and optional filters plus the selective disclosure rules. The wallet matches the requirements against its stored credentials and returns the selectively disclosed VCs to the verifier. The verifier checks that the VP matches its policy and only then, verifies if the VCs match their defined schemas as contained in their credentialSchema properties.

JaceHensley commented 3 years ago

The verifier checks that the VP matches its policy

How does this work without knowing which VC is for which Input Descriptor? How does it know that the VP satisfies the PD? This isn't any special algorithm; this is the PE spec.

verifies if the VCs match their defined schemas as contained in their credentialSchema properties.

But what if that VC.credentialSchema isn't what the PD is asking for? If the verifier is asking for a PhoneCredential but I give an EmailCredential, the credentialSchema on the EmailCredential would validate against the Credential just fine. But it wouldn't match what the verifier is actually asking for.

David-Chadwick commented 3 years ago

VC.credentialSchema isn't what the PD is asking for

The verifier never asks for a schema. It asks for a VC type. This is the significant difference in our mental models.

Thus matching the VCs against the policy is done on types and issuers and is relatively straightforward. In the example that @JaceHensley gave, the verifier wants an email address from EmailCredential or a telephone number from a PhoneCredential or an address from AddressCredentialPersonV1, and the user returns a single VC of a specific type. Thus it is simple to match the type and find if the policy is met. Once you have a matched VC you can check if its schema is correct. The weakness in @JaceHensley's policy is that the verifier has not stated who the trusted issuers are, so the user does not know if the VC he returns will be acceptable or not, even if it does match the schema and VC type.

JaceHensley commented 3 years ago

The verifier never asks for a schema. It asks for a VC type. This is the significant difference in our mental models.

The specifics of what the verifier asks for don't really matter.

Thus it is simple to match the type and find if the policy is met. Once you have a matched VC you can check if its schema is correct.

Yeah, in this example asking for a single VC is easy, but it gets much more expensive when there are more. How are you matching the provided VC to that policy? In my mind you'd have to iterate over the input_descriptors to find the item that matches with the provided VC, no?

const vc = verifiableCredential[i]
// iterate and test all until you find a match
const descriptorIndex = input_descriptors.findIndex(descriptor => vcMatchesDescriptor(descriptor, vc))
if (descriptorIndex < 0) throw new Error('could not find Input Descriptor matching provided VC')

vs

const descriptorMapping = descriptor_map[i]
const vc = getVC(submission, descriptorMapping) // uses descriptorMapping.path to get the VC
const descriptor = input_descriptors.find(descriptor => descriptor.id === descriptorMapping.id)
// only have to match against a single descriptor
const matches = vcMatchesDescriptor(descriptor, vc)
if (!matches) throw new Error('invalid VC for descriptor')
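Note that the `find` in the second snippet still scans `input_descriptors` once per mapping entry. One way to make that lookup direct, sketched here with a hypothetical `indexDescriptors` helper (not from the spec or any library), is to build a `Map` keyed by id once per Presentation Definition:

```javascript
// Hypothetical helper: index Input Descriptors by id once, then look up directly.
function indexDescriptors(inputDescriptors) {
  const byId = new Map();
  for (const descriptor of inputDescriptors) {
    // Duplicate ids would make references ambiguous, so reject them up front.
    if (byId.has(descriptor.id)) {
      throw new Error(`duplicate Input Descriptor id: ${descriptor.id}`);
    }
    byId.set(descriptor.id, descriptor);
  }
  return byId;
}

const byId = indexDescriptors([{ id: 'email_input' }, { id: 'phone_input' }]);
const descriptor = byId.get('phone_input'); // direct lookup, no scan
console.log(descriptor.id); // phone_input
console.log(byId.has('address_input')); // false
```

Either way, the expensive part (running the constraint tests) happens against one descriptor per submitted VC instead of against every descriptor.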

The weakness in @JaceHensley's policy is that the verifier has not stated who the trusted issuers are

Sure, that can be added but adding a requirement about an issuer is functionally the same as any other requirement.

csuwildcat commented 3 years ago

I really would like people to stop talking about types and schemas of the overarching credential. The main use of JSON Schema filters in the PE spec is for the Holder to evaluate whether the credential contents (meaning the actual values) fit a set of constraints, and for the Verifier to do the exact same serverside verification once they receive the credentials. To do this, you need to know which descriptors to run a credential against on the Verifier side, which would require nonsensical looping without IDs.

JaceHensley commented 3 years ago

^^ +1 to @csuwildcat. I got a little too into the weeds with VC schema vs VC types; that was just the example I had handy.

Let's use a simpler example.

This is our PD, it is requesting a VC that has the property foo.bar.baz and it must be 42 or greater. Literally doesn't care about the schema or type of the VC, just wants one with foo.bar.baz of 42 or greater.

{
  "id": "32f54163-7166-48f1-93d8-ff217bdb0653",
  "input_descriptors": [
    {
      "id": "foobarbaz_input",
      "schema": [{ "uri": "..." }],
      "group": ["A"],
      "constraints": {
        "fields": [
          {
            "path": ["$.foo.bar.baz"],
            "purpose": "Credential must have property foo.bar.baz and its value must be 42 or greater",
            "filter": {
              "type": "number",
              "minimum": 42
            }
          }
        ]
      }
    }
  ]
}

The holder, if they are a good actor, will use the PD to filter their VCs to select a VC that satisfies the InputDescriptor. But if the holder is a bad actor they'd just submit whatever VC they want.

The PresentationSubmission would look like this:

{
  "@context": ["https://www.w3.org/2018/credentials/v1", "https://identity.foundation/presentation-exchange/submission/v1"],
  "type": ["VerifiablePresentation", "PresentationSubmission"],
  "id": "...",
  "presentation_submission": {
    "id": "...",
    "definition_id": "32f54163-7166-48f1-93d8-ff217bdb0653",
    "descriptor_map": [
      {
        "id": "foobarbaz_input",
        "format": "ldp_vc",
        "path": "$.verifiableCredential[0]"
      }
    ]
  },
  "verifiableCredential": [
    {
      "@context": ["https://www.w3.org/2018/credentials/v1", "..."],
      "type": ["VerifiableCredential", "..."],
      "credentialSubject": {
        // ...
      },
      "foo": {
        "bar": {
          "baz": 42
        }
      },
      "proof": {
        // ...
      }
    }
  ],
  "proof": {
    // ...
  }
}

The verifier would then want to run this PresentationSubmission (and thus its VCs) against the PD to ensure that the holder did in fact submit a VC that satisfies the PD.
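As a rough sketch of that verifier-side check for the simplified example above: the helper names (`resolvePath`, `satisfiesFilter`, `vcMatchesDescriptor`, `verifySubmission`) are hypothetical and not taken from the PE spec or any library. The path resolver only handles simple paths like `$.foo.bar.baz` and `$.verifiableCredential[0]`, and the filter check only understands `"type": "number"` and `minimum`. A real implementation would use a proper JSONPath and JSON Schema library.

```javascript
// Hypothetical helper: resolves only simple paths such as "$.foo.bar.baz"
// or "$.verifiableCredential[0]". This is not a full JSONPath implementation.
function resolvePath(doc, path) {
  const tokens = path.replace(/^\$\.?/, '').match(/[^.\[\]]+/g) || []
  return tokens.reduce(
    (node, key) => (node === undefined || node === null ? undefined : node[key]),
    doc
  )
}

// Hypothetical filter check: supports only the two JSON Schema keywords used
// in the example ("type": "number" and "minimum").
function satisfiesFilter(value, filter = {}) {
  if (value === undefined) return false
  if (filter.type === 'number' && typeof value !== 'number') return false
  if (filter.minimum !== undefined && !(value >= filter.minimum)) return false
  return true
}

// A VC matches when every field constraint is satisfied by at least one path.
function vcMatchesDescriptor(descriptor, vc) {
  return descriptor.constraints.fields.every(field =>
    field.path.some(p => satisfiesFilter(resolvePath(vc, p), field.filter))
  )
}

// For each descriptor_map entry: look up the Input Descriptor by id, fetch the
// VC at the mapped path, and test that VC against that single descriptor.
function verifySubmission(definition, presentation) {
  const submission = presentation.presentation_submission
  if (submission.definition_id !== definition.id) {
    throw new Error('submission does not reference this Presentation Definition')
  }
  for (const mapping of submission.descriptor_map) {
    const descriptor = definition.input_descriptors.find(d => d.id === mapping.id)
    if (!descriptor) throw new Error(`unknown Input Descriptor: ${mapping.id}`)
    const vc = resolvePath(presentation, mapping.path)
    if (!vc || !vcMatchesDescriptor(descriptor, vc)) {
      throw new Error(`invalid VC for descriptor ${mapping.id}`)
    }
  }
  return true
}
```

This sketch only checks the entries that appear in `descriptor_map`. It does not check proofs, and it does not confirm that every required Input Descriptor was actually answered, both of which a verifier would also need.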

csuwildcat commented 3 years ago

This thread is driving me nuts, so in the hope that I can boil this down, I am going to use an example:

BUT FIRST --> Forget about the overarching credential type/schema - this is an absolute canard that somehow has latched onto the minds of people here. Put all those thoughts in the trash. I will pause as you purge it from your mind.

Narrator: he paused

OK, let's begin:

You have a PE Descriptor that says: "Among the 3 credentials you submit, you need to have one that matches requirements X, Y, Z, which includes having a number greater than 42 at the foo.bar.baz property"
The Holder runs through their credentials and tests them to see if there are any matches for the PE Input Descriptors the Verifier provided, one of which is a filter for the foo.bar.baz property value, which just happens to be described using the JSON Schema conditional syntax.
The Holder finds a cred that meets all the requirements, including a value of 75 for the foo.bar.baz property, which is greater than 42. Yay!
Now the Holder sends their bundle of 3 creds to the Verifier, who also needs to do the exact same verification of requirements against the submitted creds.
The Verifier needs to make sure the cred they received contains a value greater than 42 for the foo.bar.baz property, because you can't just trust client inputs. (form data submission 101)
The Verifier uses the exact same PE Input Descriptor filter against the cred to determine the cred does indeed contain a value at foo.bar.baz that is greater than 42...but wait, which cred do they test that against?! Good question. Shucks, wouldn't it be neat if the credentials were submitted with an ID to let the Verifier know which PE Input Descriptor a cred is being submitted against, so they know which one to test against? Heck yes it would! Well great, then they're in luck, because that's exactly what the Descriptor/mapping ID does! If you didn't specify the PE Input Descriptor ID against which you are submitting a credential, you would instead have to O(n) loop to hit test all PE Input Descriptors against all the submitted creds, which every programmer reading the spec would laugh and point at. I don't want people laughing and pointing at us, thus we have references that connect each PE Input Descriptor to each credential submitted.
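To put a number on the looping cost described in the steps above, here is a toy sketch that counts how many descriptor evaluations each approach needs for 3 descriptors and 3 creds. Everything in it is made up for illustration: the descriptors carry a plain `test` function instead of real PE constraints, and `findDescriptorByScanning` and `checkByMappingId` are hypothetical names.

```javascript
// Hypothetical matcher that counts how often it is called. Each descriptor
// here carries a plain `test` function instead of real PE constraints.
let evaluations = 0
function vcMatchesDescriptor(descriptor, vc) {
  evaluations += 1
  return descriptor.test(vc)
}

// Without IDs: try descriptors one by one until one matches the credential.
function findDescriptorByScanning(inputDescriptors, vc) {
  return inputDescriptors.find(d => vcMatchesDescriptor(d, vc))
}

// With IDs: look the descriptor up by id, then evaluate exactly one per cred.
function checkByMappingId(descriptorsById, mapping, vc) {
  const descriptor = descriptorsById.get(mapping.id)
  return descriptor !== undefined && vcMatchesDescriptor(descriptor, vc)
}

const inputDescriptors = [
  { id: 'a', test: vc => vc.kind === 'a' },
  { id: 'b', test: vc => vc.kind === 'b' },
  { id: 'c', test: vc => vc.kind === 'c' }
]
const creds = [{ kind: 'c' }, { kind: 'b' }, { kind: 'a' }]
const descriptorMap = [{ id: 'c' }, { id: 'b' }, { id: 'a' }]

// Scanning: the creds need 3, 2 and 1 evaluations respectively.
evaluations = 0
creds.forEach(vc => findDescriptorByScanning(inputDescriptors, vc))
const scanEvaluations = evaluations

// ID lookup: index the descriptors once, then one evaluation per cred.
evaluations = 0
const descriptorsById = new Map(inputDescriptors.map(d => [d.id, d]))
const allValid = creds.every((vc, i) =>
  checkByMappingId(descriptorsById, descriptorMap[i], vc)
)
const idEvaluations = evaluations

console.log({ scanEvaluations, idEvaluations, allValid })
```

With this data the scan performs 6 evaluations and the ID lookup performs 3. The gap grows with the number of descriptors, and scanning also cannot tell the Verifier which descriptor the Holder intended a cred to satisfy when more than one could match.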

David-Chadwick commented 3 years ago

@csuwildcat Yes I agree that this is an efficiency step that you are adding so that you don't have to cycle through the VCs.

selfissued commented 3 years ago

On the 22-Jul-21 OpenID SIOP Special Call, people pointed out that several existing Presentation Exchange implementations already don’t use the id field. Therefore, it seems like including this should be optional.

csuwildcat commented 3 years ago

@selfissued I think you are conflating the top-level single PE Definition ID with the actual Input Descriptor IDs, which are two completely different things this thread has woefully commingled. The top-level ID is something most folks in the community think can move to a SHOULD, but the PE Input Descriptor IDs are essential for the entire processing of the submission, and there are no implementations that don't use them, to my knowledge.

At this point, I really want to move to close this Issue, as the commingling of the two different types of IDs has led to a thread 352452323453245 messages long where people are simply unable to distinguish between them.

csuwildcat commented 3 years ago

I am going to close and lock this thread to push us over to the following two Issues where these two very different types of IDs are broken out, as the confusion this Issue's description caused is clouding our discussions:

decentralized-identity / presentation-exchange

suggestion - make `id` property optional in presentation_definition and input_descriptors #231