Consider SADifying Selective Disclosure Blinded Attribute Array

trustoverip / tswg-acdc-specification-archived

Authentic Chained Data Containers (ACDC)

Consider SADifying Selective Disclosure Blinded Attribute Array #78

Open jasoncolburne opened 1 year ago

jasoncolburne commented 1 year ago

The ACDC specification defines selective disclosure, amongst other mechanisms, to facilitate graduated disclosure and chain link confidentiality. It is defined by the spec to work like this:

1. Create an unblinded array of anonymized SADs containing attribute details.
2. Create an array of SAIDs from the SADs.
3. Create a digest by concatenating all the SAIDs and hashing the result.

The first (1) looks like:

"A": [
  {
    "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
    "u": "0AB9VADfPtCQvFqp-u4BxUvy",
    "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY"
  },
  {
    "d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
    "legalName": "Jason Colburne"
  },
  {
    "d": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA",
    "u": "0ACbHEmnqeXUJFMf1G2Dj0BU",
    "age": 43
  }
]

The second becomes:

"A": [
  "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
  "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA"
]

And the third:

"A": "EGsHbUlJ1JSm63A2dzBmrWvLcKVb22_6OD0fL61KuZ3V"

The way it works is that you put (3) in the ACDC, where it serves as a commitment to (2). You then show the ACDC schema to interested parties and give them (2) plus some elements from (1). Because (2) is committed to by (3), and each SAID in (2) is in turn a commitment to one block in (1), the disclosed elements from (1) are committed to as well. This allows one to share some, but not all, blinded attributes of an ACDC.
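To make the step from (2) to (3) concrete, here is a minimal Python sketch. The `digest` helper is an assumption for illustration only: it uses SHA-256 with base64url encoding rather than the CESR-encoded digest a real SAID uses, so the value it produces will not match the aggregate shown in the example above.

```python
import hashlib
from base64 import urlsafe_b64encode


def digest(data: bytes) -> str:
    """Stand-in digest (assumption): SHA-256, base64url, padding stripped.
    A real SAID uses a CESR-encoded digest instead."""
    return urlsafe_b64encode(hashlib.sha256(data).digest()).decode().rstrip("=")


def aggregate(saids: list[str]) -> str:
    """(3): hash of the concatenation of the SAIDs in (2), in order."""
    return digest("".join(saids).encode())


# (2): the blinded array from the example above.
saids = [
    "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
    "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA",
]
A = aggregate(saids)

# Swapping in a different SAID, or reordering the list, changes the aggregate.
swapped = saids[:2] + ["EB7_uP-FZ8aoErbInx6BZTZmDb0A5QuFhIQCROlStoMF"]
print(aggregate(swapped) != A)
print(aggregate(saids[::-1]) != A)
```

The point of the sketch is only that (3) binds the exact list in (2), including its order.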

The spec suggests a schema like this to provide a verifier the info they need to understand what information is contained in the ACDC:

    "A": {
      "oneOf": [
        {
          "description": "Attribute aggregate digest",
          "type": "string"
        },
        {
          "$id": "",
          "description": "Attribute aggregate array",
          "type": "array",
          "minItems": 1,
          "maxItems": 3,
          "uniqueItems": true,
          "items": {
            "anyOf": [
              {
                "type": "object",
                "required": ["d", "u", "i"],
                "properties": {
                  "d": {
                    "description": "SAID of disclosable data",
                    "type": "string"
                  },
                  "u": {
                    "description": "Salty nonce",
                    "type": "string"
                  },
                  "i": {
                    "description": "Issuee AID",
                    "type": "string"    
                  }
                },
                "additionalProperties": false
              },
              {
                "type": "object",
                "required": ["d", "u", "legalName"],
                "properties": {
                  "d": {
                    "description": "SAID of disclosable data",
                    "type": "string"
                  },
                  "u": {
                    "description": "Salty nonce",
                    "type": "string"
                  },
                  "legalName": {
                    "description": "Legal name",
                    "type": "string"    
                  }
                },
                "additionalProperties": false
              },
              {
                "type": "object",
                "required": ["d", "u", "age"],
                "properties": {
                  "d": {
                    "description": "SAID of disclosable data",
                    "type": "string"
                  },
                  "u": {
                    "description": "Salty nonce",
                    "type": "string"
                  },
                  "age": {
                    "description": "Age",
                    "type": "number"
                  }
                },
                "additionalProperties": false
              }
            ]
          }
        }
      ]
    }

Here's the problem. uniqueItems treats objects with the same keys but different values as distinct, so it allows them. This lets a malicious issuer construct this data in a valid ACDC:

"A": [
  {
    "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
    "u": "0AB9VADfPtCQvFqp-u4BxUvy",
    "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY"
  },
  {
    "d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
    "legalName": "Jason Colburne"
  },
  {
    "d": "EB7_uP-FZ8aoErbInx6BZTZmDb0A5QuFhIQCROlStoMF",
    "u": "0ADrpl6F7jzxNWPe0K_cqIFj",
    "legalName": "Harry Potter"
  }
]

Now a colluding issuee can present differing information to disjoint sets of participants for the same attribute keys.
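A small Python sketch of why the duplicate goes unnoticed. `unique_items` is a rough approximation of how uniqueItems compares whole items by value, and `repeated_labels` is a hypothetical helper, not part of any ACDC tooling. The check that would catch the duplicate only works when every block is visible, which is exactly what selective disclosure avoids.

```python
import json
from collections import Counter

BLINDING_FIELDS = {"d", "u"}  # SAID and salty nonce appear in every block


def unique_items(items: list) -> bool:
    """Approximates JSON Schema uniqueItems: whole items are compared by value."""
    return len({json.dumps(item, sort_keys=True) for item in items}) == len(items)


def repeated_labels(blocks: list[dict]) -> list[str]:
    """Attribute labels that appear in more than one of the given blocks."""
    counts = Counter(k for b in blocks for k in b if k not in BLINDING_FIELDS)
    return sorted(label for label, n in counts.items() if n > 1)


malicious = [
    {"d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
     "u": "0AB9VADfPtCQvFqp-u4BxUvy",
     "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY"},
    {"d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
     "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
     "legalName": "Jason Colburne"},
    {"d": "EB7_uP-FZ8aoErbInx6BZTZmDb0A5QuFhIQCROlStoMF",
     "u": "0ADrpl6F7jzxNWPe0K_cqIFj",
     "legalName": "Harry Potter"},
]

print(unique_items(malicious))         # True: the two legalName blocks differ
print(repeated_labels(malicious))      # ['legalName'], visible only with all blocks
print(repeated_labels(malicious[:2]))  # []: a partial disclosure looks clean
```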

A solution: saidify a key/value object of SAIDs rather than using a list. So for (2), use this SAD:

"A": {
  "d": "EAVyu2rAKhx-kV0sVOKMDE4IFHQjmMyllmZfInBygwb0",
  "i": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "legalName": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
  "age": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA"
}

And then (3) becomes:

"A": "EAVyu2rAKhx-kV0sVOKMDE4IFHQjmMyllmZfInBygwb0"

This permits a check to ensure that someone can use a maximum of one value per key.
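A sketch of that check in Python, assuming the labeled SAD above. `matches_labeled_map` is a hypothetical helper; it only compares labels and SAIDs and does not recompute the block's SAID, which a real verifier would also do.

```python
def matches_labeled_map(block: dict, said_map: dict) -> bool:
    """True if the block carries exactly one attribute label and the map
    commits to this block's SAID under that same label."""
    labels = [k for k in block if k not in ("d", "u")]
    return len(labels) == 1 and said_map.get(labels[0]) == block["d"]


# The labeled SAD proposed above for (2).
said_map = {
    "d": "EAVyu2rAKhx-kV0sVOKMDE4IFHQjmMyllmZfInBygwb0",
    "i": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
    "legalName": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "age": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA",
}

honest = {
    "d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
    "legalName": "Jason Colburne",
}
second_name = {
    "d": "EB7_uP-FZ8aoErbInx6BZTZmDb0A5QuFhIQCROlStoMF",
    "u": "0ADrpl6F7jzxNWPe0K_cqIFj",
    "legalName": "Harry Potter",
}

print(matches_labeled_map(honest, said_map))       # True
print(matches_labeled_map(second_name, said_map))  # False
```

Since a JSON object can hold only one value per key, there is no slot in which a second legalName SAID could hide.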

Does this make sense?

jasoncolburne commented 1 year ago

Another solution to this may be to try and correlate the order of the blinded list with the attribute definitions in the anyOf clause in the schema - but I dislike that, as it introduces a dependency on the schema for more than just json-schema validation.

jasoncolburne commented 1 year ago

Even if the schema disallowed objects with the same keys and different values, I think there would be a problem since the data isn't necessarily disclosed together (the schema can still be validated during disclosure).

SmithSamuelM commented 1 year ago

@jasoncolburne I must admit I don't understand the attack. I think it might be due to a difference in assumptions about how a verifier verifies any presentation of the ACDC.

So I will state my assumptions and see if that is the source of my misunderstanding.

A verifier does three verifications.

1. A semantic structure verification using JSON Schema.
2. A cryptographic structure verification using nested hashes (SAIDs), i.e. a hash tree.
3. An authenticity verification of the issuer's signature on the top-level or root hash of the hash tree, which includes not only A but also the SAID of the JSON Schema.

If any of the three fails, then the verification fails. A verifier will never accept as valid a presentation that does not satisfy all three verifications. There may be additional verifications of commitments by the presenter of a selective disclosure based on a graduated disclosure where the presenter signs additional commitments. Maybe this is where I misunderstand. But I think you are talking about the Issuer's commitments and verifying those.

Using your labeling

3) is the value of A as a string which is the hash of the concatenation of the list of hashes in 2) where each hash is the SAID of the blocks in 1).

In order for the verifier to verify any selective disclosure presentation of any combination of blocks drawn from 1, the presenter must provide the list of hashes (SAIDs) in 2). This enables the verifier to recompute 3). So if even one of the hashes (SAIDs) provided by the presenter is different (or if the order of SAIDs is different) then the verifier will compute a different A and the presentation fails.

In your example the SAID of the third block in the malicious ACDC is different from the SAID in the third block of the valid ACDC. So when the verifier recomputes A it will fail. What am I missing here?

I believe the source of the ambiguity is a limitation of JSON Schema's uniqueItems property, which as you point out would allow multiple copies of the same sub-schema, but uniqueItems is applied in combination with anyOf. I believe that one may not repeat a subschema from the anyOf list of subschemas. Admittedly the documentation of anyOf does not make this clear, but it is implied, and all the examples I could find imply non-repetition. Your example repeats the legalName subschema. It would be good to verify this property of anyOf.

Notwithstanding an anyOf that allows repeated subschema, the computation of A forces that each block from 1) have the same SAID, so the verification would fail nonetheless.

What am I missing?

SmithSamuelM commented 1 year ago

The proof process for selective disclosure is outlined in detail in section 13.3.3 of the ACDC Spec. The relevant language is hoisted below:

Given that aggregate value A appears as the compact value of the top-level attribute section, A, field, the selective disclosure of the attribute at index j may be proven to the disclosee with four items of information. These are:

1. The actual detailed disclosed attribute block itself (at index j) with all its fields.
2. The list of all attribute block digests, [a₀, a₁, ...., a_N-1] that includes a_j.
3. The ACDC in compact form with selectively-disclosable attribute section, A, field value set to aggregate A.
4. The signature(s), s, of the Issuee on the ACDC's top-level SAID, d, field or equivalently a seal of the ACDC's top-level SAID anchored in an event in the KEL of the issuer (which event is signed by the Issuer) or equivalently a seal of the transaction event log entry anchored in an event in the KEL of the issuer which event log entry, in turn, includes the ACDC's top-level SAID.

The actual detailed disclosed attribute block is only disclosed after the disclosee has agreed to the terms of the rules section. Therefore, in the event the potential disclosee declines to accept the terms of disclosure, then a presentation of the compact version of the ACDC and/or the list of attribute digests, [a₀, a₁, ...., a_N-1] does not provide any point of correlation to any of the attribute values themselves. The attributes of block j are hidden by a_j and the list of attribute digests [a₀, a₁, ...., a_N-1] is hidden by the aggregate A. The partial disclosure needed to enable chain-link confidentiality does not leak any of the selectively disclosable details.

The disclosee may then verify the disclosure by:

1. computing a_j on the selectively disclosed attribute block details.
2. confirming that the computed a_j appears in the provided list [a₀, a₁, ...., a_N-1].
3. computing A from the provided list [a₀, a₁, ...., a_N-1].
4. confirming that the computed A matches the value, A, of the selectively-disclosable attribute section, A, field value in the provided ACDC.
5. computing the top-level SAID, d, field of the provided ACDC.
6. confirming the presence of the issuance seal digest in the Issuer's KEL
7. confirming that the issuance seal digest in the Issuer's KEL is bound to the ACDC top-level SAID, d, field either directly or indirectly through a TEL registry entry.
8. verifying the provided signature(s) of the Issuee on the provided top-level SAID, d field value.

The last 3 steps that culminate with verifying the signature(s) require determining the key state of the Issuer at the time of issuance, this may require additional verification steps as per the KERI, PTEL, and CESR-Proof protocols.
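The hash-only part of that procedure (computing a_j, finding it in the list, recomputing A and comparing it) can be sketched in Python as below. This is a simplified illustration under stated assumptions: `digest` uses SHA-256 with base64url rather than a CESR-encoded digest, `block_said` uses a simplified placeholder scheme for the `d` field, and the nonces are made-up sample values. The KEL, TEL, and signature steps are not covered.

```python
import hashlib
import json
from base64 import urlsafe_b64encode


def digest(data: bytes) -> str:
    """Stand-in digest (assumption): SHA-256, base64url, padding stripped."""
    return urlsafe_b64encode(hashlib.sha256(data).digest()).decode().rstrip("=")


def block_said(block: dict) -> str:
    """Simplified SAID: hash the block with its 'd' field set to a placeholder."""
    dummy = dict(block, d="#" * 44)
    return digest(json.dumps(dummy, separators=(",", ":")).encode())


def saidify(block: dict) -> dict:
    """Return a copy of the block with 'd' set to its simplified SAID."""
    return dict(block, d=block_said(block))


def aggregate(digests: list[str]) -> str:
    """A: hash of the concatenated attribute block digests, in order."""
    return digest("".join(digests).encode())


def verify_disclosure(block: dict, digests: list[str], acdc_A: str) -> bool:
    a_j = block_said(block)              # compute a_j from the disclosed block
    if block.get("d") != a_j:            # the block must carry its own SAID
        return False
    if a_j not in digests:               # a_j must appear in the provided list
        return False
    return aggregate(digests) == acdc_A  # recomputed A must match the ACDC


# Issuer side: build the blocks, the digest list, and the aggregate.
blocks = [
    saidify({"d": "", "u": "nonce-0", "i": "issuee-aid"}),
    saidify({"d": "", "u": "nonce-1", "legalName": "Jason Colburne"}),
    saidify({"d": "", "u": "nonce-2", "age": 43}),
]
digests = [b["d"] for b in blocks]
A = aggregate(digests)

# Disclosee side: only the legalName block is revealed.
print(verify_disclosure(blocks[1], digests, A))
```

Note that nothing in these steps tells the disclosee what the undisclosed blocks contain, which is the property under discussion in this thread.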

henkvancann commented 1 year ago

@jasoncolburne, just for better understanding: are we talking about an instance of a double-commitment scheme, which you consider an attack?

A commitment scheme allows one to commit to a chosen value while keeping it hidden to others, with the ability to reveal the chosen value later. These are usually characterized by two phases: the commit phase and the reveal phase.

If an issuer makes two claims and chooses to reveal one at a later time, this seems to resemble a form of double-commitment scheme, ...

SmithSamuelM commented 1 year ago

The purpose of the schema is not to verify the cryptographic structure of the presented data, but merely the semantic structure. It is possible, by putting SAIDs as constants into the schema itself, to sort of use schema validation to validate some of the cryptographic structure, but this may make unreasonable assumptions about how JSON Schema tooling works or will work in the future.

Indeed, the cryptographic structure validation protects the semantic validation, because even if a malicious JSON schema were to validate on its own, it only materially matters if the malicious schema validates in concert with the cryptographic structure validation. So what we care about for an attack is a presentation where the cryptographic structure validation passes with a malicious schema that also passes, not one where a malicious schema passes but the cryptographic structure validation fails.

Separating the concerns of semantic validation from cryptographic validation, and then designing it so that the stronger cryptographic validation protects the weaker semantic one, follows a principle I first learned from Schneier's book "Practical Cryptography":

Design Rule 1. Complexity is the worst enemy of security.
Design Rule 2. Correctness must be a local property

Admittedly there is a lot of aesthetic judgment in how to apply such rules. But in this case I think of it this way. If I have a security measure that is already computationally infeasible to attack, then bolstering it with some other security measure that is not protecting from a different type of attack is unnecessary complexity, and it makes the correctness of the security of the first measure non-local (both are needed).

It's like bolting a wood 2x4 onto a steel I-beam. The 2x4 may add some negligible amount of extra strength under load, but it does not add enough to justify the complexity of bolting it on. If, on the other hand, the bolt-on 2x4's purpose is prettier painted wood trim meant to hide the ugliness of the steel I-beam, then any extra structural strength the trim imbues is purely a side effect. Instead, the greater strength of the I-beam will protect the structural integrity of the pretty trim (not the other way round). So attempting to use schema validation (2x4) to protect structural validation (I-beam) is using schema validation for the wrong purpose.

Moreover, layering weaker security measures on top of stronger ones runs the risk that one might rely on the weak measure in corner conditions that then justify weakening the stronger measure which then may backfire in some other corner condition where the two are not mutually protecting. I have fallen for that temptation more than once.

ACDCs depend on a strong non-repudiable proof via digital signature of issuance, which signature makes a cryptographic commitment to the strong cryptographic digest of the field map structure of the ACDC, which includes a strong commitment to the digest of the schema which schema makes a comparatively weak commitment to the semantics of the field structure but that semantic commitment is protected by the stronger field structure proof not the other way around.

dhh1128 commented 1 year ago

@jasoncolburne, I see how the situation you constructed allows mischief. But IIUC, isn't this a shortcoming of the bad schema design? You assume a schema that allows a credential to have more than one legal name; naturally this would then allow a malicious issuer to put more than one legal name into the cred (presumably colluding with an issuee who later wants to disclose one name but not the other). But is such a schema reasonable?

As @SmithSamuelM points out, the verifier should be confirming that the schema matches their requirements -- and unless the requirements in this case are about knowing at least one of the issuee's legal names for some strange reason (e.g., what was it either before or after a name change), it seems like this schema should be rejected, without creating any additional behavior in ACDCs to prevent it.

jasoncolburne commented 1 year ago

> In your example the SAID of the third block in the malicious ACDC is different from the SAID in the third block of the valid ACDC. So when the verifier recomputes A it will fail. What am I missing here?

I am saying that the colluding issuer issues not the first blinded array, but one corresponding to the double entry. Here is the array the issuer would use to compute the top level digest in my malicious example:

"A": [
  "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
  "EB7_uP-FZ8aoErbInx6BZTZmDb0A5QuFhIQCROlStoMF"
]

A verifier that only requests the name and issuee AID (and not the age) has no way that I can see to ensure that a SAID in the blinded array corresponds to a given label in the unblinded attribute array, only that it is in the array or not. For example, how do we know which SAID in the blinded array corresponds to age and which to legalName?

I understand that a malicious issuer can issue untrue data at any time, and it's really no different than that, but I feel like simply adding labels would entirely prevent the more concerning situation: retaining identifier reputation by presenting valid data until the critical moment when an attack can be mounted.

The schema I provided was the example in the spec, and the anyOf permits duplication. Unfortunately, I don't see a way to provide one that allows a verifier to verify that the unblinded array actually meets the criteria when all elements are included, if they only request some elements, or if the presenter refuses to present all attributes. Correct me if I'm wrong.

In addition to avoiding this mischievous case, SADifying that array would allow me to store it without creating an edge case.

In a comment in Slack it was mentioned that I may be describing Partial Disclosure, but when I re-read the section it seemed to indicate I was talking about Selective Disclosure. I just want to be clear that I'm talking about the case where some attributes are unblinded and others remain hidden.

I've also been trying to frame this in the typical example, but in my world it is highly likely that self-issuance will occur, which raises the likelihood of this happening.

SmithSamuelM commented 1 year ago

@jasoncolburne I have to think about this some more and see if I can see where you are coming from or suggest the appropriate measure. It feels to me like selective disclosure might not be the appropriate mechanism for your use case. A better approach is to make the claims granular, with one or more closely related claims per ACDC that do not need to be selectively disclosed, and then just use full disclosure. Then they are already SAIDified. Historically, selective disclosure is a way to granularize VC-ified legacy credentials like driver's licenses that mix functional attributes with forensic attributes such as authenticators, and so need selective disclosure to separate out the forensic attributes from the functional ones. But where all your use cases are self-issued or individualized ACDCs, there is no need to mix functional attributes with forensic ones. They can each reside in their own VC and be “effectively” selectively disclosed by being chained together when they need to be disclosed and not chained when they don’t. The chained ones are then partially disclosed in order to get contractual protection and then fully disclosed, and the unchained ones never show up.

SmithSamuelM commented 1 year ago

@jasoncolburne I am trying to understand the exploit but some things you said are confusing to me. When you say:

> For example, how do we know which SAID in the blinded array corresponds to age and which to legalName?

Clearly when a block is actually presented (disclosed) the label of the field inside the block will either be age or legalName. So the verifier will be able to verify that a given disclosed block corresponds to a specific SAID in the blinded array. So what the other undisclosed blocks hold is immaterial to the verifier. Either the verifier gets a verified block with the labels they need or it doesn’t. So I frankly don’t understand the attack.

The purpose of selective disclosure is not to expose any correlatable information about the undisclosed attributes. The labels themselves in concert with the associated blinded SAID provide correlatable information. It's no longer fully blinded. Not being able to match up a given schema block to a given blinded SAID is a feature.

So as far as I can tell, the exploit is that a given Issuer can issue a set of two or more blocks, where each block in the set shares the same subschema in that the field labels are the same but the values of each block in the set are different. A schema for such a set can hide the fact that there is such a set by not including repeated identical subschemas. However, in order to verify, any presentation must use one of the blocks in the set that were used to create the blinded array in the first place. Is this a correct understanding of what you are saying is the attack? So what the issuer is hiding is that there is such a set of repeated blocks in the blinded array.

From the standpoint of why selective disclosure has been used historically in VCs, such an array of undetectable repeated blocks (for example using a ZKP for selective disclosure) would be a feature not a fault. The presenter gets to decide which block to present. Since all are valid issuances they are all valid presentations. The problem as I see it, is that a self-issuer or colluding malicious issuer may choose to make conflicting, inconsistent, statements about the issuee, and there is no way for the verifier to know that a given ACDC that uses selective disclosure is indeed making conflicting statements.

This is always true for selective disclosure mechanisms that truly make the selectively disclosed attributes uncorrelatable with the attributes not disclosed. An issuer can always issue conflicting or contradictory, i.e. duplicitous, attributes in a selectively disclosable format. The only way for a verifier to prove that there are no duplicitous attributes is to correlate all the attributes. Making the labels correlatable without disclosing the values is still correlation. But then if we want to make the labels correlatable, we are in a situation where selective disclosure is no longer selective, and we have a use case that contraindicates the use of selective disclosure. It is better, as I suggested above, to use granular ACDCs where all attributes must be disclosed and to use partial disclosure to hide those values until after contractual protection is in place.

By itself, an ACDC cannot ensure the veracity of the data it conveys. We must trust the issuer. If the Issuer does not have any counter-incentive to issuing inaccurate or unfounded or duplicitous statements, the ACDC itself does not prevent that. Only correlation across multiple presentations could expose duplicitous issuances by that issuer. So selective disclosure from an untrustable issuer is an anti-pattern.

SmithSamuelM commented 1 year ago

@jasoncolburne I think what you want to use instead of selective disclosure is a two-level partial disclosure with nested oneOfs in the schema. One of the problems I faced is that the term “selective disclosure” has a well-known technical meaning in the SSI community that is different from the English-language definition. So I picked the term partial disclosure to indicate a different “selective disclosure” mechanism. Nested partial disclosure enables disclosing field labels without disclosing field values. I can provide an example to be clear.

SmithSamuelM commented 1 year ago

Let me suggest this approach:

The fully disclosed data is structured as follows:

Full


"a": 
{
  "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "u": "0AB9VADfPtCQvFqp-u4BxUvy",
  "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY",
  "LegalName":
  {
    "d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
    "value": "Jason Colburne"
  },
  "age":
  {
    "d": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA",
    "u": "0ACbHEmnqeXUJFMf1G2Dj0BU",
    "value": 43
  }
}

The schema has two levels of oneOfs: the first for the whole attributes block a, and then a oneOf for each of the field blocks, so that the following 4 partial disclosures are all valid.

Least

"a": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F"

Issuee and field labels

"a": 
{
  "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "u": "0AB9VADfPtCQvFqp-u4BxUvy",
  "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY",
  "LegalName": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
  "age": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA"
}

LegalName but not Age

"a": 
{
  "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "u": "0AB9VADfPtCQvFqp-u4BxUvy",
  "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY",
  "LegalName":
  {
    "d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
    "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
    "value": "Jason Colburne"
  },
  "age": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA"
}

Age but not Legal Name

"a": 
{
  "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
  "u": "0AB9VADfPtCQvFqp-u4BxUvy",
  "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY",
  "LegalName": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
  "age":
  {
    "d": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA",
    "u": "0ACbHEmnqeXUJFMf1G2Dj0BU",
    "value": 43
  }
}

I think nested partial disclosure accomplishes your use case without any ambiguity in the schema. The field values of undisclosed attributes may remain blinded but the field labels are disclosed.

This pattern of nested partial disclosure of blinded sub-blocks may be repeated to any number of levels, where each level discloses the field labels of the next level down but does not disclose the field values, or partially discloses some of the field values while leaving the others blinded. The Issuer can’t play games with the schema because anyOf is not used, only oneOf.

A verifier in an IPEX exchange then just removes oneOfs in the schema in their Apply message to force partial disclosure of the fields they need at any given nested layer.

SmithSamuelM commented 1 year ago

The one complication of using nested oneOf schema operators is knowing how to compute the SAID of a given block. We always use the most compact version of the block.

See https://github.com/trustoverip/tswg-acdc-specification/issues/79
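As an illustration of "the most compact version of the block", the Python sketch below collapses every nested sub-block to its SAID. `compact` is a hypothetical helper handling one level of nesting; the actual SAID computation rules are the subject of the linked issue and are not reproduced here.

```python
def compact(block: dict) -> dict:
    """Replace each nested sub-block that carries a 'd' field with that SAID."""
    return {
        key: value["d"] if isinstance(value, dict) and "d" in value else value
        for key, value in block.items()
    }


# The "Full" disclosure from the example earlier in the thread.
full = {
    "d": "EJgDHAe0lS3dWPB7yT78O2d1xb_AuNecU8VMjykVTd4F",
    "u": "0AB9VADfPtCQvFqp-u4BxUvy",
    "i": "ENoxXSSTfy8FDryU0J0av3IdHKqAb6aYBu0fIT5fvqfY",
    "LegalName": {
        "d": "EGcJzlAaalMxlFSfs2DPB7Tx7n3D7EKAWSXJaheVf_-P",
        "u": "0ACsCxwKKCg0C9Hb7OX9ajbZ",
        "value": "Jason Colburne",
    },
    "age": {
        "d": "EOYM0KDlDPODdiUDL7Xp-XRjr9mif7Dv5ovMSGTRxyTA",
        "u": "0ACbHEmnqeXUJFMf1G2Dj0BU",
        "value": 43,
    },
}

# The "LegalName but not Age" variant: only one sub-block is expanded.
name_only = dict(full, age=full["age"]["d"])

# Both variants collapse to the same "Issuee and field labels" form.
print(compact(full) == compact(name_only))
print(compact(full)["LegalName"])
```

Because every partial disclosure compacts to the same block, they can all share one SAID.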

SmithSamuelM commented 1 year ago

As I suggested above, another way to accomplish “partial” disclosure is to use blinded edges in a graph of ACDCs.

jasoncolburne commented 1 year ago

Thanks for all your thought @SmithSamuelM and yes, I was forgetting that the whole reason to use selective disclosure is to prevent correlation - and nested partial disclosure is a great name, now that I understand your terminology. My mistake is that when it came time to implement I jumped to the end of the docs to understand the algorithms and data structures, and not the purpose/nomenclature. As a result, I thought the primary difference was that selective was composable and partial was all or nothing. In the end we may actually need to use a mix of selective and partial disclosure; our requirements are a moving target.

Maybe it's obvious from the fact that if correlation is denied one could present differing data to different parties, but I feel like others may make the assumption that selective disclosure disallows this kind of behaviour - specifically when the disclosed data is valid with respect to the same compact ACDC.

As we have noted, there is nothing forcing the issuer to create an ACDC that is still valid when fully disclosed (it may never be fully disclosed), and an adversary can construct data that can be presented and validated that is backed by two values for the same key - as long as the two values are never presented together. In some scenarios people may make system design assumptions like 'there is at most one non-revoked driving license ACDC issued to an individual from a motor vehicle authority', and because of this, they may assume that if they encounter an individual, that individual has at most one ACDC and thus at most one assigned legalName by a motor vehicle authority. But using the method I described, the individual can present two verifiable legal names. It just takes collusion with the right person employed at the motor vehicle authority.

I understand that breaking non-correlation with my original suggestion is a bad idea and goes against the intent of selective disclosure, as defined, entirely, but do you think it warrants some sort of warning or example that if one chooses to prevent correlation, it does become impossible to prevent this type of misrepresentation?

jasoncolburne commented 1 year ago

Sorry, I didn't mean I came up with nested partial disclosure; I just meant those were the same conclusions I came to when asking if we could label the SAIDs (to nest the SADs).

jasoncolburne commented 1 year ago

> A verifier in an IPEX exchange then just removes oneOfs in the schema in their Apply message to force partial disclosure of the fields they need at any given nested layer.

This is elegant

SmithSamuelM commented 1 year ago

@jasoncolburne

> I understand that breaking non-correlation with my original suggestion is a bad idea and goes against the intent of selective disclosure, as defined, entirely, but do you think it warrants some sort of warning or example that if one chooses to prevent correlation, it does become impossible to prevent this type of misrepresentation?

Yes, that would be a good caution to add. Historically, selective disclosure mechanisms have been the subject of discussion about how to prevent fraud against the verifier, given that lack of correlatability can make it easier for a presenter to share credentials without being detected. This means that some other mechanism must be added to disincentivize sharing of credentials. Hence my comment above that use of selective disclosure from untrusted issuers may be an anti-pattern.

SmithSamuelM commented 1 year ago

A valid use case for the selective disclosure mechanism in ACDC, in the case where the array of selectively disclosable values has repeated instances of the same sub-schema, is for static threshold proofs. This is a way to support multiple threshold proofs that do not require disclosure of the actual value, without correlating to the other elements of the threshold array. The primary use case is for a set of legal age thresholds. So, for example, instead of disclosing the actual age of a person, say 43, in order to satisfy a minimum legal age requirement, the age field means "at least", i.e. the age is >= the value in the age field. So an issuer would issue a selectively disclosable array of the following age blocks for someone age 43:

13, 18, 21

and for someone age 70 the following:

13, 18, 21, 55, 60, 65

The non-correlatability of the elements of the selectively disclosable array means a verifier checking, say, a social media age of 13 would not know that the presenter was some other age. So all the legal age tests that a given person satisfied could be included in the selectively disclosable array while minimizing the ability of verifiers to deduce the actual age.
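A minimal Python sketch of how an issuer might derive such an array. The threshold list is simply the one used in the example above, `threshold_blocks` is a hypothetical helper, and each block would additionally need its own SAID and nonce before being blinded.

```python
# Thresholds taken from the example above; a real issuer would choose its own.
LEGAL_AGE_THRESHOLDS = (13, 18, 21, 55, 60, 65)


def threshold_blocks(actual_age: int) -> list[dict]:
    """One 'age' block per threshold met, where 'age' means 'at least this old'."""
    return [{"age": t} for t in LEGAL_AGE_THRESHOLDS if actual_age >= t]


print([b["age"] for b in threshold_blocks(43)])  # [13, 18, 21]
print([b["age"] for b in threshold_blocks(70)])  # [13, 18, 21, 55, 60, 65]
```

Here the repeated `age` label is intentional: every block in the array is a true statement, and the presenter discloses only the one a verifier needs.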

One would have to trust the Issuer not to construct a malicious array. For a legal age requirement, that would be a government issuer, which would be the only legally trustable issuer.

Another application would be for a credential to have repeated values where each label is internationalized for a different language. Then a given presentation would not have to be verbose by exposing all the field labels in a nested partial disclosure, but would only disclose the one element in the array that was in the language of choice.

jasoncolburne commented 1 year ago

Great examples, thank you. We anticipate a mix of trusted and untrusted issuers and various use cases, so this will be helpful in framing as we decide on a case-by-case basis which method is appropriate.

For posterity here is a schema I made for nested partial disclosure that uses the examples in this thread (edges and rules omitted):

{
  "$id": "EECobeq1Pbe-BHZxovIg3ND8vkN183xdkTp0LG3OK9dm",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Nested Partial Disclosure",
  "description": "A demonstration of Nested Partial Disclosure",
  "credentialType": "Demonstration",
  "version": "1.0.0",
  "type": "object",
  "required": [
    "v",
    "d",
    "i",
    "s",
    "ri",
    "a"
  ],
  "properties": {
    "v": {
      "description": "Credential Version",
      "type": "string"
    },
    "d": {
      "description": "Credential SAID",
      "type": "string"
    },
    "u": {
      "description": "One time use nonce - optional",
      "type": "string"
    },
    "i": {
      "description": "Issuer AID",
      "type": "string"
    },
    "ri": {
      "description": "Credential Registry Identifier",
      "type": "string"
    },
    "s": {
      "description": "Schema SAID",
      "type": "string"
    },
    "a": {
      "oneOf": [
        {
          "description": "Attributes section SAID",
          "type": "string"
        },
        {
          "$id": "EGjNHiyvgu7Yc3HGbu4tKD3AAaiNWjiGd8eOM8ALlNuz",
          "description": "Attributes section",
          "type": "object",
          "required": [
            "d",
            "dt",
            "i",
            "u",
            "legalName",
            "age"
          ],
          "properties": {
            "d": {
              "description": "Attributes SAID",
              "type": "string"
            },
            "dt": {
              "description": "Date and time of issuance in ISO8601 format",
              "type": "string",
              "format": "date-time"
            },
            "i": {
              "description": "Issuee AID",
              "type": "string"
            },
            "u": {
              "description": "Salty Nonce",
              "type": "string"
            },
            "legalName": {
              "oneOf": [
                {
                  "description": "Blinded legal name SAID",
                  "type": "string"
                },
                {
                  "type": "object",
                  "required": [
                    "d",
                    "u",
                    "value"
                  ],
                  "properties": {
                    "d": {
                      "description": "SAID of disclosable data",
                      "type": "string"
                    },
                    "u": {
                      "description": "Salty nonce",
                      "type": "string"
                    },
                    "value": {
                      "description": "Unblinded legal name",
                      "type": "string"
                    }
                  },
                  "additionalProperties": false
                }
              ]
            },
            "age": {
              "oneOf": [
                {
                  "description": "Blinded age SAID",
                  "type": "string"
                },
                {
                  "type": "object",
                  "required": [
                    "d",
                    "u",
                    "value"
                  ],
                  "properties": {
                    "d": {
                      "description": "SAID of disclosable data",
                      "type": "string"
                    },
                    "u": {
                      "description": "Salty nonce",
                      "type": "string"
                    },
                    "value": {
                      "description": "Unblinded age",
                      "type": "string"
                    }
                  },
                  "additionalProperties": false
                }
              ]
            }
          },
          "additionalProperties": false
        }
      ]
    }
  },
  "additionalProperties": false
}
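As a rough illustration of how a presenter might use this schema, the sketch below collapses selected nested blocks to their SAID strings, so each field matches one branch of its `oneOf`. The function name `blind_fields` is hypothetical, and every SAID, nonce, and timestamp is a placeholder rather than a real value.

```python
def blind_fields(attributes, hidden):
    """Return a copy of the attributes section with the named fields blinded.

    A blinded field is replaced by the "d" value of its nested block, which
    matches the string branch of the field's oneOf in the schema above.
    """
    blinded = dict(attributes)
    for name in hidden:
        value = blinded.get(name)
        if isinstance(value, dict):
            blinded[name] = value["d"]
    return blinded


# Placeholder values only; none of these are real SAIDs, AIDs, or nonces.
attributes = {
    "d": "<attributes SAID>",
    "dt": "2024-01-01T00:00:00+00:00",
    "i": "<issuee AID>",
    "u": "<attributes nonce>",
    "legalName": {
        "d": "<legalName SAID>",
        "u": "<legalName nonce>",
        "value": "Jason Colburne",
    },
    # The schema above types the unblinded age value as a string.
    "age": {"d": "<age SAID>", "u": "<age nonce>", "value": "43"},
}

# Disclose the legal name but keep the age blinded.
presentation = blind_fields(attributes, ["age"])
```

This sketch assumes the parent section commits to the nested SAIDs rather than to the expanded blocks, so blinding a field would leave the parent's commitment verifiable; it does not compute or check any SAID itself.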