w3c / vc-bitstring-status-list

A privacy-preserving mechanism to publish status information for Verifiable Credentials.
https://w3c.github.io/vc-bitstring-status-list/

Unify design of status lists to support multi-bit values #73

Closed msporny closed 9 months ago

msporny commented 1 year ago

PR https://github.com/w3c/vc-status-list-2021/pull/65 added a backwards-compatible mechanism for binary status lists and multi-value status lists. During the discussion of the PR, multiple concerns were raised about ttl, about verification and validation and how the status list might affect those processes, about the data formats used, and about the differences between binary and multi-bit statuses. It was suggested that a re-alignment and unification of the design to support the use cases more generally would be useful. This issue tracks that design discussion.

msporny commented 1 year ago

Starting us off with a proposed unified design, example first:

{
  "id": "https://example.com/status/3#list",
  "type": "StatusList2021",
  "statusField": [{
    "statusBitOffset": 0,
    "statusPurpose": "revoked",
    "activeStatusMessage": "This credential has been revoked by the issuer.",
    "inactiveStatusMessage": "This credential has not been revoked by the issuer."
  }, {
    "statusBitOffset": 1, 
    "statusPurpose": "suspended",
    "activeStatusMessage": "This credential has been suspended by the issuer.",
    "inactiveStatusMessage": "This credential has not been suspended by the issuer."
  }, {
    "statusBitOffset": 7, 
    "statusPurpose": "recalled",
    "activeStatusMessage": "This credential has been recalled due to a health concern.",
    "inactiveStatusMessage": "This credential has not been recalled."
  },
      ...
  ],
  "encodedList": "H4sIAAAAAAAAA-3BMQEAAADAAAAAAAAAAAAAAAAAAAAAAAAAAAIC3AYbSVKsAQAAA"
}
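
A minimal sketch, in Python, of how a verifier might read per-purpose flags under this proposal. It assumes several things the example does not state: each credential occupies a fixed-width entry of (highest statusBitOffset + 1) bits, entries are packed back to back, and bit 0 is the most significant bit of the first byte. The function names are hypothetical, and the sketch works on an already decoded bitstring rather than on encodedList.

```python
# Hypothetical field table mirroring the statusField entries in the example.
STATUS_FIELDS = [
    {"statusBitOffset": 0, "statusPurpose": "revoked"},
    {"statusBitOffset": 1, "statusPurpose": "suspended"},
    {"statusBitOffset": 7, "statusPurpose": "recalled"},
]


def get_bit(bitstring: bytes, position: int) -> bool:
    """Return the bit at `position`, counting from the most significant
    bit of byte 0 (an assumed bit order)."""
    byte_index, bit_in_byte = divmod(position, 8)
    return bool((bitstring[byte_index] >> (7 - bit_in_byte)) & 1)


def read_entry(bitstring: bytes, entry_index: int, fields=STATUS_FIELDS) -> dict:
    """Map each statusPurpose to True/False for one credential's entry."""
    # Assumed entry width: highest offset plus one (8 bits for the example).
    entry_size = max(f["statusBitOffset"] for f in fields) + 1
    base = entry_index * entry_size
    return {
        f["statusPurpose"]: get_bit(bitstring, base + f["statusBitOffset"])
        for f in fields
    }


# Entry 0 has no bits set; entry 1 has offsets 1 and 7 set (0b01000001).
decoded = bytes([0b00000000, 0b01000001])
print(read_entry(decoded, 1))
# {'revoked': False, 'suspended': True, 'recalled': True}
```

Offsets 2 through 6 are simply never read here, which is how a gap in the offsets leaves fields undefined.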

Questions that are left unanswered:

Thoughts on this unified approach, @mprorock, @OR13, @dlongley?

OR13 commented 1 year ago

Why use literal strings for statusPurpose?

Seems like a URL would be better, and you could put the active and inactive language at the resource the URL dereferences too.

Let's get rid of all the sugar and add back only what needs to be understood by a verifier.

msporny commented 1 year ago

Why use literal strings for statusPurpose? Seems like a URL would be better, and you could put the active and inactive language at the resource the URL dereferences too.

Yes, I had considered that approach, but felt folks might reject the direction as being "too Linked Data-y". If we go that route, would there be standard URLs for revocation and suspension? (I'd presume so).

The messages might be useful for verifier platforms, if they wanted to elaborate on all of the statuses associated with a credential. I thought traceability had a need for that; if not, we might want to simplify here.

mprorock commented 1 year ago

some inline comments below

  • Establishes a new statusField property, which contains the information for each field in the bitstring associated with each entry (including messages -- this is somewhat analogous to statusMessages in PR Support for multiple status codes #65).
  • Moves statusPurpose into statusField.

This is just naming - easy enough for implementers (ourselves and others) to adjust naming if the WG goes this direction.

  • Moves away from hexadecimal notation for status bit position to just a plain integer offset (so we can avoid hex to integer conversion... since we're trying to get to an integer bit offset anyway).

that seems odd to me, we are looking at a bit backed structure and hex is a pretty normal way of saying these bits mean x

  • Adds messages for active and inactive bits.

cool

  • Removes the size parameter, as it's not needed (you just pick the max statusBitOffset and add one to its value to get the size).

definite size parameter feels like the right way rather than looking inside, looking for the largest number, and then adding one

  • Removes reference; do we need it anymore if there is a description for every field?

yes - we likely want to be able to point directly back to regulatory references etc

  • Removes ttl; can we get away with just validUntil?

no - ttl refers to the client (or cdn) question of "how often do i have to get a fresh copy of this thing?" - validUntil is when a new credential must be issued by the issuer

  • Removes the need for undefined fields... just leave a gap in your bit offsets if you want to leave fields undefined.

a lot of this would be much easier/better in CBOR

dlongley commented 1 year ago

@mprorock,

no - ttl refers to the client (or cdn) question of "how often do i have to get a fresh copy of this thing?" - validUntil is when a new credential must be issued by the issuer

My understanding is that the goal of ttl is to enable issuers to avoid having to reissue status list VCs more frequently. If I understand the aim here, I don't think it actually works in practice.

If you sign a VC with a validUntil of sometime "next year" and a TTL of 5 minutes, you are indicating that some verifier somewhere may accept whatever was in that VC (with your signature on it) anytime between when that VC was first accessible and "next year". The TTL doesn't change that; it isn't helpful in this regard and there is no reason why the verifier couldn't just ignore it.

Additionally, consider one of the use cases for StatusList2021, where a holder fetches a status list VC and presents it along with whatever other VC is in the list it describes (in order to avoid having the verifier even contact the issuer at all). That holder can completely ignore the TTL and the verifier will be receiving it "within the TTL" when the holder presents it. There's no signature over when the holder requested it that could be checked, etc. Even if there were -- this would upend the goal of ttl by requiring more frequent signatures again.

In this scenario, an issuer might be thinking that if they always set a ttl of 5 minutes in their status list VCs then they will only have to wait 5 minutes after they publish a new status list VC to ensure it's the one all the verifiers use. They would be mistaken. This is based on an erroneous understanding of the meaning of the data and how the data and the various decentralized systems interacting with it are partitioned.

So, right now, I don't think ttl achieves its goal and therefore ends up adding both unnecessary complexity and a potential violation of expectations.

An issuer should instead decide on a validity period they are willing to commit to (maybe it's 5 minutes, maybe 15 -- maybe 24 hours -- maybe more, their call) and reissue the status list VC with an updated validUntil either on demand or on schedule. Then a verifier just checks validUntil per usual and can safely cache the status list VC as long as it's valid.
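
A rough sketch of the verifier side of this, in Python. The class name and the `fetch` callable are hypothetical, and the sketch assumes `fetch` returns a status list VC whose proof has already been verified. It ignores ttl entirely and caches purely on validUntil.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as 2021-04-05T14:27:40Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def still_valid(vc: dict, now: datetime) -> bool:
    """True while `now` is before the VC's validUntil."""
    valid_until = vc.get("validUntil")
    # Assumption: a VC without validUntil is treated as not cacheable.
    return valid_until is not None and now < parse_timestamp(valid_until)


class StatusListCache:
    """Serve a cached status list VC until its validUntil passes."""

    def __init__(self, fetch):
        self._fetch = fetch  # hypothetical callable: url -> verified VC dict
        self._entries = {}

    def get(self, url: str, now: datetime) -> dict:
        cached = self._entries.get(url)
        if cached is not None and still_valid(cached, now):
            return cached
        # Proof and validity checks on the fresh copy are left to `fetch`.
        fresh = self._fetch(url)
        self._entries[url] = fresh
        return fresh


# Usage with a stand-in fetcher that always returns the same VC.
def fetch_stub(url):
    return {"id": url, "validUntil": "2024-01-01T12:05:00Z"}


cache = StatusListCache(fetch_stub)
noon = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
vc = cache.get("https://example.com/credentials/status/3", noon)
later = noon + timedelta(minutes=4)
assert cache.get("https://example.com/credentials/status/3", later) is vc
```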

So, as an issuer, if you want to allow your status list VCs to expire every 5 minutes, issue a new one if any VC status changes (understanding that the status won't be consistent until after the old status list VC expires) and auto-reissue when a status list VC is requested if it has expired, bumping validUntil 5 minutes into the future each time.
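
The issuer behaviour described above could be sketched roughly as follows, in Python. The class, its method names, and the `sign` callable are all hypothetical. As a simplification, reissuing after a status change is deferred until the next request rather than done at the moment of the change. Timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

VALIDITY_PERIOD = timedelta(minutes=5)  # the issuer's call


def format_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as e.g. 2024-01-01T12:05:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


class StatusListPublisher:
    """Reissue the status list VC on change or on expiry."""

    def __init__(self, encoded_list: str, sign):
        self._encoded_list = encoded_list
        self._sign = sign  # hypothetical callable: unsigned VC -> signed VC
        self._changed = True  # nothing has been issued yet
        self._valid_until = None
        self._current = None

    def update_list(self, encoded_list: str) -> None:
        """Record a status change; the next request gets a new VC."""
        self._encoded_list = encoded_list
        self._changed = True

    def get(self, now: datetime) -> dict:
        expired = self._valid_until is None or now >= self._valid_until
        if self._changed or expired:
            # Bump validUntil one validity period into the future.
            self._valid_until = now + VALIDITY_PERIOD
            unsigned = {
                "validFrom": format_timestamp(now),
                "validUntil": format_timestamp(self._valid_until),
                "credentialSubject": {"encodedList": self._encoded_list},
            }
            self._current = self._sign(unsigned)
            self._changed = False
        return self._current
```

Note that a verifier holding the previous VC may keep using it until that VC's own validUntil, which is the consistency window mentioned above.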

Regarding privacy-preserving but consistent status tracking for newly issued VCs, do not initialize your status list bits as scrambled, but assign from a pseudo-randomly selected index or slice of the list. If some kind of scrambling or other privacy-preserving setting of the bits is still somehow required, ensure that there are always some unallocated areas of the list that match what a newly issued VC status would be -- and choose from those sections of the list for any new VC, simultaneously preparing any other unallocated areas for the next one or more VCs to be issued in the future. In other words, ensure that over the next "validity period", there is always enough "prepared, unallocated space" for the maximum number of VCs you could issue over that period.
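
A small sketch of the first suggestion, assigning each new VC a pseudo-randomly selected unallocated index while leaving the list bits unscrambled. It is written in Python; the class name is hypothetical, and persistence and concurrency are ignored.

```python
import secrets


class IndexAllocator:
    """Hand out unused status list indexes in unpredictable order."""

    def __init__(self, list_size: int):
        # Every index starts out unallocated; its bit stays at the default.
        self._free = list(range(list_size))

    def remaining(self) -> int:
        return len(self._free)

    def allocate(self) -> int:
        if not self._free:
            raise RuntimeError("status list is full")
        pick = secrets.randbelow(len(self._free))
        # Swap the chosen slot to the end so removal is O(1).
        self._free[pick], self._free[-1] = self._free[-1], self._free[pick]
        return self._free.pop()


allocator = IndexAllocator(131072)
status_list_index = allocator.allocate()
```

An issuer could compare `remaining()` against the maximum number of VCs it expects to issue over the next validity period, and start a new list before the free space runs out.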

samuelmr commented 10 months ago

The Working Draft links to this issue in a note that seeks feedback.

I read #65 and #47 and wondered if the following syntax has been suggested or considered:

{
  "@context": [
    "https://www.w3.org/ns/credentials/v2"
  ],
  "id": "https://example.com/credentials/status/3",
  "type": ["VerifiableCredential", "BitstringStatusListCredential"],
  "issuer": "did:example:12345",
  "validFrom": "2021-04-05T14:27:40Z",
  "credentialSubject": {
    "id": "https://example.com/status/3#list",
    "type": "BitstringStatusList",
    "status": [
      {
        "statusPurpose": "revocation",
        "encodedList": "H4sIAAAAAAAAA-3BMQEAAADCoPVPbQwfoAAAAAAAAAAAAAAAAAAAAIC3AYbSVKsAQAAA"
      },
      {
        "statusPurpose": "suspension",
        "encodedList": "H4sIAAAAAAAAA-3BMQEAAADCoPVPbQwfoAAAAAAAAAAAAAAAAAAAAIC3AYbSVKsAQAAA"
      },
      {
        "statusPurpose": "pending_review",
        "encodedList": "H4sIAAAAAAAAA-3BMQEAAADCoPVPbQwfoAAAAAAAAAAAAAAAAAAAAIC3AYbSVKsAQAAA"
      }
    ]
  },
  "proof": { ... }
}

IMHO, that would be an easy way for the verifier to get a bunch of status values about a credential. After parsing, they might end up with something like:

{
  "id": "https://example.com/credentials/status/3#94567",
  "revocation": false,
  "suspension": false,
  "pending_review": true
}

That should be relatively easy to act upon.
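
A hedged sketch of that parsing step, in Python. It assumes each encodedList is a GZIP-compressed bitstring in unpadded base64url, and that index 0 is the most significant bit of the first byte. All function names are hypothetical. The sketch builds its own lists rather than decoding the strings in the example above.

```python
import base64
import gzip


def encode_list(bitstring: bytes) -> str:
    """GZIP-compress a bitstring and base64url-encode it without padding."""
    compressed = gzip.compress(bitstring)
    return base64.urlsafe_b64encode(compressed).rstrip(b"=").decode("ascii")


def decode_list(encoded: str) -> bytes:
    """Reverse encode_list, restoring any stripped base64 padding."""
    padded = encoded + "=" * (-len(encoded) % 4)
    return gzip.decompress(base64.urlsafe_b64decode(padded))


def bit_is_set(bitstring: bytes, index: int) -> bool:
    byte_index, bit_in_byte = divmod(index, 8)
    return bool((bitstring[byte_index] >> (7 - bit_in_byte)) & 1)


def read_statuses(subject: dict, index: int, purposes: list) -> dict:
    """Look up the same index in every list whose purpose is requested."""
    result = {}
    for item in subject["status"]:
        if item["statusPurpose"] in purposes:
            bits = decode_list(item["encodedList"])
            result[item["statusPurpose"]] = bit_is_set(bits, index)
    return result


def make_list(size_bits: int, set_indexes=()) -> str:
    """Build an encoded list with the given indexes set (demo helper)."""
    bits = bytearray(size_bits // 8)
    for i in set_indexes:
        bits[i // 8] |= 1 << (7 - i % 8)
    return encode_list(bytes(bits))


subject = {
    "status": [
        {"statusPurpose": "revocation", "encodedList": make_list(131072)},
        {"statusPurpose": "suspension", "encodedList": make_list(131072)},
        {"statusPurpose": "pending_review",
         "encodedList": make_list(131072, [94567])},
    ]
}
wanted = ["revocation", "suspension", "pending_review"]
print(read_statuses(subject, 94567, wanted))
# {'revocation': False, 'suspension': False, 'pending_review': True}
```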

The statusPurpose of a VC should be changed to an array:

{
  "@context": [
    "https://www.w3.org/ns/credentials/v2"
  ],
  "id": "https://example.com/credentials/23894672394",
  "type": ["VerifiableCredential"],
  "issuer": "did:example:12345",
  "validFrom": "2021-04-05T14:27:42Z",
  "credentialStatus": [
    {
      "id": "https://example.com/credentials/status/3#94567",
      "type": "BitstringStatusListEntry",
      "statusPurpose": ["revocation", "suspension", "pending_review"],
      "statusListIndex": "94567",
      "statusListCredential": "https://example.com/credentials/status/3"
    }
  ],
  "credentialSubject": {
    "id": "did:example:6789",
    "type": "Person"
  },
  "proof": { ... }
}

msporny commented 9 months ago

@samuelmr wrote:

I read #65 and #47 and wondered if the following syntax has been suggested or considered:

Yes, we had considered that, but ended up with the following (more composable) construct in the spec today:

https://github.com/w3c/vc-bitstring-status-list/issues/37#issuecomment-1416348418

The statusPurpose of a VC should be changed to an array

That would be an incorrect way to model the data. Effectively, what you're proposing would say that the bit that handles "revocation", "suspension", and "pending_review" is bit 94567, which would be wrong (because flipping the bit would flip all 3 states). An alternate reading that presumes bit offsets based on the three items listed would prevent bit packing, which was a requirement from a subset of the group.

I believe the current specification, including the pending PRs, will enable the use case you were attempting to address in a more composable way, and in a way that supports bitpacking.

msporny commented 9 months ago

At present, there doesn't seem to be enough desire to rewrite/unify the design to support multibit values. The current specification achieves all of the use cases outlined and is "good enough" to enter Candidate Recommendation.

I'm marking this issue as "pending 7 day close". Feel free to object to closing this issue during that timeframe.

samuelmr commented 9 months ago

Effectively, what you're proposing would say that the bit that handles "revocation", "suspension", and "pending_review" is bit 94567, which would be wrong (because flipping the bit would flip all 3 states).

In my status list credential example there are three lists with different statusPurposes. Flipping a bit on one list wouldn't touch other lists.

I believe the current specification, including the pending PRs, will enable the use case you were attempting to address in a more composable way, and in a way that supports bitpacking.

You mentioned in https://github.com/w3c/vc-bitstring-status-list/issues/47#issuecomment-1872390915 that one can already today "[a]ssociate an array with credentialStatus, where each object of the array contains a different purpose". This was what I was suggesting but didn't see in the latest published spec.

IMHO, using bytes instead of bits adds complexity. For example, a library implementing this spec will have a validity-check API that returns either boolean or a string, depending on the statusPurpose. A user of that library (e.g., a verifier) has to be prepared for any status string values, perhaps dynamically querying a status-dictionary.

If the spec only had bit lists for different statusPurposes, the API would always return a boolean and the users would only need to list the statusPurposes they care about in their implementation.

Having said that, this is just friendly feedback and I don't object to closing this issue.

msporny commented 9 months ago

@samuelmr wrote:

Having said that, this is just friendly feedback and I don't object to closing this issue.

We appreciate the feedback... more thoughts below. The most important question I have for you is this:

Are we going to prevent you from implementing a particular use case you are interested in if we don't make the changes you are suggesting? If the answer is "yes", then please say so and we'll do our best to address your use case.

Please keep that in mind while reading the rest of the response:

In my status list credential example there are three lists with different statusPurposes. Flipping a bit on one list wouldn't touch other lists.

Ah! I think I see what you're saying now. In your markup here:

{
  "id": "https://example.com/credentials/status/3#94567",
  "type": "BitstringStatusListEntry",
  "statusPurpose": ["revocation", "suspension", "pending_review"],
  "statusListIndex": "94567",
  "statusListCredential": "https://example.com/credentials/status/3"
}

... you have one single statusListIndex value. I guess what you are saying is that the offset in each list is the same for different purposes? You download the status list credential and then match the VC's credentialStatus.statusPurpose field up with the list's credentialSubject.status.statusPurpose field. Yes, that would work. I'd argue that we can already do something equivalent w/ the examples added to the spec below (you could also publish multiple credentialSubject values to achieve an equivalent to what you say above).

The downside with these "more compact" approaches is implementation complexity (though, arguably, the implementation complexity isn't that terrible). I'll try to get feedback from the group on these "more advanced" markup examples. The current set of implementers might be willing to implement these features, or they might not be willing... I'll see what they say.

You mentioned in #47 (comment) that one can already today "[a]ssociate an array with credentialStatus, where each object of the array contains a different purpose". This was what I was suggesting but didn't see in the latest published spec.

Yes, it's missing an example that shows one how to do it. That example exists here:

https://github.com/w3c/vc-bitstring-status-list/issues/37#issuecomment-1416348418

I've just raised PR #122 to put that example in the specification so that it's more clear to people how they can utilize that facility. We'll elaborate upon the example more during the Candidate Recommendation phase.

IMHO, using bytes instead of bits adds complexity.

Yes, agreed. To be clear, I am (personally) not fond of the feature, but there was a subset of the group that was likely going to formally object to the feature not going into the specification.

The strongest argument for the bytes approach is bitpacking LOTS of possible status values into a single byte (you can have up to 255 status values per 8 bits vs. only 8 status values if you treat each bit as independent from the others). There is, however, certainly a trade-off wrt. status information vs. privacy: the more status information you publish about a particular entity, the more you can correlate the behaviour of that individual (or the population).
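
To make the packing arithmetic concrete, here is a small Python sketch of reading a multi-bit status value. The function name and the `status_size` parameter are assumptions for illustration, as is the most-significant-bit-first ordering. With 8 bits per entry there are 2**8 = 256 distinct bit patterns, i.e. 255 besides all-zero.

```python
def read_status_value(bitstring: bytes, index: int, status_size: int) -> int:
    """Read the unsigned integer stored in the `status_size` bits
    that belong to entry `index`."""
    value = 0
    start = index * status_size
    for position in range(start, start + status_size):
        byte_index, bit_in_byte = divmod(position, 8)
        bit = (bitstring[byte_index] >> (7 - bit_in_byte)) & 1
        value = (value << 1) | bit
    return value


packed = bytes([0b00000000, 0b10010110])

# One 8-bit value per entry: entry 1 is the whole second byte.
print(read_status_value(packed, 1, 8))  # 150

# Two bits per entry: the second byte holds entries 4..7 (10, 01, 01, 10).
print([read_status_value(packed, i, 2) for i in range(4, 8)])  # [2, 1, 1, 2]
```

With `status_size` set to 1 the same function degenerates to the plain one-bit-per-credential lookup.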

So, while I agree with your complexity argument, a subset of implementers disagree and desire the feature. The best we can probably do is say that implementers do not need to implement the statusMessage feature in order to have a compliant processor (the feature is optional). I have raised PR #123 to address your concern.

msporny commented 9 months ago

PRs #122 and #123 have been raised to address the remaining concerns in this issue. This issue will be closed once PRs #122 and #123 have been merged.

TallTed commented 9 months ago

https://github.com/w3c/vc-bitstring-status-list/issues/73#issuecomment-1872595726 says

@msporny wrote:

which should be

@samuelmr wrote:

msporny commented 9 months ago

@TallTed fixed, thx.

msporny commented 9 months ago

PRs #122 and #123 have been merged, closing.