w3c / vc-data-model

W3C Verifiable Credentials Working Group — VC Data Model and Representations specification
https://w3c.github.io/vc-data-model/

Make the usage of `@context` optional #947

Closed · msporny closed this issue 1 year ago

msporny commented 1 year ago

It has been suggested that the use of @context should be made optional. This would be a normative change to the current specification, as the use of @context is required in both JSON and JSON-LD. This issue is to discuss if this is a good idea and the mechanism(s) that could be used to accomplish the task.

dlongley commented 1 year ago

Since making @context optional is not ideal for interop (there being more than one way to do things), there must be good arguments for doing so.

First, in my view, any argument that makes a VC look and function just like a JWT (and uses the same centralized registry extensibility mechanism) is not a good argument -- because the JWT spec already exists and people can just use a JWT if that's what they really want. Instead, we should assume that some additional constraints will be needed in order for people to achieve interop with VCs -- especially given the three-party model, data linkability / mergeability, and level of openness that they intentionally introduce.

Second, if an argument is made that having to process @context is so harmful that an option to avoid it ought to override interop concerns, the argument needs to be very clear about what "processing @context" means -- so that the assertion can be commonly understood before further analysis and discussion. There may be ways to resolve processing concerns -- but that should be discussed over in #948.

melvincarvalho commented 1 year ago

I was led here by the interesting post by @decentralgabe

The system being designed is of a decentralized nature

This comment is observational in nature rather than a solution or proposal. I would like to record it, but there isn't an expectation of a response here, so do not feel the need to reply unless you really want to.

It strikes me that the verifiable claims data model, if @context is mandated, introduces some hard dependencies:

  * https protocol
  * w3.org domain
  * https://www.w3.org/2018/credentials/v1 web page

There seem to be secondary soft dependencies within that web page:

  * w3id federation
  * w3id.org domain
  * w3id.org mapping service
  * github.com mapping registry

Additionally, linked vocabs may depend on:

  * http (no S) vocabs, in particular xsd, dcterms
  * security vocab (of which there is v1 and v2)
  * cred vocab

That's all fine, and I appreciate this is versioned by a specific year (2018) and number (1.1) and expected to be set in stone, assuming the vocabs are well maintained, which I expect they will be.

However, adding this @context may be seen to introduce single points of failure for systems that wish to go the extra mile with decentralization and create something even more decentralized than https itself. Playing devil's advocate: what if one day w3.org gets hacked? What if w3id.org gets a bad actor? What if GitHub stops redirecting 'bad' vocabs? Not all of these things are in the verification critical path, but some parts might be annoying enough to want to future-proof them.

What would the solution be for systems that would like to use the ability to verify claims, but not necessarily download and cache the @context from a webpage? Hashlinks everywhere, perhaps, though I'm unsure that fixes everything.

I don't have a solution here; this is more an observation. But I can picture dweb implementers complying with 90% of the spec but leaving out parts they don't like. Which might be the reality today. Maybe that's OK.

Any advice for projects in the decentralized web looking at this technology?

msporny commented 1 year ago

@melvincarvalho wrote:

Any advice for projects in the decentralized web looking at this technology?

Yes, treat the values in @context as strings with known semantics, don't download them, don't cache them, hardcode against them. Done. :)

We had contemplated all that you say above in v1.0 and v1.1 and that is why the specification provides the entirety of the content and a hash for the base VC context. https://w3c.github.io/vc-data-model/#base-context -- and we'll continue to do that. Also keep in mind that the "dependencies" you call out are not required for operation of the system -- if you can resolve them, great, but don't build your PRODUCTION software such that you have to depend on resolving them. I'll address your concerns one by one below:

https protocol

You can perma-cache JSON-LD Contexts; ship your production software w/ static copies of all contexts if you're doing JSON-LD processing. If you're not doing JSON-LD processing, you can just check that each value in @context matches your expectation (ensuring that each of them uses @protected).
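For the string-matching case, a minimal sketch in TypeScript (the allow-list entries beyond the base context are illustrative; a real deployment would hardcode whatever contexts it has vetted offline):

```typescript
// JSON-only consumption: treat @context values as opaque strings.
// No fetching, no runtime caching, no JSON-LD processing.
const KNOWN_CONTEXTS = new Set([
  "https://www.w3.org/2018/credentials/v1", // base context, shipped with the app
  "https://www.w3.org/2018/credentials/examples/v1", // illustrative extension
]);

function hasOnlyKnownContexts(credential: { "@context"?: unknown }): boolean {
  const ctx = credential["@context"];
  // Per the v1.1 data model, the base context must be the first entry.
  if (!Array.isArray(ctx) || ctx[0] !== "https://www.w3.org/2018/credentials/v1") {
    return false;
  }
  // Policy: accept only contexts this verifier has reviewed offline.
  return ctx.every((c) => typeof c === "string" && KNOWN_CONTEXTS.has(c));
}
```

Anything that fails the check is rejected rather than resolved over the network.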

w3.org domain

Again, you don't have to download anything from there. Even if w3.org were to go away, it's backed up in Internet Archive and the US Library of Congress. There is an archival policy that would require multiple nation states to fail for the domain and its content to disappear completely.

https://www.w3.org/2018/credentials/v1 web page

Same as above.

w3id federation / w3id.org domain / w3id.org mapping service / github.com mapping registry

None of these are required to exist forever. That said, for these to go away, multiple companies and people would have to fail at the same time: Microsoft would have to take down GitHub, and all cloned w3id.org repos on hundreds of people's systems would have to simultaneously disappear. The greatest single risk is probably the domain disappearing (which is pre-paid up to a decade in advance). We have redundancies (listed on the public web page) when it comes to domain management for w3id.org.

http (no S) vocabs, in particular xsd, dcterms / security vocab (of which there is v1 and v2) / cred vocab

As long as the w3id.org service or w3.org is available, we expect those vocabulary documents to be reachable. If the vocabulary documents are not reachable, nothing breaks (except possibly semantic reasoners, of which there are zero operating today... and if there were, we'd expect them to download and perma-cache the vocabulary documents in a production setting).

For those reasons, I don't believe your concerns are as dire as you might believe them to be. Can we do better? Yes, of course, we want to... but we're getting to the point where large companies (Microsoft / Github / Rackspace / Internet Archive / US Library of Congress / Arctic Code Vault) have to simultaneously fail for us to be put in a sticky situation... and that, after we've told people for YEARS that they should not load from the network in production.

Does that address most of your concerns, @melvincarvalho ?

melvincarvalho commented 1 year ago

Does that address most of your concerns, @melvincarvalho ?

Thanks for the detailed explanation. I think the trade-offs have been well expressed. There remain some nuances and mitigations, but they are probably off-topic for this thread at this point.

OR13 commented 1 year ago

I'm a strong -1 to making @context optional in VCDM.

In my opinion, it defeats the entire purpose of the VCDM.

If you want to sign or encrypt JSON or CBOR, there are excellent standards at IETF, which you should use.

We don't need to make @context optional here, because it's already optional there... and making it optional here destroys interoperability here, while you already have interoperability there.

iherman commented 1 year ago

The issue was discussed in a meeting on 2022-10-19

View the transcript

#### 3.1. Follow-up of the TPAC discussion on JSON/JSON-LD

**Manu Sporny:** The original issue came from TPAC, but it seemed as if there were two separate issues. _See github issue [vc-data-model#947](https://github.com/w3c/vc-data-model/issues/947)._

**Manu Sporny:** The first one requests that @context be made optional. … The second one is to enhance the developer experience of JSON-LD by limiting JSON-LD functionality. _See github issue [vc-data-model#948](https://github.com/w3c/vc-data-model/issues/948)._ … so that a JSON-only developer won't need to worry about JSON-LD processing.

> *Kristina Yasuda:* I think it was correct to break down the original TPAC tracking issue into two, but will point out that a lot of good conversation in that original issue, which is now closed, is not translated into the new issues, so I encourage folks to re-comment/re-engage.

**Manu Sporny:** The intention is that you won't need a JSON-LD processor to use @context. … This is done by adding a developer @context, but this should not be used in production.

**Brent Zundel:** Folks should read the original issue for background and good information.

selfissued commented 1 year ago

I'm obviously in favor of making @context optional to make lives easier for developers who don't need JSON-LD, being the author of the comment https://github.com/w3c/vc-data-model/issues/929#issuecomment-1267697526 that is the basis of this issue.

dlongley commented 1 year ago

@selfissued,

I'm obviously in favor of making @context optional to make lives easier for developers who don't need JSON-LD...

If that is the goal, it sounds like it might be solvable in other ways, which is the discussion in #948. Being specific about what is difficult should also be raised in that other issue so it can be more adequately discussed / addressed.

talltree commented 1 year ago

I have not had the bandwidth to participate more fully in this WG. However, since this particular issue received many months of discussion in the W3C DID WG — where it was ultimately decided that @context would not be required in the plain JSON representation of a DID document — I wanted to share some of those learnings here.

Although I am not a developer myself, in my discussions over the past few years with developers of various types of "verifiable credentials" (meaning the generic term as defined by the VC 1.0 spec: "A set of one or more claims made by an issuer that has authorship that can be cryptographically verified"), the feedback I have gathered is: while JSON-LD and the RDF graph model are one perfectly good solution to decentralized semantic interoperability, they are not the only one.

What I have seen in the wild is at least three other solutions:

  1. Out-of-band agreement to an external specification. This might actually be the most widely deployed solution in the wild today, where the semantics of a credential are pre-determined by application context or pre-agreement.
  2. JSON Schema or another well-known schema language. These have matured a great deal in the last decade.
  3. Labeled property graphs. I am nowhere near an expert on LPGs — I learned about them from Sam Smith — however I later realized the LPG model is what the OASIS XDI Technical Committee (of which @peacekeeper and I were co-chairs) ended up using for semantic interoperability.

None of these are abstract examples — all are in use today. For example, LPGs (expressed using JSON Schema) are the basis for semantic interoperability of the ACDC credential format used by GLEIF vLEI credentials, which are on a path to be adopted by over 70 financial services regulators around the world.

My point being: by all means use @context when a credential uses a JSON-LD representation. That's the mechanism JSON-LD defines to control semantic validation in its representation of the VC data model.

But if a credential uses a different representation, that representation should define its own mechanism to control semantic validation. I believe that's what @selfissued meant when he said, "Let JSON be JSON".

peacekeeper commented 1 year ago

many months of discussion in the W3C DID WG — where it was ultimately decided that @context would not be required in the plain JSON representation

Yes, but I would add here that the goal of the DID WG was to support multiple representations that could be losslessly converted! Whether or not the DID WG succeeded in that, or how complex that process has become, is another story.

If the VC WG ends up deciding that additional representations (such as JSON without @context, ACDC, ISO mDL, AnonCreds, etc.) should all be supported by the VCDM 2.0 specification, then the WG should also answer the question of whether interoperability and lossless conversion between the different representations is supported. If that isn't clearly explained in the specification, then simply saying that "@context is optional" will lead to divergence and confusion.

peacekeeper commented 1 year ago

If you want to sign or encrypt JSON or CBOR, there are excellent standards at IETF, which you should use. We don't need to make @context optional here, because it's already optional there...

That makes sense to me.

If someone wants to use plain JSON (without @context) as a data model, and use JWS for signing, why not just use good-old plain JWT/JWS/etc. which already exists, is mature, widely deployed, etc.?

What would a VC-without-@context actually look like? Does anyone have an example?

OR13 commented 1 year ago

Here are some examples:

Especially this one:

{
  "iss": "https://spec.smarthealth.cards/examples/issuer",
  "nbf": 1663010381.698,
  "vc": {
    "type": [
      "https://smarthealth.cards#health-card",
      "https://smarthealth.cards#immunization",
      "https://smarthealth.cards#covid19"
    ],
    "credentialSubject": {
      "fhirVersion": "4.0.1",
      "fhirBundle": {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [
          {
            "fullUrl": "resource:0",
            "resource": {
              "resourceType": "Patient",
              "name": [
                {
                  "family": "Anyperson",
                  "given": [
                    "John",
                    "B."
                  ]
                }
              ],
              "birthDate": "1951-01-20"
            }
          },
          {
            "fullUrl": "resource:1",
            "resource": {
              "resourceType": "Immunization",
              "status": "completed",
              "vaccineCode": {
                "coding": [
                  {
                    "system": "http://hl7.org/fhir/sid/cvx",
                    "code": "207"
                  }
                ]
              },
              "patient": {
                "reference": "resource:0"
              },
              "occurrenceDateTime": "2021-01-01",
              "performer": [
                {
                  "actor": {
                    "display": "ABC General Hospital"
                  }
                }
              ],
              "lotNumber": "0000001"
            }
          },
          {
            "fullUrl": "resource:2",
            "resource": {
              "resourceType": "Immunization",
              "status": "completed",
              "vaccineCode": {
                "coding": [
                  {
                    "system": "http://hl7.org/fhir/sid/cvx",
                    "code": "207"
                  }
                ]
              },
              "patient": {
                "reference": "resource:0"
              },
              "occurrenceDateTime": "2021-01-29",
              "performer": [
                {
                  "actor": {
                    "display": "ABC General Hospital"
                  }
                }
              ],
              "lotNumber": "0000007"
            }
          },
          {
            "fullUrl": "resource:3",
            "resource": {
              "resourceType": "Immunization",
              "status": "completed",
              "vaccineCode": {
                "coding": [
                  {
                    "system": "http://hl7.org/fhir/sid/cvx",
                    "code": "229"
                  }
                ]
              },
              "patient": {
                "reference": "resource:0"
              },
              "occurrenceDateTime": "2022-09-05",
              "performer": [
                {
                  "actor": {
                    "display": "ABC General Hospital"
                  }
                }
              ],
              "lotNumber": "0000001"
            }
          }
        ]
      }
    },
    "rid": "MKyCxh7p6uQ"
  }
}

^ note that this data is not a valid W3C Verifiable Credential...

but it looks a lot like one...

This kind of thing muddies the waters and makes the standards less useful, and should be discouraged because it creates confusion and is harmful... it would have been better for them to just use a vanilla JWS / JWE.

dlongley commented 1 year ago

I think that "saying there are a lot of other ways to do things out there in the wild" is not a good enough argument to make context optional -- it's an argument that will harm interoperability. Those other things "out there in the wild" aren't being standardized -- and if we say "you can pretty much do anything", then we're not standardizing anything here either.

I should add -- if any of those things "out in the wild" are already standardized and they solve someone's use case, then that person should go use them, not try to change what we're doing here to be the same thing with a different name.

talltree commented 1 year ago

@dlongley Your comment goes to the very heart of the fundamental question for this WG: Is the goal of the W3C VC 2.0 standard to be:

  1. The standard for JSON-LD-based digital credentials?
  2. The standard for multiple representations of digital credentials?

If the former, then simply bite the bullet and require all W3C-conformant digital credentials to use JSON-LD, period. But recognize that means there's no bridge with JWT/JWP, ISO mDL/mDOC, ToIP ACDC, AnonCreds, ICAO DTC, and potentially others coming out of the EUDI work.

If the latter, then the VC WG needs to tackle the issue of multiple representations of the VC Data Model.

talltree commented 1 year ago

If the VC WG ends up deciding that additional representations (such as JSON without @context, ACDC, ISO mDL, AnonCreds, etc.) should all be supported by the VCDM 2.0 specification, then the WG should also answer the question of whether interoperability and lossless conversion between the different representations is supported.

@peacekeeper Lossless conversion of DID documents was possible because DID documents are not required to be digitally signed. VCs must be digitally signed. So that means lossless conversion is not an option with VCs (crypto engineers please correct me if I'm wrong).

However, lossless conversion is not the only way to achieve semantic interoperability. The data in a claim is still the data no matter what representation is used to serialize it and what cryptographic algorithm is used to sign it. In my mind, the question is whether a machine can do the mapping of the data between the different representations.

Per my comment above, if W3C VC 2.0 is JSON-LD only, then that mapping is achieved entirely using JSON-LD contexts. If W3C VC 2.0 supports multiple representations, then it needs to specify how each representation supports mapping of claims to another representation.

peacekeeper commented 1 year ago

do the mapping of the data between the different representations.

@talltree I think that's what I meant by "lossless conversion". Sorry if I didn't make it clear. Of course signatures and proofs only work on the original representation, but the question is whether or not it should be possible to map the actual claim data and semantics between representations.

msporny commented 1 year ago

the goal of the DID WG was to support multiple representations that could be losslessly converted!

Yes, and in hindsight, that was a huge mistake. To this day, we only have two JSON-based representations that only differ in the use of @context. The WG wasted close to a year to get everything sorted out there only to end up with an overly complex data model that didn't result in any significant feature differentiation in implementations.

the VC WG needs to tackle the issue of multiple representations of the VC Data Model.

Defining multiple representations (as it pertains to supporting ISO mDL/mDOC, ToIP ACDC, AnonCreds, ICAO DTC, and potentially others coming out of the EUDI work) is not listed as "in scope" for the Verifiable Credentials 2.0 Working Group Charter. Those things are not Verifiable Credentials -- they are different technologies, with different data models, and different representations/serializations. It is possible to logically express the information in each of those technologies using the VC Data Model, but that is not what those other technologies have chosen to do.

All we can do is provide an example of how those other data models can be represented using the VC Data Model in a "logically equivalent" way. Calling those other things "W3C Verifiable Credentials" will only confuse the market... even calling them "verifiable credentials" is problematic and misleading, given the name and status of the W3C Verifiable Credentials specification.

nadalin commented 1 year ago

Neither is keeping only the existing data model, and the charter does not list multiple representations as out of scope.

mprorock commented 1 year ago

Yes, and in hindsight, that was a huge mistake. To this day, we only have two JSON-based representations that only differ in the use of @context. The WG wasted close to a year to get everything sorted out there only to end up with an overly complex data model that didn't result in any significant feature differentiation in implementations.

+1

Seriously opposed to making context optional. There are other paths here that can preserve the semantics of a VC / VP while allowing for easier developer onboarding, etc. Notably, #953 would simplify the developer side while ensuring that we keep semantics at the core data model level.

talltree commented 1 year ago

@mprorock I certainly understand your perspective — and that of many other members of the WG — that JSON-LD (and its underlying RDF graph model) is a fundamental component of the W3C VC standard and should not be compromised in any way.

I also understand the perspective of other members of the WG who feel that the W3C VC standard should support a plain JSON representation that does not use either JSON-LD or the RDF graph model because they wish to use other solutions to semantic interoperability such as JSON Schema.

So it strikes me that the WG is coming to a Great Divide. It can go down one path or the other. But not both. The sooner the WG makes a decision, the sooner these questions can finally be laid to rest.

melvincarvalho commented 1 year ago

Since making @context optional is not ideal for interop (there being more than one way to do things)

@dlongley It might be helpful if you could elaborate on the issue with interop. More than one way of doing what?

Given that @context in this spec is designed to be an immutable blob that never changes, downloaded and cached in, say, the parser: what is its function?

Does it, for example, identify the thing you are parsing as a verifiable claim, in cases where the parser didn't know that before? Could it instead be got from the @type?

Does it give the version of the VC being used (2018) to help the parser canonicalize and verify it? Possibly yes. However, there's only one version (i.e. URI) right now in the spec, so I don't see parsers getting confused.

1-2 sentences on what specifically the barriers to interop would be might help give background to those following.

There might be a compromise along the lines of: @context SHOULD be mandatory in this version of the spec, and in v.next it will be a MUST, as there will be two to choose from.

David-Chadwick commented 1 year ago

@talltree "because they wish to use other solutions to semantic interoperability such as JSON Schema." I do not see using JSON schema, whilst keeping a nominal @context property, are incompatible. They serve different purposes and both can work together. The solution to have the equivalent of an 'unregistered context' present for those who do not wish to process JSON-LD seems to be a perfect compromise to interworking.

peacekeeper commented 1 year ago

Those things are not Verifiable Credentials -- they are different technologies

I have to say, if I look at the example in https://github.com/w3c/vc-data-model/issues/947#issuecomment-1285489434, to me this looks like a good old JWT, not like a VC.

This example doesn't have the issuer and issuanceDate VC properties; instead it has the JWT equivalents.

It doesn't have a @context, and it doesn't have a proof, instead it does those things in JWT native ways.

Yes that example has type and credentialSubject, but does that by itself really create any kind of practical interoperability, if everything else in that JWT example is different from how things are done in VCs? Maybe it would be best for implementers to decide whether they want to use JWTs or use VCs, but to stop intermixing VC constructs and JWT constructs in complicated ways.

Perhaps It’s time to let VCs be VCs :)

lrosenthol commented 1 year ago

Most of this discussion about using @context seems to be around the base/default contexts - but that's not the point of having it. The use case, IMO, is to provide the equivalent of namespacing for JSON(-LD) when adding additional/custom information that may need to be processed/understood and/or validated downstream.

Without @context, I don't see a way to provide extensibility to a VC w/o the possibility of semantic conflict...
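To make the namespacing point concrete, here is an illustrative sketch (the extension IRIs and the "clearanceLevel" term are hypothetical): two credentials use the same JSON key, and only their contexts keep the meanings from colliding.

```typescript
// Both issuers use the JSON key "clearanceLevel", but each context maps
// it to a different, globally unambiguous IRI.
const credentialA = {
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    { clearanceLevel: "https://issuer-a.example/vocab#clearanceLevel" },
  ],
  // ... remaining VC properties ...
};

const credentialB = {
  "@context": [
    "https://www.w3.org/2018/credentials/v1",
    { clearanceLevel: "https://issuer-b.example/hr#clearanceLevel" },
  ],
  // ... remaining VC properties ...
};
// A JSON-LD processor expands each use of "clearanceLevel" to its own
// IRI; without @context, a consumer sees two identical keys with no way
// to tell the semantics apart.
```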

dlongley commented 1 year ago

@melvincarvalho,

It might be helpful if you could elaborate on the issue with interop. More than one way of doing what?

With respect to interoperability, it actually doesn't matter "what". Generally speaking, having more than one way to do something harms interoperability, so a good argument needs to be made that any optionality is worth it. Sometimes it is -- sometimes it isn't.

If @context becomes optional, then there are now two ways to do what it does -- which you were asking about. @context provides globally-unambiguous mappings for terms (aka term definitions) -- and not just for the core terms as @lrosenthol has explained above. If you don't provide those via @context, you need to provide them some other way. If you don't provide specific term definitions at all, you need to signal that you're opting out of globally-unambiguous terms so that choice can be distinguished from a mistake.

Given that @context in this spec is designed to be an immutable blob that never changes, and downloaded and cached, in, say, the parser. What is its function?

This is not accurate. You're mistaking the core context for the extensions; both use the @context field. That being said, specifying the core context in a VC provides self-describing core term definitions; acts as a well-known signal for people consuming VCs, so that they can identify whether the JSON they are looking at is intended to be a VC; enables linked data transforms, framing, and merging capabilities; and lets anyone who is just consuming linked data, without having read the VC spec at all, build knowledge graphs.

It also establishes and demonstrates the extensibility model. Today, every VC must have at least two contexts: the first is the core context, and the second provides the term definitions for the specific VC type.
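To illustrate the two-context pattern, here is a sketch along the lines of the spec's own university-degree example (abbreviated; the property values are illustrative):

```typescript
const exampleVc = {
  "@context": [
    "https://www.w3.org/2018/credentials/v1", // 1: core terms (issuer, credentialSubject, ...)
    "https://www.w3.org/2018/credentials/examples/v1", // 2: terms for this VC type
  ],
  type: ["VerifiableCredential", "UniversityDegreeCredential"],
  issuer: "https://example.edu/issuers/14",
  issuanceDate: "2010-01-01T19:23:24Z",
  credentialSubject: {
    id: "did:example:ebfeb1f712ebc6f1c276e12ec21",
    degree: { type: "BachelorDegree", name: "Bachelor of Science and Arts" },
  },
};
```

Here "UniversityDegreeCredential", "degree", and "BachelorDegree" are defined by the second (examples) context, not the core one.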

Could it be instead got from the @type?

No. This would not sufficiently provide term definitions -- and used in isolation would require types to be full URLs, disallowing idiomatic JSON typing. It would also fail to provide the other features mentioned above.

1-2 sentences on what specifically the barriers to interop would be, might help give a background to those following.

VCs require at least two contexts: the core context that defines the core terms, and a secondary context that defines the terms for the specific VC type. If @context is the extensibility mechanism in only some VCs and not others, then a second mechanism is required, harming interoperability if both mechanisms are not implemented by everyone. Additionally, if the separate formats cannot be mapped onto each other in all cases, then interoperability is further harmed.

OR13 commented 1 year ago

VCs require at least two contexts,

This is not true today, and it should not be true tomorrow.

You can't represent anything useful in v1.1 with only one context... but that does not need to be true of 2.0.

dlongley commented 1 year ago

@OR13,

This is not true today, and it should not be true tomorrow.

To clarify, if you want to use idiomatic JSON, this is a requirement today. It is true that you can instead use full URLs for your JSON keys and type values otherwise.

You can't represent anything useful in v1.1. with only 1 context.... but that does not need to be true of 2.0.

Yes, we can do all kinds of things in 2.0 (good or bad).

OR13 commented 1 year ago

Counter examples (from v1.1):

(second one is bugged in the demo), but the RDF is the same for both.

My point is that you can produce valid VCs and VPs that only have 1 context... not that these valid examples are "good" or "bad"... they are "spec legal"... and they are verifiable.

dlongley commented 1 year ago

@OR13,

I don't think we need to continue discussing that you can make either useless-but-legal VCs or non-idiomatic-JSON VCs in 1.1 when not adding a second context. That point has been readily conceded. However, I think it misses the forest for the trees. To make a useful or idiomatic-JSON VC in v1.1, you need a second context -- which is what mattered in my argument above.

OR13 commented 1 year ago

To make a useful or idiomatic-JSON VC in v1.1, you need a second context -- which is what mattered in my argument above.

You don't, though... you can inline the second context... and it's wise to do that in many cases where you don't know where processing will be done... because it has an impact on things like CORS, etc. There have been several long threads on this subject.

If the WG fails to include a vocabulary in the core context, expect everyone to recommend doing this:

https://github.com/OR13/did-jwk/blob/main/src/index.js#L158

If your software allows a user to define custom terms, you will be using @vocab... or you will be asking the user to write and host a JSON-LD context file... one of them is trivial; the other is not.

David-Chadwick commented 1 year ago

a much simpler solution is that every JSON-only implementor always includes just two @contexts, both defined in the v2 standard: the v2 standard context and the 'unregistered' context, which contains the @vocab.
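A sketch of what that compromise could look like (the base context URL and the inline @vocab stand-in below are assumptions; the actual v2 'unregistered' context had not been settled at this point in the thread):

```typescript
const jsonOnlyVc = {
  "@context": [
    "https://www.w3.org/ns/credentials/v2", // assumed v2 base context
    { "@vocab": "https://example.org/unregistered-terms#" }, // hypothetical 'unregistered' context
  ],
  type: ["VerifiableCredential", "EmployeeBadgeCredential"],
  credentialSubject: {
    // @vocab gives this otherwise-undefined term a default IRI:
    // https://example.org/unregistered-terms#badgeNumber
    badgeNumber: "A-1234",
  },
};
```

A JSON-only issuer emits these two entries verbatim; a JSON-LD consumer still gets globally unambiguous (if issuer-dependent) IRIs for every term.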

dlongley commented 1 year ago

@OR13,

I don't think your comment stands in contrast to mine... I added emphasis below:

To make a useful or idiomatic-JSON VC in v1.1, you need a second context -- which is what mattered in my argument above.

You don't, though... you can inline the second context

It's a different argument to talk about an inline context vs. a context referenced by URL; regardless, a second context is needed, which is what mattered in my argument above.

The rest of your comments -- please take to issue #953 because I think that's where that discussion is happening.

dlongley commented 1 year ago

@David-Chadwick,

I agree with your last comment, but please also make it in #953 because that's where that discussion is happening.

dhh1128 commented 1 year ago

I am totally okay with the comment that @peacekeeper made about letting VCs be VCs -- but this seems at odds with the big tent strategy that I heard @brentzundel espouse as the preferred agenda for VC 2.0. We can't have it both ways -- big tent and @context. So I guess I'm agreeing with @talltree 's comment about coming to a Great Divide.

msporny commented 1 year ago

We can't have it both ways -- big tent and @context. So I guess I'm agreeing with @talltree 's comment about coming to a Great Divide.

@talltree's binary proposal is a false choice; the premise is rejected. There are options being discussed in #948 and #953 that are more reasonable compromises. Variations of those proposals would:

  1. Enable JSON-only developers to digitally sign w/ VC-JWT, not have to create any JSON-LD Contexts, and perform zero JSON-LD processing. The downside being that what terms mean will have to be communicated out of band. Some view this as a feature, while others don't.
  2. Enable JSON-LD developers to process these files using standard tooling.

These proposals would allow JSON-only, JSON-LD, and VCs secured using JWTs and Data Integrity to co-exist without having to go through any sort of Great Divide, thus staying well within the bounds of @brentzundel's "big tent" statements made at W3C TPAC.

OR13 commented 1 year ago

So I guess I'm agreeing with @talltree 's comment about coming to a Great Divide.

It first formed years ago, when we tried to improve on JOSE / COSE and Authorization Servers as the center of the universe.

Yes, the divide exists, and should be preserved!

You want to sign arbitrary JSON / CBOR? Use JOSE / COSE; excellent existing standards exist at IETF!

You don't need any fancy zero-knowledge cryptography, or blockchains, or anything... and it's worked like this for years.

JOSE and COSE are the safest, longest used, most widely available standards for this use case (sign arbitrary unstructured data).

The W3C has no business reinventing them... and luckily we don't need to, because they already exist. Just this weekend I was using a JWS, and it worked amazingly well without being associated with anything related to the W3C.

You want to create signed semantic verifiable credentials that are interoperable and queryable as linked data or other graph formats?

Use W3C Verifiable Credentials.

Destroying the divide is not investing in diversity, it is investing in monoculture.

It's kicking over other people's sand castles (JOSE with JSON-LD) when you have your own working perfectly fine (JOSE with JSON).

I sense we are going to waste a tremendous amount of time trying to tell people to just use JWT / JOSE... and that makes me really sad... because they can already do that today, and in many cases, they already are.

talltree commented 1 year ago

@msporny: given that I'm not a developer, can you explain to me in layman's terms what problem it solves for a plain JSON document that doesn't use JSON-LD to include an @context statement? The feedback I hear constantly is that, while it technically "doesn't hurt" the document because you can still apply JSON-only processing, it at the same time doesn't make any sense — it feels wrong to developers to include a statement that indicates a JSON document uses JSON-LD when in fact it does not. What's worse, it means a relying party cannot use the @context statement to trigger JSON-LD processing because the document may not be JSON-LD.

What am I missing?

OR13 commented 1 year ago

@talltree If you want to sign JSON, you don't need the W3C's help, here is a library I use that signs JSON really well: https://www.npmjs.com/package/jose

What's worse, it means a relying party cannot use the @context statement to trigger JSON-LD processing because the document may not be JSON-LD.

This problem exists already and is solved for in other proposals.

talltree commented 1 year ago

@OR13, you didn't answer my question: why is it so important for a non-JSON-LD document to include an @context statement if it does not actually use JSON-LD?

OR13 commented 1 year ago

why is it so important for a non-JSON-LD document to include an @context statement if it does not actually use JSON-LD?

What do you think a JSON-LD document looks like?

To me, it looks like a JSON Document with an @context....

As I said at TPAC, and on pretty much every call I have ever been on, the Verifiable Credentials Data Model is over JSON-LD... not JSON.... The folks who want it to be over vanilla JSON have plenty of options that work today... why are those options not working for the use case of "signing over JSON" ?

dhh1128 commented 1 year ago

What do you think a JSON-LD document looks like? To me, it looks like a JSON Document with an @context...

That might be what it looks like, but that's not actually what it is. It must be JSON that matches certain documented but non-obvious constraints that the JSON developer has to discover (often by trial and error), accommodate, and test in her/his code. Some come from the VC data model and the terms that its base context defines/requires. Others come from JSON-LD practicalities (e.g., the fact that JSON-LD wants arrays to be treated as unordered, so to undo this, the vanilla JSON's @context must override that default; also, Manu's oft-repeated dictum that nobody should ever actually resolve a @context at runtime, for security reasons).
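One concrete instance of those practicalities, sketched with a hypothetical term and IRI: a context that wants JSON-style ordered arrays has to opt out of JSON-LD's default unordered-set semantics per term.

```typescript
const orderedTermContext = {
  "@context": {
    steps: {
      "@id": "https://example.org/vocab#steps", // hypothetical property IRI
      // JSON-LD treats arrays as unordered sets unless told otherwise;
      // "@container": "@list" preserves the array's order.
      "@container": "@list",
    },
  },
};
```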

None of these are particularly hard problems to resolve -- but they are burdens on the vanilla JSON developer that confer no benefits except for the JSON-LD community. They allow interoperability in one direction only -- JSON-LD stacks can consume the other form, IF the vanilla JSON developers obey their rules, but not the other way around. That's why this concern won't go away, I feel. If/when we contemplate an interoperability strategy that confers benefits and imposes costs in both directions, I think the dynamic will change.

I think you are exactly right, @OR13, that "the Verifiable Credentials Data Model is over JSON-LD" -- not because the wording of the spec strongly requires a JSON-LD worldview (we bent over backward to weaken that wording as far as we could while still maintaining minimum consensus: "Just add a @context statement to your JSON; why could you object to that?") -- but because, for practical purposes, anybody who does anything else will either be bludgeoned by the community for heresy, or become exhausted by many painstaking adjustments for tribal knowledge and compromise. The compromises Manu alludes to, being discussed on other threads, are exactly what I'm talking about. Now we can't use an inline @context, even though JSON-LD allows it? That makes it pretty hard to generate JIT credentials that exactly match a proof request... Etc.

talltree commented 1 year ago

So @OR13, please understand, I'm not pushing back against JSON-LD. I totally get all the reasons for using it. The reason I'm bringing up "the Great Divide" is that there are members of the WG who do not plan to use JSON-LD, and to them it looks like the message is, "You should go elsewhere." If that's the case, then all I'm suggesting is that the WG should recognize that fact and make an explicit decision that "W3C VC" = "JSON-LD only".

If that's not the case, and the WG wants to support both JSON-LD and plain-JSON-that-is-not-JSON-LD, then that too should be an explicit decision.

OR13 commented 1 year ago

That might be what it looks like, but that's not actually what it is.

Actually, this is exactly what a JSON-LD Document is... speaking from a position of a painful amount of experience on this subject...

I add an @context to a JSON Document... to make it a JSON-LD Document.... and leverage the properties that come with that... such as graph translations / querying....

I remove an @context from a JSON document when I want to destroy the ability to leverage the graph processing that it enables.

Proposals to destroy interoperability that is currently being leveraged, to achieve interoperability that is already available (JWS / JWE / JOSE)... do not make sense to me.

Now we can't use inline @context, even though JSON-LD allows it?

Yes, we made this illegal in v1... I'm not convinced it was the right move, it was sold as "for the good of vanilla JSON users"... but I don't think they care... because people who process JSON-LD as JSON simply ignore terms they don't care about.

OR13 commented 1 year ago

The reason I'm bringing up "the Great Divide" is that there are members of the WG that do not plan to use JSON-LD and to them it looks like the message is, "You should go elsewhere."

I'm not sure this framing is generous enough for my taste : )

Telling people to follow the standard as it is written today, where @context is required, is not the same as telling people to go elsewhere.

Pointing out that there are existing standards outside of the W3C that support what people are asking for is not the same as telling them to go away; it's the correct response to an attempt to redo work in a way that would be harmful and wasteful to W3C members and their time.

If that's not the case, and the WG wants to support both JSON-LD and plain-JSON-that-is-not-JSON-LD, then that too should be an explicit decision.

I agree with this, let's be direct.

I explicitly joined this WG to work on JSON-LD semantic verifiable credentials.

I explicitly did not join this WG to do the same work I am doing at IETF (define ways to secure JSON and CBOR).

I don't believe the W3C should redo work that is already happening at IETF.

I don't think it makes sense to frame the W3C as the place to work on "securing JSON or CBOR".

I do think the W3C is the right place to discuss securing "JSON-LD based Verifiable Credentials".

dhh1128 commented 1 year ago

That might be what it looks like, but that's not actually what it is.

Actually, this is exactly what a JSON-LD Document is... speaking from a position of a painful amount of experience on this subject...

If your statement is about a generic JSON-LD document, it is true. If it is about a JSON-LD document that conforms to the VC data model, it is not true.

stenreijers commented 1 year ago

You don't need any fancy zero knowledge cryptography, or blockchains or anything... and its worked like this for years.

Zero-knowledge based Verifiable Credentials (ZK based VCs) must be part of this discussion, in my opinion, as they may be the most dominant use case of VCs in the years to come, in line with the eIDAS 2.0 guidelines from the European Commission, which state that the privacy of the holder is a very important topic.

In the context of ZK based VCs, there is a big problem that requires solving and where a standard can help: there must be a way to specify the verifiable credential in a clear JSON data format, and there must be a clear specification of how to transform attributes/claims in the VC into a canonicalized form (as integer values) such that they can be used by the underlying cryptographic functions: sign and verify, compatible with the ZKP.

The space of new zero-knowledge cryptography suites and the corresponding anonymous credentials is growing rapidly, and they are searching for a clear VC data format and a clear way to specify how such proofs can be created in line with that VC data format. In fact, interoperability from a ZK based VCs point of view comes from a well-defined proof suite that includes a transformation algorithm to prepare the VC data format as valid integer numbers for the cryptography functions that ensure the authenticity and integrity of the VC. Interoperability from this point of view is not about JSON-LD contexts. It's all about specifying exactly how the attributes in a VC are transformed such that the included proof can be verified.
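As a rough sketch of the kind of transformation being described (not any standardized proof suite; a real suite would also reduce each value modulo its group order and fix an encoding per attribute type):

```typescript
import { createHash } from "node:crypto";

// Deterministically order the claims, then hash each attribute into a
// big integer that ZK-friendly signature schemes can operate on.
function claimsToIntegers(claims: Record<string, string>): bigint[] {
  return Object.keys(claims)
    .sort() // canonical attribute ordering
    .map((key) => {
      const digest = createHash("sha256")
        .update(`${key}=${claims[key]}`, "utf8")
        .digest("hex");
      return BigInt("0x" + digest); // integer representation of the claim
    });
}

// Example: two attributes become two field-element-sized integers.
claimsToIntegers({ birthDate: "1951-01-20", familyName: "Anyperson" });
```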

Yes, I know that JSON-LD can help to provide clear machine-readable semantics and clear canonicalization algorithms, as the BBS+ signature suite shows. But this is not always the case… One clear example is the large Idemix community of anonymous credentials. A JSON data format standard is more important here as the first step to interoperability, along with clear specifications of how the attributes of a VC in a particular data standard are to be transformed to be used in agreement with the attached proof. There are numerous existing credential systems from the last 10 years that have their own implementations and their own custom extensions on top of Idemix; they all need a standard so that interoperability can be achieved between them. Here the discussion is not about forcing JSON-LD contexts or not; it's all about data format standards and a way to create standards for new proof types and all kinds of zero-knowledge extensions. In fact, on a larger scale, there may be a paradigm change on the internet: from only a handful of simple cryptography suites to protect messaging over the internet to a whole spectrum of zero-knowledge based cryptography suites for the exchange of verifiable credentials over the internet.

When the JSON-LD context is not optional but forced, it means that a VC document needs to have a context that explains all semantics and obeys the JSON-LD spec, and the resulting VC MUST be processed with a JSON-LD library by every client that reads the VC. This is just a huge hurdle in terms of adoption and may even contribute to alternative standards for VCs that do not have this restriction. Comments in this thread and in other threads such as "please go to the JOSE community (JWS/JWT/JWP) and try your luck there" are also way too easy. Looking at the trend of an increasing number of ZKP suites, it seems reasonable that we just need a way to specify the data format of a VC (with or without JSON-LD) and a simple way to specify the corresponding ZK proof suite compatible with that VC format. Two things that the W3C already facilitates: the VC spec and the Data Integrity spec.

OR13 commented 1 year ago

When the JSON-LD context is not optional but forced, it means that a VC document needs to have a context that explains all semantics, obey the JSON-LD spec and the resulting VC MUST be processed by a JSON-LD library by every client that is reading the VC.

You can do zero-knowledge proofs over JSON with arbitrary members... JWP for example, operates on JSON... not JSON-LD.

IMO, IETF is the right place to apply ZKPs to JSON, and JWP specifically.

I'm actually a huge fan of securing JSON-LD using boring JSON related standards, such as JWP, SD-JWT, JWS, JWT and JWE :)

David-Chadwick commented 1 year ago

@talltree said

@OR13, you didn't answer my question: why is it so important for a non-JSON-LD document to include an @context statement if it does not actually use JSON-LD?

Because the global infrastructure is made up of millions of wallets, issuers, and verifiers, and an issuer cannot know whether any wallet or verifier that obtains the VC will be a JSON-only or a JSON-LD processor. Adding the two @contexts that are being proposed for JSON-only issuers allows all of the wallets and verifiers to upload and process the VC without the issuer being aware of their JSON-LD capabilities. It avoids the great divide. (Because a JSON-LD issuer will obviously add the full set of @contexts that are needed for full and complete JSON-LD processing, which presumably JSON-only implementations can already process.)

stenreijers commented 1 year ago

@OR13 All I am trying to say is that JSON-LD is not the problem here; the problem some/most of the community is focusing on is how to formalize zero-knowledge proofs and their corresponding anonymous credentials in easy-to-read specifications and corresponding proof formats. Although I believe in the same semantic web vision, just forcing JSON-LD into the specification at all costs may lead to issues in adoptability. Hence making it optional could be a strategic choice at this point in time.

msporny commented 1 year ago

Hence making it optional could be a strategic choice at this point in time.

That (supporting JSON-only without @context) was the strategic choice we made in the DID Core specification, and in hindsight, there are a non-trivial number of us who felt that move was a huge mistake that wasted a large chunk of the DID WG's time. It did not result in the outcomes that people were hoping for and made the specification many times more complex than it needed to be. We've run that experiment, and it was a failure. What's going to be different this time around? :)