w3c / vc-data-model

W3C Verifiable Credentials v2.0 Specification
https://w3c.github.io/vc-data-model/

W3C VC-LD Specification and Layered VC Specification: = interoperability at the application layer of layered spec #982

Closed SmithSamuelM closed 1 year ago

SmithSamuelM commented 1 year ago

I have copied my comments from issue #947 because they got lost in the general discussion.

VC-LD and VCC

This is a concrete proposal that IMHO resolves the primary conflict in the community. It has two features:

Discussion

Those who think that VCs are best implemented with a layered model, where only the topmost layer may or may not be an open-world model as determined by the application use case, can then go build a layered model. In those cases where the topmost layer benefits from an open-world model based on RDF, those users of VCC can still have high interoperability with VC-LD. For those applications of VCC that do not benefit from an open-world model, the primary advantage of VC-LD is of little value, and they can instead avoid the complexity and drag of an open-world model.

Drag

Much of the drag (cognitive dissonance) that we as a community experience is, IMHO, a result of the improper layering of the security concerns associated with the V in VC, and not due to the non-interoperability of an open-world set of claims. The triples in the claims can be conveyed in every case by a container layer to which the claims are opaque. The claims layer may, of course, reference any meta-data in the container layer, but operationally this can be viewed as logging the container as an artifact: all the container layer needs to achieve its function is the closed-world set of container properties (not its payload). These container properties look like meta-data to the claims layer or may include identifiers that are subjects of claims. But because the claims layer is opaque to the container layer, there are zero dependencies by the container layer on any claim in the claims layer so contained. Triples in the claims layer may be dependent on properties in the container layer but not vice-versa. To reiterate, this allows the container layer to be a closed-world model of limited scope while the claims layer may still include any and all triples it so desires.

Protocol Design Flaw

The drag IMHO is due to mistaking the fact that the claims layer can reference properties in the container layer as a license to collapse or flatten the container layer leaving only the claims layer. This is indeed an unfortunate protocol design flaw that is the root cause of our difficulties.

This layering now allows us to limit the interoperability discussion to only those properties necessary to achieve the functionality of the container layer. Any discussion of an open-world model is obviously out of scope and would only add drag.

Such a layered model then allows the user to decide if the payload benefits from open-world interoperability or not. But the payload (including all its claims or triples) has already been authenticated. The container layer makes a cryptographic commitment to the payload. This usually means some serialization of the payload. Any serialization will do from the perspective of the authentication layer. The authenticity is limited to this serialization. Any later expansion that may dynamically introduce semantics may weaken what can be verified against the commitment to the original serialization. But what we have achieved is total interoperability of the authentication layer because it necessarily is a closed-world model. The payload is universally simply a serialized artifact such as a hash or Merkle root or accumulator of the payload regardless of the serialization type (be it JSON, JSON-LD, CBOR, MGPK, CESR or whatever).
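
To make that commitment concrete, here is a minimal sketch (illustrative names only, assuming a SHA-256 digest over whatever serialization the application chose):

```python
import hashlib
import json

# The claims payload, in whatever serialization the application chose (JSON here,
# but CBOR, MGPK, CESR, etc. would work the same way from this layer's perspective).
payload_bytes = json.dumps({"claim": "example"}, sort_keys=True).encode()

# The container layer commits to the exact serialized bytes, not to their semantics.
container = {"payloadDigest": hashlib.sha256(payload_bytes).hexdigest()}

# A verifier later checks the bytes it received against the commitment.
assert hashlib.sha256(payload_bytes).hexdigest() == container["payloadDigest"]
```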

Layered Model

Authentication Layer and Authorization Sub-layer

I propose that for a "verifiable" credential, what is primarily being verified is authorship or authenticity, where authenticity is defined to be secure attribution to a DID as issuer. A secondary verification for some credentials is an authorization, where authorization is defined to be evidence of entitlement (the dictionary definition of credential). We then have a container that provides an authentication layer whose function is to provide verifiable proof of authenticity (aka secure attribution to a DID as issuer) of its opaque contents, and we have an optional authorization sub-layer that, when employed, provides proof of entitlement to some DID (or another verifiable cryptographic construct) as the target of the entitlement.

The opaque payload could be a true pure unadulterated JSON-LD document (or not, depending on the application use case). All claims in that payload would be authenticated before they are ever opened. Any properties needed to provide proof of authorization (entitlement) would necessarily be provided in the authorization sub-layer, but forensic information about the entitlee could be part of the payload because forensic information is not required to verify the entitlement; it serves other conditional purposes such as recourse or enforcement.

This layered approach IMHO removes most of the friction this community is experiencing. The security of the "verifiability" employs a closed-world model limited to the authentication and authorization layers, and the open-world interoperability is unfettered by concerns about what happens in the security layer below. Proofs of authenticity look sane because they are clearly proofs of the container of opaque claims and not some ex post facto proof mechanism of an expanded payload.

Presentation Layer

This layered model can be further extended to include the proof of presentation which is different from authentication of the Issuer and authorization by the Issuer.

A presentation exchange protocol then can include the details of graduated or selective disclosure of the payload claims when appropriate. In many cases the payload of forensic information need only be conditionally disclosed especially when the entitlement properties are cleanly separated. Then proof of the presenter as target of the entitlement is the primary function of the presentation proof.

But once again, any property of the presentation layer can be referenced in an open-world claims layer as an artifact or log of the proof, but those references in the claims layer are not operationally involved as a dependency of the presentation layer. Once again, we avoid the mistake of wrongly collapsing the presentation layer into the claims (payload) layer merely because a property of the presentation layer may be referenced in the claims layer. We have the discipline to recall that a reference down into a lower layer does not give us a license to collapse that lower layer (the presentation layer in this case).

decentralgabe commented 1 year ago

before getting too deep into this proposal I think it's important we get to the heart of what "interoperability" means. this is certainly a viable path, but one that has tradeoffs...more specifically this means more code implementers need to write to make Verifiable Credentials work.

I think the best solution allows a minimal shared set of code to "work" for most, if not all, Verifiable Credentials.

By "work" I think a minimal set of functionality which I think of in term of validations:

The closer these layers can be to having a common base, the broader our out of the box interoperability can be, which I think is worth achieving.

So in my view the VC itself has three layers - signature, spec-defined data, spec-enabled data. I see LD, JSON Schema, and others in the last layer.

SmithSamuelM commented 1 year ago

@decentralgabe Not all layering is equivalent. The layering I am proposing is a function layering in the spirit of the ISO OSI protocol layering framework. Authentication, Authorization, and Presentation are formally defined functions in such a framework. These may or may not take advantage of other constructs such as schema and function-specific terminology. Because they are security-focused functions, an open-world model is antithetical to their function. Necessarily we want a very tightly defined set of roles and functions. In the ISO OSI framework, layers are bounded by wrappers or envelopes. The contents of a higher layer are opaque to the function of a lower layer. This is a well-established framework that has informed layered protocol design for 30 years.

mwherman2000 commented 1 year ago

before getting too deep into this proposal I think it's important we get to the heart of what "interoperability" means.

In reference to "interoperability" or "VC interoperability" or "interoperability with existing implementations of JSON-LD", which case are we talking about?

Are the JSON-LD enthusiasts primarily focused on VC interoperability? ...or interoperability/compatibility with some existing JSON-LD software implementations? ...for example, if there are JSON-LD implementations that can't or won't automatically assume the presence of a default @context when the @context attribute is not specified in a VC.

I think it might be more a case of the latter (which in turn is responsible for creating the former issue in terms of "VC interoperability" but only as a side effect).

...clearly JSON-LD software implementations are "just software" and the problem could be fixed on the JSON-LD side, n'est-ce pas? Fix the JSON-LD implementations

...or create a W3C Note ;-) to describe what sort of decorations need to be hung on a VC to be compatible with that (JSON-LD) unique software platform.

TallTed commented 1 year ago

@mwherman2000 — Please revisit your https://github.com/w3c/vc-data-model/issues/982#issuecomment-1329440635 and wrap each and every instance of @context in backticks ("`") like so, `@context`, such that this GitHub user does not receive updates about this thread in which they are not participating of their own accord.

mwherman2000 commented 1 year ago

Those who think that VCs are best implemented with a layered model where only the topmost layer may or may not be an open-world model as determined by the application use case can then go build a layered model.

@SmithSamuelM I've developed a layered architecture reference model for DIDComm Agent-based solutions to make it easier to explain the options that are available to developers creating decentralized identifier-based agent software systems. Perhaps the DIDComm-ARM can be extended further in the directions you described above.

In the meantime, here's a preview: https://hyperonomy.com/2022/11/29/didcomm-agent-architecture-reference-model-didcomm-arm-a-preview/

A detailed whitepaper will (hopefully) be available later this week.

SmithSamuelM commented 1 year ago

@mwherman2000 Certainly the DIDComm ARM is more aligned with what a layered protocol means. The difference is that the layering I am proposing is not a channel-based or session-based protocol but works in a zero-trust, authentic-data-at-rest model. DIDComm does a lot, too much IMHO. But the discussion should not be: should we have layers? but what layers should there be? and what are the properties of those layers? So I welcome a shift to the latter as per your post.

mwherman2000 commented 1 year ago

So I welcome a shift to the latter as per your post.

@SmithSamuelM I agree that we (you and I) are talking about different dimensions/sets of concepts in terms of an overall (broad and deep) model for decentralized identifier-based software systems (I would welcome a better label).

I believe you're asking, or are focused on: what are the dimensions/sets of concepts that can be used to layer a VC ARM? ...almost like a third axis (or set of axes) that rises out of Layer 3 in the DIDComm-ARM. Correct?

If so, for VCs, is it a 2 x 2 matrix of dimensions? ...2 x 3? ...something else?

mwherman2000 commented 1 year ago

But the discussion should not be: should we have layers? but what layers should there be? and what are the properties of those layers? So I welcome a shift to the latter as per your post.

@SmithSamuelM Here's a proposal for how to structure a layered verifiable credential model ...technically, it's more of a logical decomposition of a VC into its primary components. The components are:

  1. Verifiable credential envelope
  2. Inner credential
  3. Proof

If composited together, these 3 components make up one (1) (complete) verifiable credential.

Each component can then have its own (largely independent) taxonomy of mandatory as well as optional features associated with it. We won't know how independent these component feature sets are until they're enumerated on a component-by-component basis. Check it out... (click the diagram to enlarge it)

[diagram: decomposition of a verifiable credential into the Verifiable credential envelope, Inner credential, and Proof components]

The above VC reference model is highly consistent with Figure 5 in https://www.w3.org/TR/vc-data-model/#credentials. Unfortunately in the current VC DM, the JSON serialization doesn't delineate the "metadata" component as its own JSON object (in the same way "credentialSubject" and "proof" are their own objects delineated by their own set of outer braces). That is, the JSON serialization is not consistent with Figure 5.
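
As a rough sketch of what I mean (purely hypothetical field names, not the current VCDM serialization), a serialization consistent with Figure 5 might group the metadata as its own object, parallel to credentialSubject and proof:

```python
# Hypothetical regrouping (illustrative only): metadata delineated as its own object.
hypothetical_vc = {
    "credentialMetadata": {
        "issuer": "did:example:issuer",
        "validFrom": "2023-01-01T00:00:00Z",
        "type": ["VerifiableCredential", "ExampleCredential"],
    },
    "credentialSubject": {
        "id": "did:example:subject",
        "alumniOf": "Example University",
    },
    "proof": {
        "type": "DataIntegrityProof",
        "proofValue": "z58D...",
    },
}
```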

UPDATE: To be more obvious in the application of my terminology:

ChristopherA commented 1 year ago

I agree with Sam that what is emerging from these conversations is that the problem with VCs (in particular the never-ending fight between JWT vs JSON-LD) is that both choices force other devs into structures & architecture they don't like.

A better architecture with layers may solve this. I know in particular my community’s requirements for privacy/progressive trust can’t be addressed by the current architecture, thus we offered an alternative with Gordian Envelope.

mwherman2000 commented 1 year ago

Gordian Envelope

@ChristopherA can you add a link to your above post that describes (with examples) what a Gordian Envelope is. Thank you.

David-Chadwick commented 1 year ago

@ChristopherA

" my community’s requirements for privacy/progressive trust can’t be addressed by the current architecture,"

Could you say what these requirements are please. Clearly VCs depend upon a protocol, and I believe that a protocol can be created to selectively disclose PII and increase trust using the current DM.

SmithSamuelM commented 1 year ago

@David-Chadwick One of the important learnings with ACDCs is that selective disclosure is but one form of graduated disclosure. I am interested in @ChristopherA's ideas on "progressive trust"; they seem to have some overlap with the concepts of graduated disclosure from ACDC, of which contractually protected disclosure is one mechanism to build trust prior to further disclosure, and for which selective disclosure might be a mechanism for that further disclosure.

A layered protocol data model means we can talk about what goes in the layers. The current VC flat model really only contemplates two types of disclosure, full and selective. But graduated disclosure forces a more nuanced view of a disclosure protocol and therefore of what goes in a disclosure protocol layer. This is what is historically called a presentation layer in the ISO OSI model. One of the nuances is that historically presentation was made by an entity named or targeted by the issuer, whereas a disclosure protocol is not so limited. A Discloser could be any entity, not merely the entity designated by the Issuer (e.g. an Issuee), if any. This nuance removes the holder binding ambiguity (see other issues on holder binding): a Discloser may or may not be an Issuee. The semantics of the act of disclosure may be different as a result, but how one arrives at agreement to the terms and conditions of disclosure is largely independent of those semantics, thus making graduated contractually protected disclosure its own sub-layer in a disclosure protocol layer.

SmithSamuelM commented 1 year ago

One of the severe limitations of the current VC model is that there is no mechanism for contractually protecting disclosure. Selective disclosure by itself implicitly permissions any use of the data so selectively disclosed. It's a free pass for any aggregator to de-anonymize the selectively disclosed data with any contextually linked information available to the verifier at the time of presentation. A graduated, contractually protected disclosure, by contrast, can impose a liability on the verifier: prior to selective disclosure the verifier agrees not to de-anonymize the selectively disclosed data nor to share it. I call this an anti-assimilation constraint. For example, take the classic proof-of-age selective disclosure example. I can use a cryptographic selective disclosure mechanism that makes it difficult for the verifier to link the age of the presenter to the presenter's PII, i.e. no provable super correlator is provided by the presenter. But if the presenter chooses to pay their bar tab with a credit card, now the verifier has a super correlator: the credit card number, as well as video footage from security cameras that they can use to de-anonymize the person and defeat, via contextual linkage, the unlinkability of the selective disclosure. Obviously one could argue that one could pay with cash, but that limits the usefulness of the selective disclosure to a narrow subset of the user population. A graduated disclosure with an anti-assimilation clause could instead be attached to both the age transaction and the CC payment transaction, limiting the contextual linkage of the two transactions to any purpose besides the transaction itself. The combination of selective and contractually protected disclosure is more comprehensive: it attaches strings to any disclosed data, forbidding any use not specifically permissioned.

David-Chadwick commented 1 year ago

Most of the above is beyond the VCDM because it depends upon the protocol used to transfer the credential. Furthermore, regardless of what protocol commitments are made an untrustworthy verifier can ignore them and still do aggregation or further disclosure (as we know from digital rights management). This is where the trust infrastructure comes in. So given that we need a trust infrastructure regardless of the VCDM and protocols, then we are now debating what functions can be left to the trust infrastructure's policies and procedures and which can be reliably dealt with using technology alone and which need a combination of the two. I believe SD requires both.

David-Chadwick commented 1 year ago

@mwherman2000 Where we appear to disagree is over the Verifiable Credential Envelope. Either that or you are missing another component from your model, namely the Credential Metadata. Note. This is meta information about the credential claims and is independent of the proof applied to the claims. So I am talking about such metadata properties as:

mwherman2000 commented 1 year ago

@David-Chadwick First, I'm glad (almost ecstatic) that we have started to have a mutual understanding on some terminology. I had actually given up hope on these conversations. Thank you for persevering.

Where we appear to disagree is over the Verifiable Credential Envelope. Either that or you are missing another component from your model, namely the Credential Metadata.

I would also like to factor out the "credential metadata" from the Credential Envelope. In fact, in my own work on Structured Credentials, I do. I call the metadata stuff the Packing List (reference: https://www.youtube.com/watch?v=kM30pd3w8qE&list=PLU-rWqHm5p445PbGKoc9dnlsYcWZ8X9VX&index=1). So again, you and I are traveling the same path (in the same direction).

So if I agree with you, why is something equivalent to Credential Metadata missing from my decomposition? It's fairly straightforward: because in the current JSON serialization of a Verifiable Credential, the metadata properties are not their "own thing" ...they are not grouped into their own property ...an historical accident or whatever you want to call it.

In https://github.com/w3c/vc-data-model/issues/982#issuecomment-1335933361, I added the following update at the end of the post:

There is nothing in the current VCDM that enables me to make a similar statement about the credential metadata. There is no way to say, hypothetically,

This structuring is missing from the current VCDM. Are you and I still close? ...in terms of the direction of this discussion?

mwherman2000 commented 1 year ago

@David-Chadwick p.s. Without changing the current VCDM, we can address one thing on your wishlist, we can easily add an optional "metadata" property to the "inner credential" ...here's an updated "VC decomposition" diagram...

[diagram: updated VC decomposition with an optional "metadata" property added to the inner credential]

It's still a problem (without an easy solution) to group the metadata properties in the VC Envelope. Any thoughts/ideas?

David-Chadwick commented 1 year ago

I do not see metadata as being in the inner credential, since it is data about this data. All data in the inner credential is described by its metadata, which necessarily must be outside of it (as is your envelope). So where you have placed it cannot be correct.

So let me ask you about your envelope. What is its purpose?

And how does it differ from my concept of the metadata about the inner credential?

mwherman2000 commented 1 year ago

A big part of what I do is take new technologies and make them more understandable. The idea of a Verifiable Credential Envelope in which the Inner Credential is placed and the Envelope being subsequently closed and sealed (with a proof) is a set of abstractions that make VCs and the VCDM much easier to understand by software architects and developers, resulting in faster and higher adoption (hopefully).

@David-Chadwick have you had time to watch? https://www.youtube.com/watch?v=kM30pd3w8qE&list=PLU-rWqHm5p445PbGKoc9dnlsYcWZ8X9VX&index=1

Related, this weekend I finished a whitepaper where I had a similar set of goals for software architects and developers trying to understand and use DIDComm and DIDComm Agents. Here's a preview of the DIDComm Agent Architecture Reference Model (DIDComm-ARM): https://hyperonomy.com/2022/11/29/didcomm-agent-architecture-reference-model-didcomm-arm-a-preview/

David-Chadwick commented 1 year ago

Your packing label is the credential metadata - you say so in your video. But your packing label is not mentioned in your diagram. So you are not including the metadata as a separate component in your diagram, which is the issue I brought up.

p.s. I should not have to watch videos for you to be able to explain to me your concepts. And I do not think the video is that helpful. So sorry, can we keep to text please? Otherwise the conversation will not be productive.

David-Chadwick commented 1 year ago

This is my much simpler representation

 --------------------
|                    |
| Proof              |
|                    |
| ------------------ |
||                  ||
|| Credential       ||
|| Metadata         ||
||                  ||
|| --------------   ||
|||              |  ||
||| Credential   |  ||
|||              |  ||
|| --------------   ||
| ------------------ |
 --------------------
mwherman2000 commented 1 year ago

This is my much simpler representation

Is this intended to be a representation/model of the current VCDM @David-Chadwick? ...or are you proposing a new representation/model?

OR13 commented 1 year ago

This is my much simpler representation

 --------------------
|                    |
| Signature          |
|                    |
| ------------------ |
||                  ||
|| JWT              ||
|| Header           ||
||                  ||
|| --------------   ||
||| JWT          |  ||
||| Claimset     |  ||
|||              |  ||
|| --------------   ||
| ------------------ |
 --------------------
SmithSamuelM commented 1 year ago

I think David's model is more generic and supports formats other than JWT. But the idea of both is correct IMHO. The proof is in the outermost wrapper. Metadata about the container is in the next inner wrapper, and that metadata is primarily there to provide verifiability of the proof, i.e. authentication. For example, proof-of-issuance requires a mapping from proof to issuer. In the simplest form, the metadata is the issuer's identifier and the proof is a signature using the controlling key-pair for that identifier. The payload is the innermost wrapper. The payload could be a claim-set in different serializations. Each wrapper is its own independent layer with its own data model. The payload could be an open-world model without bleeding the complexity of such a model into the container or proof layers. Both models miss the nuance of an optional authorization sublayer as part of the metadata.

The main advantage is that the outer wrappers can be, and should be, closed-world, tightly specified data models that could more easily support implementations of multiple serializations and proof types in a secure way. The payload would be free to provide a pristine claim-set, also using different data models. This would support adaptation to different tool stacks and maximize reuse of existing tooling while allowing innovation and evolution. The claim-set can refer to the same identifiers as the meta-data layer, but these are used as facts in the payload layer and are not functionally used to provide authentication or authorization for the container layer. Proof of issuance is much easier to manage if its principals are not co-mingled in the payload layer.

David-Chadwick commented 1 year ago

@SmithSamuelM

"Metadata about the container in the next inner wrapper which metadata is primarily to provide verifiability of the proof i.e. authentication."

This is not the intention of my credential metadata. Rather it is to provide information about the credential, such as its validity period, its schema etc. i.e. its data about the claims and is independent of the proof. The proof must have its own metadata such as its validity period, algorithm id etc. I have included this metadata within the proof envelope, but we could split proof into two wrappers namely: proof metadata encapsulating the credential metadata, and signature which encapsulates everything. I think that in @OR13 's diagram the JWT header equates to my proof metadata, in which case we will have 4 boxes and not 3.

OR13 commented 1 year ago

Are we talking about removing or augmenting the VC and VP claims set members?

David-Chadwick commented 1 year ago

@OR13 My intention is that all the current properties are assigned to one of the containers, so that it becomes clear whether they are part of the VCDM or the separate (and independent) proof specifications.

David-Chadwick commented 1 year ago

@mwherman2000 See my answer above to @OR13 . It is so that we can better understand the current VCDM properties and assign them to the relevant container.

SmithSamuelM commented 1 year ago

@David-Chadwick There are various ways to split the wrappers, which is the meat of a layered discussion. What goes in what layers and why.

In the example I gave above, I put the identifier of the issuer in the metadata that the proof is proving. It can’t go in the proof itself because we want the proof to make a cryptographic commitment to the issuer. This is essential for security reasons. A known issuer field makes the commitment of the proof (by the issuer) to itself as issuer explicit (unambiguous) and protects against various forms of attacks. Otherwise a bare signature as proof for example is just an endorsement that may have different semantics and may or may not be subject to replay attack.

We have a very limited and rigid toolset from cryptography and best practices for using that toolset. What might make sense from an abstract data model perspective does not necessarily make sense from a cryptography perspective. The lack of crypto realism in the VC Data Model is one of the problems I am trying to fix with a layered model.
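
To illustrate the point about the issuer identifier being inside the data being proven, here is a minimal sketch (illustrative only; an HMAC stands in for a real signature):

```python
import hashlib
import hmac
import json

signing_key = b"stand-in-for-the-issuer's-signing-key"

# The issuer identifier sits inside the bytes being proven, so the proof
# cryptographically commits to "who" as well as "what".
committed = json.dumps(
    {"issuer": "did:example:issuer", "payloadDigest": "..."}, sort_keys=True
).encode()
proof = hmac.new(signing_key, committed, hashlib.sha256).hexdigest()

# A bare signature over the payload alone omits that commitment, so it reads
# as a generic endorsement and is easier to replay in another context.
bare = hmac.new(signing_key, b'{"payloadDigest": "..."}', hashlib.sha256).hexdigest()
```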

David-Chadwick commented 1 year ago

@SmithSamuelM I think we are starting to use the same terms to mean different things. What do you mean by proof? I explained what I mean: it is the cryptographic signature and the data needed to verify the cryptographic signature (the proof metadata). I believe it is the generalisation of @OR13 's signature and JWT header. I am not sure what you mean by proof. Also it would be nice if you could provide a similar diagram to Orie's and mine to show the layers that you are proposing (i.e. authn and authz and presentation).

SmithSamuelM commented 1 year ago

A signature is a proof. The data needed to verify the signature may or may not be part of the proof. The example I gave where the issuer is not part of the proof but part of the stuff being proven is an important distinction. The issuer is needed to verify the proof but it is part of the stuff being proven. This is where understanding the crypto is important.

SmithSamuelM commented 1 year ago

It depends on what is being proven where the items needed to verify the proof go.

SmithSamuelM commented 1 year ago

A proof of issuance is primarily proving a commitment by the issuer to the items being proven. For security reasons the items being proven include the identifier of the issuer. Given the identifier of the issuer, that may be all that is required to verify the proof, or there may need to be additional data such as a reference to a verifiable registry that holds the key state of the issuer's identifier. This additional data would be proof meta-data, in your sense.

My point is that it's not simply a matter of broad classification. If we want interoperable security we need to understand each proof type. By proof type I mean what is being proven. The proof method or algorithm is also important.

The same layering (proof, container, payload) works for authentication, authorization, and presentation proofs. The real value comes from a container model that enables all the different proof types with one container model. If not, then we can't leverage an issuance proof as an AND for a subsequent presentation proof. We end up having to create multiple containers.

One of the complications we have with VCs, is that unlike traditional protocol layering which is concerned with proving something about data in motion, VCs need to prove something about data at rest. This is where blind application of proof formats that were designed for data in motion may be misapplied.

SmithSamuelM commented 1 year ago

I will give two examples: The first example:

Authentication Layer as Identifier Security Overlay

The way the overlay works is that there is a mapping between an identifier and the authoritative signing key pairs (public, private) for that identifier. For short we call this a mapping between an identifier and its key state. A verifier can use the mapping to look up the public key from the key state and then use that public key to verify a signature made with the private key. Given this mapping we can construct a message container. The container has two parts. The first part is the identifier of the source of the message. This is metadata. The second part is an opaque payload that is the message data.

An authenticatable message consists of the message container and the attached signature. A verifier opens the container, pulls out the identifier, uses the overlay mapping to look up the public key, and then verifies the signature. The source identifier MUST be inside the container and designated as the source identifier so that the signature by that same source ID non-repudiably commits to the source as the source of the message. Otherwise, the attached signature could be for some other proof purpose besides authenticating the source of the message.

To make this mapping concrete, if the source (Issuer) identifier is a DID, then the method name and method id from the DID are sufficient to discover the mapping. A DID resolver to DID-doc lookup provides the mapping to the key state. So the only meta-data needed to enable verification of the signature is the DID itself; the DID method infrastructure that provides the mapping may be assumed. For other identifier types, that may not be the case.

Typically the authentication layer includes all the information that is necessary to its function. This means the layer includes the signature, the container, and the identifier inside the container. The payload is opaque to the authentication function and is part of the next layer up. Crypto requires that the signature be attached because a signature can only be applied to a serialization. In this case it is a serialization of the container that contains the issuer identifier and the payload. So layering that separates an attached proof as a data wrapper is not the same type of layering that separates by function.

So an authentication layer properly includes all the information essential to its function which includes the attached proof.
Other data can be attached. For example, the mapping may not be universally known, and so metadata to find the resource for looking up the mapping, in addition to the identifier of the source, may need to be attached to the container. But if the overlay is properly constructed, such look-up (discovery) information does not need to be inside the container and included in the items being proven (as per my DID method example above).
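
Here is a minimal sketch of that flow (assuming an Ed25519 key pair via the Python cryptography package; the mapping and identifiers are illustrative stand-ins for a real DID resolution step):

```python
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The identifier security overlay: a mapping from identifier to key state.
issuer_sk = Ed25519PrivateKey.generate()
key_state = {"did:example:issuer": issuer_sk.public_key()}

# The container: source identifier (metadata) plus an opaque payload.
container = {"issuer": "did:example:issuer", "payload": "<opaque serialized claims>"}
container_bytes = json.dumps(container, sort_keys=True).encode()

# The attached proof: a signature over the serialized container.
signature = issuer_sk.sign(container_bytes)

# Verifier: open the container, pull out the identifier, look up the key state,
# and verify the signature over the same serialization.
public_key = key_state[container["issuer"]]
try:
    public_key.verify(signature, container_bytes)
    print("payload securely attributed to", container["issuer"])
except InvalidSignature:
    print("authentication failed")
```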

SmithSamuelM commented 1 year ago

The second example:

Authorization SubLayer

The first definition in the English dictionary for the word "credential" is "evidence of entitlement". An entitlement is a type of authorization. So we could call it an "entitlement" layer or, more familiarly, an "authorization layer". An authorization layer is usually a sublayer to an authentication layer because the Authorizer and Issuer are the same party, and the authorization is useless unless it is first authenticated to the authorizer/issuer.

An authorization has two parties, the authorizer and the authorizee (authorized party). A message that is an authorization needs another, more essential metadata item to go inside the metadata in the container. This is the Authorizee Identifier. We could simplify the terminology and use Issuee for Authorizee since, in general, the only reason to have an Issuee is to convey some sort of entitlement from the Issuer/Authorizer. We also need metadata that specifies the type of entitlement. This is metadata that goes in the container, not the opaque payload. There are different ways of expressing the authorization type, but let's say it's a field whose value is the authorization type. Other metadata fields for the authorization sublayer might include the expiration date. These fields must be in the container metadata for the authorization sublayer to the authentication layer and not in the payload. (The payload could contain them redundantly, but it would be a layering violation for them to only be in the payload.)

Now our container has three metadata items in addition to the opaque payload.

1) Required Issuer
2) Optional Issuee (when the message is conveying an entitlement, the Issuee is required)
3) Optional Authorization Type
   3.1) Optional qualifier metadata on the authorization, such as expiration or revocation

So we have two layers inside our container.
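
As a rough sketch (field names are illustrative, not proposed spec terms), the container metadata at this point looks something like:

```python
container = {
    # authentication layer metadata
    "issuer": "did:example:issuer",              # 1) required
    # authorization sub-layer metadata
    "issuee": "did:example:issuee",              # 2) required when conveying an entitlement
    "authorizationType": "example-entitlement",  # 3) optional
    "expires": "2025-01-01T00:00:00Z",           # 3.1) optional qualifier
    # opaque payload, committed to compactly
    "payloadDigest": "sha256:...",
}
```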

Now a proof of authorization is not very useful without a presentation layer by which the authorizee (Issuee) can prove to a third party that the Issuee indeed has the associated entitlement and ask the third party to grant whatever comes with that entitlement.

But a proof of presentation is just the party making the presentation proving that they indeed are the Issuee. This may only require a signature with a nonce by the Issuee at the time of presentation. The nonce is to prevent replay attacks. Or it may require some additional metadata for the presentation that is outside the Issued container but is signed by the Issuee as presenter. There may also be other metadata attached to the presentation container needed to look up the key state for the Issuee by the verifier. But if the Issuee identifier security overlay is constructed appropriately (analogously to the Issuer Identifier security overlay) then no additional metadata may be required and the Issuee identifier by itself is sufficient to initiate discovery of the mapping to its key state.

So we have a presentation container that necessarily includes the authN/authZ container and any metadata needed to secure the presentation type, but does not need to include the issuance signature. This is because the verifier usually can verify the authN/authZ container separately without reducing the security of the presentation. It's less cumbersome than having the presenter also sign the signature of the Issuer rather than merely the container so issued, though in some use cases that might be wanted; it depends. Attached to the presentation container is the Issuee's signature and any additional metadata needed to look up the mapping between the Issuee identifier and its key state in order to verify the signature. But the Issuee identifier must be inside the authorization sublayer container to securely bind the authorization to the Issuee identifier.

Usually the presentation metadata that must be included inside the presentation container to secure the presentation type (which container is necessarily signed by the presenter) is some linkage between the presenter as a natural person and the Issuee identifier. This might be a biometric. When the presentation is live, there may not need to be any presentation metadata inside a presentation container. The presentation proof is completely attached, such as a live challenge response that the presenter signs in addition to signing the Issuance.
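
A minimal sketch of the live case (assuming an Ed25519 key pair controlled by the Issuee and a verifier-supplied nonce; names are illustrative):

```python
import os
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

issuee_sk = Ed25519PrivateKey.generate()  # controlled only by the Issuee

# Verifier supplies a fresh nonce so the presentation cannot be replayed.
nonce = os.urandom(32)

# Presenter signs the challenge together with the issued container bytes.
issued_container_bytes = b'{"issuer": "did:example:issuer", "issuee": "did:example:issuee"}'
presentation_signature = issuee_sk.sign(nonce + issued_container_bytes)

# Verifier looks up the Issuee's key state by its identifier and checks the proof.
issuee_sk.public_key().verify(presentation_signature, nonce + issued_container_bytes)
```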

The goal is to design the simplest possible containers that satisfy the essential functions for the functional layers of AuthN, AuthZ and Presentation for each type of allowed function (authN, AuthZ, Presentation).

As soon as we start to just throw data into a container we violate this simplicity and complicate interoperable security.

SmithSamuelM commented 1 year ago

It's a truism of security systems design that simpler is stronger.

SmithSamuelM commented 1 year ago

A privacy-preserving presentation adds some complexity to the presentation layer and may add complexity to the associated authN/AuthZ layers.

SmithSamuelM commented 1 year ago

But nowhere in this layering design of the authN/authZ/presentation layers do we ever need or benefit from an open-world model. It would be worse than useless. It would be pathological. The only place where an open-world model may be beneficial, depending on the use case, is in the opaque payload. Indeed, because every bit of information essential to the functions of authentication, authorization, and presentation is provided either in their respective containers or in their respective attachments, the payload is truly opaque and can be represented compactly inside the container and securely committed to by a cryptographic hash of its contents. The data underlying the hash can be cached and transmitted out-of-band, which has performance, privacy, confidentiality, and security advantages.
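
The out-of-band check is nothing more than recomputing the digest (a sketch, assuming a SHA-256 commitment and an illustrative field name):

```python
import hashlib

def matches_commitment(payload_bytes: bytes, container: dict) -> bool:
    """Check payload bytes obtained out-of-band against the container's commitment."""
    return hashlib.sha256(payload_bytes).hexdigest() == container["payloadDigest"]
```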

SmithSamuelM commented 1 year ago

To summarize, where we differ is that I define a layer as a function that includes the data or information needed to perform that function not as a specific wrapper. This is because each layer as a function requires a proof and those proofs are attached to or are proving other data needed by that layer. So the wrappers you use above don't represent the functional layers. IMHO functional layers better guide our design choices and allow for the intricacies of the associated crypto. This is not possible with merely data wrappers as layers. That is too simplistic a view and misses the most important design trade-offs.

David-Chadwick commented 1 year ago

I don't think our models are actually that different. Your container contains two parts, the metadata needed to validate the signature and the opaque payload. This equates to the signature and proof metadata of my 4 layer model.

 ----------------------
|                      |
|  Signature           |
|                      |
| -------------------- |
||                    ||
|| Proof metadata     ||
||                    ||
|| ------------------ ||
|||                  |||
||| Credential       |||
||| Metadata         |||
|||                  |||
||| -------------    |||
||||             |   |||
|||| Credential  |   |||
||||             |   |||
||| -------------    |||
|||                  |||
|| ------------------ ||
||                    ||
| -------------------- |
|                      |
 ----------------------

The fact that I have Credential Metadata and Credential inside the proof metadata is irrelevant to you as they are both the opaque payload of the proof layer from your perspective. I am simply modelling the two different types of property in the VCDM. But from your perspective both of these are the opaque payload.

We can then take the entire structure above as an opaque blob for the payload of another wrapping of proof metadata and signature (which is what we do in JWT-VC when we put the VC-JWT into a VP).
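
A rough sketch of that nesting (assuming the PyJWT library and symmetric demo keys; the claim names are simplified, not the exact VC-JWT claim mappings):

```python
import jwt  # PyJWT, assumed available

issuer_key = "issuer-demo-key"   # stand-ins for real asymmetric keys
holder_key = "holder-demo-key"

# Inner wrapping: the issuer signs the credential claims (the VC-JWT).
vc_jwt = jwt.encode(
    {"vc": {"credentialSubject": {"id": "did:example:issuee"}}},
    issuer_key, algorithm="HS256",
)

# Outer wrapping: the holder treats the VC-JWT as an opaque blob inside the VP-JWT.
vp_jwt = jwt.encode(
    {"vp": {"verifiableCredential": [vc_jwt]}},
    holder_key, algorithm="HS256",
)

# Each wrapping is verified with its own key, without peering into the other's payload.
outer = jwt.decode(vp_jwt, holder_key, algorithms=["HS256"])
inner = jwt.decode(outer["vp"]["verifiableCredential"][0], issuer_key, algorithms=["HS256"])
```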

David-Chadwick commented 1 year ago

I do not agree with your characterisation of the authz layer because the Authorizer and Issuer are not always the same party. In RBAC and ABAC one party is the Issuer of the credential, and another party is the authorizer (i.e. the resource holder). When you travel abroad you use this model, as the US is the issuer of your passport and the UK border control is the authorizer of allowing you to enter our country.

David-Chadwick commented 1 year ago

I also think that once you introduce the authz sub layer into your container, this equates to the credential metadata of my model. If you are only interested in authentication, then the signature on its own (with the proof metadata) is sufficient for this. But if you now want to use this authenticated container for authz as well, then you need the credential metadata to tell the verifier something about authz being provided by the actual credential. For example, if the credential is your name, dob, and nationality (as in a passport), the credential metadata would contain the validity period of this data, along with a unique id number and type. The verifier, on seeing this type is "P" knows it is a passport type of authorisation.

SmithSamuelM commented 1 year ago

@David-Chadwick "and another party is the authorizer (i.e. the resource holder). " That is not how typically the term authorizer is used. The resource holder is the presenter is the authorizee. In a delegation then the delegatee can in turn become the delegator (authorizer) but that subsequent delegation requires another issuance. So we definitely have some terminology definitions to iron out. Now in federated identity systems or closed loop systems like access control systems (ABAC etc) or (object capabilities) (which none of which are not talking about here) you can have a more complicated authorization model where the identity provider issues a token that is a bearer authorization and the presenter/bearer of the token is sometimes viewed as the authorizer for a subsequent action (which is really a type of delegation). But the terminology becomes ambiguous when you mix in federated identitiy concepts or access control concepts. Lets not do that here.

Let me be clear: when I say that a VC is an entitlement, which is a type of authorization, that does not mean it is the same type of authorization that is implied in access control systems or object capability systems. Think of entitlement as a license, not an authorization token. If you are confusing the Issuee of a VC with the authorizer in an access control sense then we will not be able to have any kind of mutual understanding. They are not the same class of authorization, even remotely.

I can go to length on the architectural differences. The easiest way to understand those conceptual differences is to view access control as a closed loop authorization system where the issuer is also the verifier and the token (resource) imbues the holder with the authority to "authorize" an action that is verified by the issuer. Typically the token is a bearer token (i.e. it has no designated issuee). Whoever holds the token has the authority. This is why access control tokens are so short lived: they are fundamentally insecure, any bearer will do, so they have to expire rapidly to prevent replay attacks.

Whereas the Issuer/Holder/Verifier model of VCs treats authorizations as an open loop system where the verifier is never the issuer and the Issuee must be explicitly designated in the authorization otherwise the verifier can't verify that the Issuee has the authority given to it via an authorization locked to that Issuee. This allows the VC to be long lived and have dynamic revocation as well as privacy (as opposed to closed loop which requires verification to phone home to the Issuer to validate the authorization). A verifiable data registry is required in an open loop system with long lived but dynamically revocable credentials to maintain the privacy separation between Issuer and Verifier.

But I get why the VC data model gets confused because very few have thought about it as a new type of authorization and confuse it with other types of authorization such as access control.

The goal should be interoperable security first. This means having precisely defined security models and not confusing authorizations as long-lived entitlements in an open loop model with short-lived access control tokens in a closed loop model.

If it would help I would be fine with using the term “entitlement layer” instead of authorization layer.

Hope that helps.

mwherman2000 commented 1 year ago

The fact that I have Credential Metadata and Credential inside the proof metadata is irrelevant to you as they are both the opaque payload of the proof layer from your perspective.

@David-Chadwick Using the current VCDM as a guide, a) what do you consider to be the Credential? b) where is the dividing lines/boundaries of the Credential? c) do you, for example, consider the value of credentialSubject to be the (boundaries of the) Credential? ...or something more/less?

SmithSamuelM commented 1 year ago

We had a similar issue come up in the DID specification and the DID resolver specification. The DID resolver and DID specs are not independent of each other. There are clear dependencies between the two. This meant revising both to reflect those dependencies. Likewise a proper layer design can't treat the proof specification as completely and totally independent because a functionally secure proof of Issuance/Authentication is dependent on metadata in the containers that must be committed to by the proof. So we should revise the proof specification similarly to reflect those necessary dependencies. This is another layer violation (treating the proof as independent of the data model for the containers).

David-Chadwick commented 1 year ago

@mwherman2000 c) is the credential (i.e. the credential subject object and its properties, sometimes called claims). All the other properties are metadata about the credential subject's properties.

mwherman2000 commented 1 year ago

@David-Chadwick Thank you for the clarification. Precision is important.

All the other properties are metadata about the credential subject's properties.

Now regarding "the other properties are metadata about the credential subject's properties", I believe "the other properties" need to be divided or separated into 2 types (maybe more):

1) Part 2 of "the other properties" is metadata about the Credential (defined to be the value of credentialSubject - the definition of Credential we agreed to above)
2) and then there's Part 1 of "the other properties", which is metadata about the larger structure: the verifiable credential structure exclusive of the value of the credentialSubject.

There may even be Parts 3, 4, 5, ... for describing other aspects of either the Credential or the Verified Credential (exclusive of the Credential) ...I think this is maybe where @SmithSamuelM is coming from?

David-Chadwick commented 1 year ago

@mwherman2000 I am not sure what your Part1 and Part2 refer to, but anything to do with proofing the credential (i.e. to do with the verifiable credential) is not metadata about the credential. It is metadata about the proof. This is what my diagram is trying to show. So I agree that there are two sorts of metadata, but they need to be kept separate and not interwoven. When you remove the proof from the verifiable credential (to leave the credential and its metadata) you remove both the signature and the proof metadata. You can then add a completely different proof to the credential (and its metadata). In this way you can convert a JWT-VC into a LD-Proofed VC (and back again), whilst the underlying credential (and its metadata) would not change.

David-Chadwick commented 1 year ago

@SmithSamuelM

"Likewise a proper layer design can’t treat the proof specification as completely and totally independent because a functionally secure proof of Issuance/Authentication is dependent on metadata in the containers that must be committed to by the proof."

Perhaps this is where we disagree. At the technical level I believe that signing/authenticating information can treat that information as a blob and the signature function should be independent of the blob. i.e. the signing function should be able to sign any blob. However at the semantic/administrative level clearly the signer will need to know what they are signing in order to authenticate it. (E.g. if the blob says "I authorise you to empty my bank account" I would not be willing to sign it. However the technical infrastructure is perfectly able to sign it).

I could argue that trying to intertwine the semantic/administrative level with technical/syntactic level over complicates the system and that keeping them separate simplifies the overall system. And as you said "Its a truism of security systems design that simpler is stronger".

SmithSamuelM commented 1 year ago

@David-Chadwick As Einstein is reputed to have said, “Everything should be made as simple as possible, but not simpler.” For an authorization/entitlement/credential, the target of the entitlement, i.e. the entitlee, credentialee, or authorizee, MUST be part of the data being proven by the Issuer; otherwise there is no way for the target to prove at the time of presentation that they indeed are the intended (by the Issuer) recipient of the entitlement/authorization/credential. So while simple is stronger is a truism, in this case leaving out the identifier of the target is “too simple” because without it the proof of Issuance no longer secures the intended target. All that is needed is the cryptonym of the target, nothing else. This is “minimally sufficient means”, which is the design aesthetic that captures both simple is better and not too simple. So multiple proof types from the designated Issuer (recall that the Issuer ID is also in the data being proven) are still enabled. And a group identifier for the Issuer allows multiple group members to provide independent proofs. The key is to understand the function and not try to do some overly simplistic separation of semantic vs syntactic. The semantics of authorization are essential to an authorization sublayer, but only the semantics of authorization. The problem is not separation of semantics from syntax. It's what semantics and what syntax are appropriate for each layer, and not mixing in either syntax or semantics from other layers.

Note that the Issuee identifier is using PKI. The Issuee identifier is public; it is derived from the public keys of a public/private key pair and is therefore controlled by the private keys. Therefore only the Issuee (i.e. the entity that controls the private keys) can prove that they indeed are the Issuee. This is essential to the Issuer/Issuee/Verifier model of open loop authorization. We are NOT talking about bearer authorization tokens endemic to closed loop authorization systems, where the verifier MUST either be the Issuer, or affiliated (same trust domain) with the Issuer, or federated with the Issuer. A bearer token means that any verifier of the authz token can now impersonate the presenter of that token. These tokens are inherently replayable and therefore depend on other mechanisms to mitigate such attacks. An open loop model, on the other hand, depends on the inherent properties of PKI: only the entity who controls the private keys can make a verifiable presentation in which they successfully purport to be the target of the authorization (i.e. the entitlee, credentialee, authorizee).

Without this we do not have verifiable credentials.

Despite the best efforts of some community members, the essential self-certifying (decentralized) PKI underpinnings of a verifiable statement where there is an Issuer, an optional Issuee, and a Verifier, by whatever name we call them (Verifiable Repute, Verifiable Container, Verifiable Claim, Verifiable Credential), have not changed since the original Open Reputation white paper that I wrote in 2015 that defined the construct.

https://github.com/SmithSamuelM/Papers/blob/master/whitepapers/open-reputation-low-level-whitepaper.pdf