Add optional proof section to TDs

mmccool commented 3 years ago

Add an optional "proof" section following https://w3c-ccg.github.io/ld-proofs/ to ensure the integrity of TDs, as discussed in the security session in the last F2F. This is the same mechanism used in DID documents.

Update: recently rebooted, see comments (and draft proposal...) below. New proposal will not be normatively based on LD-PROOFS but will be "similar".

mmccool commented 3 years ago

We will also be including the "proofChain" section, as specified in ld-proofs. This is important for directories and proxies: if they modify the TD for some reason they have to update the signature, and another entity (e.g. the user or a redistributor) may also want to sign it. So this PR will actually add two new sections, an optional "proof" section and an optional "proofChain" section. To check: it seems that only one of "proof" or "proofChain" should be used. The first (most recent) element in the proofChain is the "current" proof. Also note that "proof" can be either a single element or an array; with more than one element it is considered a "proof set", where there are multiple proofs from different entities and order is not relevant. The ld-proofs vocabulary should be included in the base vocabulary for TD 1.1. For TD 1.0, an extension context (https://www.w3.org/2018/credentials/v1, suggested prefix "cr") must be used.

mmccool commented 3 years ago

A discussion of JWS and canonicalization requirements was added to wot-profiles recently: https://github.com/w3c/wot-profile/issues/55 However, I still think the discussion belongs under the TD spec. Summary (see Arch minutes here): people want to proceed with JWS, but a TD canonicalization mechanism is needed. We did find a JSON canonicalization reference, and the idea is to add a few other requirements for TDs (default values, etc.), but we probably also need to deal with some JSON-LD issues (e.g. use of prefixes as opposed to URL expansions) which may affect systems that process RDF internally (e.g. SPARQL-based directories).
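
To illustrate why a canonical form is needed before signing, here is a minimal sketch. It assumes a simplified canonical form (sorted keys, no insignificant whitespace) built with Python's `json` module. This only approximates a real JSON canonicalization scheme and does not cover TD default values or the JSON-LD prefix issues; `canonicalize` is a hypothetical helper, not part of any spec.

```python
import hashlib
import json

def canonicalize(td: dict) -> bytes:
    """Simplified canonical form: sorted keys, compact separators, UTF-8.

    An approximation for illustration only, not a complete canonicalization scheme.
    """
    text = json.dumps(td, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

# Two serializations of the same TD that differ in key order and whitespace.
a = json.loads('{"title": "Lamp", "id": "urn:example:lamp"}')
b = json.loads('{ "id": "urn:example:lamp",   "title": "Lamp" }')

# After canonicalization both produce the same bytes, hence the same digest.
digest_a = hashlib.sha256(canonicalize(a)).hexdigest()
digest_b = hashlib.sha256(canonicalize(b)).hexdigest()
```

A signature computed over the raw strings would differ between the two serializations even though they describe the same TD; a signature over the canonical bytes would not.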

farshidtz commented 3 years ago

Would it be possible to sign parts of the object and list the signed fields to allow partial verification?

e.g. proof.fields = ["id", "properties", "links"]

This is going to be very useful, as other entities can annotate a TD while keeping the original TD verifiable. Examples:

A proxy may add additional security or forms
A directory may add geospatial or other registration attributes for bookkeeping
It can also be used in proof chains and proof sets to sign different parts of the object.
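
A minimal sketch of the field-list idea, under loud assumptions: the `fields` name follows the `proof.fields` example above, the helper names are hypothetical, and an HMAC from the standard library stands in for a real JWS signature so that the example stays self-contained.

```python
import hashlib
import hmac
import json

def sign_fields(td: dict, fields: list, key: bytes) -> dict:
    """Sign only the listed top-level fields (HMAC used as a stand-in for JWS)."""
    subset = {name: td[name] for name in fields if name in td}
    payload = json.dumps(subset, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"fields": fields, "signature": tag}

def verify_fields(td: dict, proof: dict, key: bytes) -> bool:
    """Recompute the tag over the fields named in the proof and compare."""
    expected = sign_fields(td, proof["fields"], key)["signature"]
    return hmac.compare_digest(expected, proof["signature"])

key = b"demo-key"  # hypothetical shared key, for illustration only
td = {"id": "urn:example:lamp", "properties": {"on": {"type": "boolean"}}, "links": []}
proof = sign_fields(td, ["id", "properties", "links"], key)

# A directory annotation outside the signed fields leaves the proof valid.
annotated = dict(td, registration={"created": "2021-05-03"})
# A change inside a signed field invalidates it.
tampered = dict(td, properties={})
```

With this shape, `verify_fields(annotated, proof, key)` still succeeds while `verify_fields(tampered, proof, key)` fails, which is the partial-verification behaviour described above.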

mmccool commented 3 years ago

Rebooting. Want to define a signing mechanism. Following are some notes on requirements arising from security TF call on May 3, 2020.

Directories returning signed TDs are basically using object security, and we should set up the TDs that way. Figuring this out will be useful for other Things that want to use object security.
Directories want to add metadata to TDs, including signed TDs. New metadata should (*) use unique prefixes, so we could limit signatures to sets of vocabularies to avoid the new data breaking an existing signature. Proofs should also be chainable, so a new signature could encapsulate the original. Following the LD-PROOFS pattern, the signatures could be in an ordered sequence.
Canonical form is now supported, and might make signatures (based on the serialization) more robust, but IMO can't be 100% trusted yet. So I'm planning to propose that directories support a mechanism to return the original string submitted to them upon request. This can be implemented using a parallel key-value store indexed by the registration ID.
Algorithm for the signatures to be based on JWS. This supports a set of algorithms; should we pick one? Should the signature block indicate which ones are used?
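
The "return the original string" idea could look roughly like this. It is a sketch only: the `Directory` class, its method names, and the use of a UUID as registration ID are assumptions for illustration, not an existing directory API.

```python
import json
import uuid

class Directory:
    """Toy directory that keeps the parsed TD and, in parallel, the exact submitted text."""

    def __init__(self):
        self.parsed = {}     # registration ID -> parsed TD (may be annotated or re-serialized)
        self.originals = {}  # registration ID -> original string, byte-for-byte

    def register(self, td_text: str) -> str:
        reg_id = str(uuid.uuid4())
        self.parsed[reg_id] = json.loads(td_text)
        self.originals[reg_id] = td_text
        return reg_id

    def original(self, reg_id: str) -> str:
        """Return the string exactly as submitted, so a signature over it can be checked."""
        return self.originals[reg_id]

directory = Directory()
submitted = '{ "title": "Lamp",\n  "id": "urn:example:lamp" }'
reg_id = directory.register(submitted)
```

Because `original(reg_id)` returns the submitted text unchanged, a signature computed over that text can be verified without relying on canonicalization at all.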

Questions/Problems:

One problem is anonymous TDs, where directories might add a blank node to the id. If this is using the base vocab it breaks this model...
Possibly use tree/dag rather than a chain. Example: proxy modifies URLs, directory adds metadata, have TDs with either or both.
Maybe look at XML signatures rather than JWS. Format of signature object may be more appropriate. Issues of signing different objects, stapling. In our context, how to identify "parts"? JSON pointers could be used but managing and updating them would be a pain. Prefixes (different extension vocabularies) are another option.
Useful to be able to refer to signatures from elsewhere, e.g. to put into a blockchain.
How does this relate to DIDs, VC, etc.?
Some IDs (e.g. DIDs) include signatures.
To validate signature also need to know public key it was signed with.

Proposal 1: Sequence (Chain)

    "signatures": [
        {  
           "signature": "......",
           "context": ["directory"]
        },
        {
           "signature": "......",
           "alg": ..., // optional, with default
           "context": ["", "iot"]  // "" means base (normative) TD vocab
        }
    ]

Comments:

Similar to LD-PROOFS
If we want to refer to signatures, can use JSON pointer, but then should put new ones on the bottom.
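
A rough sketch of how the "context" list in Proposal 1 could select what a signature covers. The assumptions here are mine: only top-level keys are considered, a key's vocabulary is read from its compact prefix, unprefixed keys count as the base vocabulary (""), and the "signatures" member itself is excluded. The helper names are hypothetical.

```python
def prefix_of(key: str) -> str:
    """Return the prefix of a compact key, or "" for unprefixed (base vocab) keys."""
    return key.split(":", 1)[0] if ":" in key else ""

def select_by_context(td: dict, contexts: list) -> dict:
    """Pick the top-level members whose prefix is listed in `contexts`."""
    return {
        key: value
        for key, value in td.items()
        if key != "signatures" and prefix_of(key) in contexts
    }

td = {
    "title": "Lamp",
    "iot:capability": "light",
    "directory:registered": "2021-05-03",
    "signatures": [],
}

covered_by_producer = select_by_context(td, ["", "iot"])
covered_by_directory = select_by_context(td, ["directory"])
```

Under these assumptions, metadata added by the directory under its own prefix never enters the bytes covered by the producer's signature.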

Proposal 2: DAG

    "signatureDefinitions": {
        "directory": {  
           "signature": "......",
           "context": ["directory"],
            "chain": ["final"]
        },
        "final": {
           "signature": "......",
            "chain": ["base","annotation"]
        },
        "annotation": {
           "signature": "......",
           "alg": ..., // optional, with default
           "context": ["iot"]
        },
        "base": {
           "signature": "......",
           "alg": ..., // optional, with default
           "context": [""]  // "" means base (normative) TD vocab
        }
    },
    "signature": "directory",

Comments:

Use complete context URL rather than prefixes. Prefixes are more convenient though, and canonical form does preserve prefixes, so...
Need to disallow cycles
How do we know which signature applies to the current TD? Could use definitions/use like above.
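
The "disallow cycles" requirement can be checked with a depth-first search over the "chain" references. A minimal sketch, assuming the `signatureDefinitions` shape from Proposal 2; `has_cycle` is a hypothetical helper, and references to undefined names are simply treated as leaves.

```python
def has_cycle(definitions: dict) -> bool:
    """Return True if the "chain" references among signature definitions form a cycle."""
    IN_PROGRESS, DONE = 1, 2
    state = {}

    def visit(name: str) -> bool:
        if state.get(name) == DONE:
            return False
        if state.get(name) == IN_PROGRESS:
            return True  # reached a node that is still on the current path
        state[name] = IN_PROGRESS
        for parent in definitions.get(name, {}).get("chain", []):
            if visit(parent):
                return True
        state[name] = DONE
        return False

    return any(visit(name) for name in definitions)

# Same shape as Proposal 2, with signatures omitted for brevity.
acyclic = {
    "directory": {"chain": ["final"]},
    "final": {"chain": ["base", "annotation"]},
    "annotation": {},
    "base": {},
}
# Making "base" point back at "directory" closes a loop.
cyclic = dict(acyclic, base={"chain": ["directory"]})
```

A validator could reject any TD for which `has_cycle` returns True before attempting to verify individual signatures.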

farshidtz commented 3 years ago

New metadata should (*) use unique prefixes, so we could limit signatures to sets of vocabularies to avoid the new data breaking an existing signature.

I'm not an expert, but I don't think this is a JSON-LD friendly approach. JSON-LD allows adding aliases, and enforcing the use of namespace prefixes goes against that. In other words, the following are equivalent:

{
  "@context": "http://schema.org/",
  "name": "Jane Doe"
}

{
  "@context": {
    "schema": "http://schema.org/",
    "name": "schema:name"
  },
  "name": "Jane Doe"
}

{
  "@context": {
    "schema": "http://schema.org/"
  },
  "schema:name": "Jane Doe"
}

{
  "@context": {
    "name": "http://schema.org/name"
  },
  "name": "Jane Doe"
}

and all normalized to:

_:c14n0 <http://schema.org/name> "Jane Doe" .
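
The equivalence can be demonstrated with a toy expansion routine. This is a heavily simplified sketch, not a JSON-LD processor: it only handles inline contexts with prefixes and simple term definitions, so the first form above (a remote context that must be dereferenced) is left out. `expand_key` and `expand_doc` are hypothetical helpers.

```python
def expand_key(key: str, context: dict) -> str:
    """Expand a compact key to a full IRI using an inline context (terms and prefixes only)."""
    value = context.get(key, key)  # term definition, e.g. "name" -> "schema:name"
    if ":" in value:
        prefix, suffix = value.split(":", 1)
        if prefix in context and not suffix.startswith("//"):
            return context[prefix] + suffix  # compact IRI such as "schema:name"
    return value

def expand_doc(doc: dict) -> dict:
    """Expand all top-level keys of a document, dropping the context itself."""
    context = doc.get("@context", {})
    return {expand_key(k, context): v for k, v in doc.items() if k != "@context"}

aliased = {
    "@context": {"schema": "http://schema.org/", "name": "schema:name"},
    "name": "Jane Doe",
}
prefixed = {
    "@context": {"schema": "http://schema.org/"},
    "schema:name": "Jane Doe",
}
direct = {
    "@context": {"name": "http://schema.org/name"},
    "name": "Jane Doe",
}

expanded = [expand_doc(d) for d in (aliased, prefixed, direct)]
```

All three documents expand to the same key, `http://schema.org/name`, even though only one of them uses a prefix in the key itself, which is why a prefix-based selection rule would treat them differently.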

As I suggested in my comment (https://github.com/w3c/wot-thing-description/issues/940#issuecomment-816570048) above, it is better to sign based on key names rather than prefixes. Prefixing is optional. That would also solve this issue:

One problem is anonymous TDs, where directories might add a blank node to the id. If this is using the base vocab it breaks this model...

The signer would simply exclude this from the list of signed fields and a second signer could include it.

mmccool commented 3 years ago

Regarding prefixes, note that canonicalization requires preservation of prefixes, and forbids adding aliases, etc. Yeah, it does not allow you to do everything in JSON-LD, and a round-tripper needs to keep track of prefixes, but that's the tradeoff to get stable signing.

BTW I am also thinking now "Proposal 1" with a simple chain makes the most sense. A DAG is fun but complicates things, and anyway you can just release a different TD with a different "history" so each file has a simple chain.

farshidtz commented 3 years ago

Regarding prefixes, note that canonicalization requires preservation of prefixes, and forbids adding aliases, etc. Yeah, it does not allow you to do everything in JSON-LD, and a round-tripper needs to keep track of prefixes, but that's the tradeoff to get stable signing.

I wasn't suggesting changing the prefixes of existing and canonicalized TDs. My point was that a producer can alias before canonicalization and signing. I don't think that is currently forbidden nor that it should be.

mmccool commented 3 years ago

If the producer aliases before signing they can just use the alias in the signature...

Citrullin commented 3 years ago

verificationMethod and type from LD-Proofs and DID are quite useful. Why don't we just go with the LD-Proofs spec and add context as a restriction to it?

If we go with the structure and properties from the LD-Proofs that would solve at least these points:

Useful to be able to refer to signatures from elsewhere, e.g. to put into a blockchain.

How does this relate to DIDs, VC, etc.

Some IDs (e.g. DIDs) include signatures.

To validate signature also need to know public key it was signed with.

Instead of having the context, we could also just refer to the properties of the TD itself. (I don't know a proper name for it; let's just call it restrict for now.)

    "signatures": [
        {  
           "signature": "......",
           "restrict": ["", "properties.temperature", "actions"]
        },
        {
           "signature": "......",
           "alg": ...,
           "restrict": ["forms"]
        }
    ]
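
A minimal sketch of how "restrict" entries could be resolved, assuming each entry is a dotted path into the TD. The meaning of the empty string entry is not defined in this thread, so empty paths are skipped here; key names that themselves contain dots are not handled. Helper names are hypothetical, and a plain SHA-256 digest stands in for the signed payload.

```python
import hashlib
import json

def resolve(td: dict, path: str):
    """Follow a dotted path such as "properties.temperature" into the TD."""
    node = td
    for part in path.split("."):
        node = node[part]
    return node

def restricted_digest(td: dict, restrict: list) -> str:
    """Hash only the parts of the TD named in `restrict` (empty paths are skipped)."""
    view = {path: resolve(td, path) for path in restrict if path}
    payload = json.dumps(view, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

td = {
    "properties": {"temperature": {"type": "number"}, "humidity": {"type": "number"}},
    "actions": {"reset": {}},
}
restrict = ["properties.temperature", "actions"]
before = restricted_digest(td, restrict)

td["properties"]["humidity"]["unit"] = "percent"     # outside the restricted parts
unchanged = restricted_digest(td, restrict) == before

td["properties"]["temperature"]["type"] = "integer"  # inside the restricted parts
changed = restricted_digest(td, restrict) != before
```

Edits outside the listed paths leave the digest untouched, while edits inside them change it, so each signer can cover exactly the parts it vouches for.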

farshidtz commented 3 years ago

"restrict": ["forms"]

@Citrullin that's similar to what I'm suggesting above. And as I said, it also solves the mentioned issue regarding blank node identifiers added to anonymous TDs (TDs without user-defined IDs).

I still think that signing based on the prefix is not a good idea, because the producer (of the initial TD, or a third party annotating it) must be able to add aliases. @mmccool said that this is not a problem. But I think it contradicts this statement:

New metadata should (*) use unique prefixes, so we could limit signatures to sets of vocabularies to avoid the new data breaking an existing signature.

See this example, where the directory adds new metadata without prefixes. Or this where the directory returns the unsigned blank node identifier.

Citrullin commented 3 years ago

"restrict": ["forms"]

@Citrullin that's similar to what I'm suggesting above. And as I said, it also solves the mentioned issue regarding blank node identifiers added to anonymous TDs (TDs without user-defined IDs).

You are right, sorry. Should have scrolled up before :)

I still think that signing based on the prefix is not a good idea, because the producer (of the initial TD, or a third party annotating it) must be able to add aliases. @mmccool said that this is not a problem. But I think it contradicts this statement:

I agree. I would also prefer this method, even though it may require more data to be transferred.

See this example, where the directory adds new metadata without prefixes. Or this where the directory returns the unsigned blank node identifier.

It also removes flexibility. You eventually have to prefix so much, even parts where prefixes are not necessary (or even possible?). That may even result in more data than just naming the fields you want to sign.

I find @mmccool's sequence structure good and simple to use, so I would prefer to use it as the base. I don't see an argument against using it.

mmccool commented 3 years ago

Regarding LD-Proofs, it was my intention to base our spec on that. Unfortunately it is only a CG Note and is not normative, and it is also vague in some places (e.g. the signature types and use cases are open-ended). So we can't just cite it and use it. We have to define something "similar" and fully spec it.
Point taken about prefixes. Let me think about it, but you make good points.
There is actually a charter for a "Linked Data Signature WG" that has apparently been under review since March: https://github.com/w3c/strategy/issues/262. The intention is more or less to formalize LD-PROOFS. So one issue is that when they (finally) come out with something, anything we come up with now will become obsolete. One option here is therefore to NOT do this. BTW the charter also talks about a canonicalization spec for JSON-LD (which was a technical sticking point for a long time), but even so TD will need additional canonicalization requirements (due to things like default values).

mmccool commented 2 years ago

This should be deferred to TD 2.0. We should wait until JSON-LD signing and canonicalization are finalized, and base our solution on that. We may still need to do a little more work, for example to canonicalize default values in TDs, but we should both be aligned with whatever JSON-LD does and let them do most of the heavy lifting.
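
As an illustration of the extra TD-specific step, here is a sketch that makes a few default values explicit before hashing or signing. The choice to fill defaults in (rather than strip them) is an assumption; either convention works if applied consistently. Only a handful of TD defaults are covered (`readOnly`, `writeOnly`, `observable`, `safe`, `idempotent`, and the form `contentType`), and `with_defaults` is a hypothetical helper.

```python
import copy

PROPERTY_DEFAULTS = {"readOnly": False, "writeOnly": False, "observable": False}
ACTION_DEFAULTS = {"safe": False, "idempotent": False}
FORM_DEFAULTS = {"contentType": "application/json"}

def with_defaults(td: dict) -> dict:
    """Return a copy of the TD with a small subset of default values made explicit."""
    out = copy.deepcopy(td)
    for kind, defaults in (("properties", PROPERTY_DEFAULTS), ("actions", ACTION_DEFAULTS)):
        for affordance in out.get(kind, {}).values():
            for name, value in defaults.items():
                affordance.setdefault(name, value)
            for form in affordance.get("forms", []):
                for name, value in FORM_DEFAULTS.items():
                    form.setdefault(name, value)
    return out

# The same Thing described with and without explicit defaults.
implicit = {"properties": {"on": {"type": "boolean", "forms": [{"href": "/on"}]}}}
explicit = {
    "properties": {
        "on": {
            "type": "boolean",
            "readOnly": False,
            "forms": [{"href": "/on", "contentType": "application/json"}],
        }
    }
}

same_after_normalization = with_defaults(implicit) == with_defaults(explicit)
```

Without such a step the two descriptions would serialize differently and yield different signatures despite being semantically identical.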

w3c / wot-thing-description

Add optional proof section to TDs #940