w3c-ccg / traceability-vocab

A traceability vocabulary for describing relevant Verifiable Credentials and their contents.
https://w3id.org/traceability

string IDs vs links to external data; external JSONLD payload #290

Closed VladimirAlexiev closed 10 months ago

VladimirAlexiev commented 2 years ago

You use globalLocationNumber and gtin, good (and in #279 I asked to add the ability to handle SGTIN, LGTIN, etc.).

But EPCIS events ask these to be represented as URIs, see

https://github.com/gs1/EPCIS/blob/master/epcis-context.jsonld

    "epcList": {
      "@id": "epcis:epcList",
      "@type": "@id",
      "@container": "@set"
    },
    "bizLocation": {
      "@id": "epcis:bizLocation",
      "@type": "@id"
    },
    "readPoint": {
      "@id": "epcis:readPoint",
      "@type": "@id"
    },
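
To make the effect of "@type": "@id" concrete, here is a minimal Python sketch. It is not a JSON-LD processor: coerce_value is a hypothetical helper that only mimics the type-coercion step for the three terms quoted above, and the gtin term in the usage lines is an assumption standing in for any term without such a definition.

```python
# Simplified illustration of JSON-LD type coercion; NOT a full JSON-LD processor.
# Term definitions copied from the EPCIS context excerpt quoted above.
CONTEXT = {
    "epcList": {"@id": "epcis:epcList", "@type": "@id", "@container": "@set"},
    "bizLocation": {"@id": "epcis:bizLocation", "@type": "@id"},
    "readPoint": {"@id": "epcis:readPoint", "@type": "@id"},
}


def coerce_value(term, value):
    """Return a list of value objects for `term` (hypothetical helper).

    Strings under a term typed "@id" become node references ({"@id": ...});
    strings under any other term stay plain literals ({"@value": ...}).
    """
    definition = CONTEXT.get(term, {})
    values = value if isinstance(value, list) else [value]
    key = "@id" if definition.get("@type") == "@id" else "@value"
    return [{key: v} for v in values]


# The same kind of string is a link in one place and an opaque literal in another.
print(coerce_value("epcList", ["https://id.gs1.org/gtin/9506000134352"]))
print(coerce_value("gtin", "9506000134352"))
```

The first call yields a node reference that other triples can attach to; the second yields a literal that cannot be the subject of further statements.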

https://github.com/gs1/EPCIS/blob/master/Ontology/EPCIS.ttl

epcis:epcList  a          owl:ObjectProperty , rdf:Property ;
        rdfs:comment      "(Optional) An unordered list of one or more EPCs naming specific objects to which the event pertained."@en ;
        rdfs:domain       [a owl:Class ;
                           owl:unionOf (epcis:ObjectEvent epcis:TransactionEvent)] ;
        schema:domainIncludes epcis:ObjectEvent, epcis:TransactionEvent;
        rdfs:isDefinedBy  epcis: ;
        rdfs:label        "epcList" ;
        rdfs:range        gs1:IndividualObject ;
        schema:rangeIncludes gs1:IndividualObject ;
        sw:term_status    "stable" .

epcis:bizLocation  a      owl:ObjectProperty , rdf:Property ;
        rdfs:comment      "(Optional) The business location where the objects associated with the EPCs may be found, until contradicted by a subsequent event."@en ;
        rdfs:domain       epcis:EPCISEvent ;
        schema:domainIncludes epcis:EPCISEvent ;
        rdfs:isDefinedBy  epcis: ;
        rdfs:label        "bizLocation" ;
        rdfs:range        gs1:Place ;
        schema:rangeIncludes gs1:Place ;
        sw:term_status    "stable" .

epcis:readPoint  a        owl:ObjectProperty , rdf:Property ;
        rdfs:comment      "(Optional) The read point at which the event took place."@en ;
        rdfs:domain       epcis:EPCISEvent ;
        schema:domainIncludes epcis:EPCISEvent ;
        rdfs:isDefinedBy  epcis: ;
        rdfs:label        "readPoint" ;
        rdfs:range        gs1:Place ;
        schema:rangeIncludes gs1:Place ;
        sw:term_status    "stable" .

(The class gs1:IndividualObject is not yet defined in the gs1: vocabulary; it includes products with a serial number (SGTIN), containers/pallets (GRAI), assets (GIAI), etc. A class hierarchy will be added according to the gsheet mapping table mentioned in #284.)

If they are URLs rather than URNs, they must be Digital Link URLs (e.g. https://id.gs1.org/gtin/9506000134352), which can return info about that product, place, etc.
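
As a rough sketch of turning a string ID into such a link, the Python below validates a GTIN with the standard GS1 mod-10 check digit and appends it to a resolver base. The function names are hypothetical, the URL shape simply mirrors the example in this comment, and no padding or other normalisation required by the Digital Link standard is attempted.

```python
def gs1_check_digit(body: str) -> int:
    """GS1 mod-10 check digit for the digits that precede it.

    Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    """
    total = sum(
        int(digit) * (3 if index % 2 == 0 else 1)
        for index, digit in enumerate(reversed(body))
    )
    return (10 - total % 10) % 10


def gtin_to_digital_link(gtin: str, resolver: str = "https://id.gs1.org") -> str:
    """Build a link shaped like the example above (hypothetical helper)."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        raise ValueError(f"not a GTIN-8/12/13/14: {gtin!r}")
    if gs1_check_digit(gtin[:-1]) != int(gtin[-1]):
        raise ValueError(f"bad check digit in GTIN: {gtin!r}")
    return f"{resolver}/gtin/{gtin}"


print(gtin_to_digital_link("9506000134352"))
# prints https://id.gs1.org/gtin/9506000134352
```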

So:

VladimirAlexiev commented 2 years ago

@nissimsan

we typically use relatedLink for URLs.

  1. But that's overly generic, since it creates an intermediate node and uses a separate prop for the link type.
  2. It doesn't expect the target URL to be a resource in this graph (i.e. to have triples): schema:Role is used for that.

nissimsan commented 2 years ago

... Wasn't this another issue?

VladimirAlexiev commented 2 years ago

Here's how LinkML does it: https://linkml.io/linkml/schemas/inlining.html.

msporny commented 2 years ago

Here's how LinkML does it

I will note that referencing external entities without a cryptographic hash of some sort doesn't guarantee that what you're linking to won't change. I don't know if this matters for the Traceability folks, but it is certainly an important thing to consider.

There are specs intended to address that concern:

https://datatracker.ietf.org/doc/html/draft-sporny-hashlink-07
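
For readers unfamiliar with the idea, here is a minimal Python sketch of how a content hash can pin a link. It assumes the common case of a SHA-256 multihash encoded as base58btc multibase (the "z" prefix) and omits the optional metadata part of the draft; hashlink and still_matches are hypothetical names, not an API from any library.

```python
import hashlib

_B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(data: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = _B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def hashlink(content: bytes) -> str:
    """Return an "hl:" identifier for the content (sketch, no metadata part)."""
    digest = hashlib.sha256(content).digest()
    # Multihash = hash function code (0x12 for sha2-256) + digest length + digest.
    multihash = bytes([0x12, len(digest)]) + digest
    return "hl:z" + base58btc(multihash)


def still_matches(content: bytes, link: str) -> bool:
    """True if the content fetched today hashes to the pinned link."""
    return hashlink(content) == link


pinned = hashlink(b'{"name": "Ontotext"}')
print(still_matches(b'{"name": "Ontotext"}', pinned))   # unchanged payload
print(still_matches(b'{"name": "Sirma AI"}', pinned))   # changed payload
```

Any change to the linked bytes makes the check fail, which is exactly the guarantee described above and also the source of the versioning questions raised later in this thread.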

VladimirAlexiev commented 2 years ago

how can that external data be incorporated, controlled with a JSON schema, and verified with Verifiable Credentials? Is it even a good idea? certainly an important thing to consider.

Certainly seems important! I'm not fluent enough in Traceability use cases to figure out what is needed, and what should be signed.

@msporny Thanks, I knew about hashlink before: I guess it answers some of my question but I'd guess that's not the whole story.

And there's a more troubling consideration: if <did:web:ontotext.com> is (a semantic URL representing) a company, then what is <did:web:ontotext.com?hl=123456> ?

My oh my. It seems to me there are some conceptual questions to resolve between:

Maybe we should trust the publisher of the master data and accept their updates uncritically (as long as they are signed by the publisher). An illustrative example (totally wrong in terms of vc-model but hopefully understandable):

Couple years ago:

graph <2020-02-01> {
  <did:web:ontotext.com> a schema:Organization; schema:name "Ontotext".
  vc:issuer did:web:ontotext.com;
  schema:date "2020-02-01";
  vc:proof <proof1>
}

Now:

graph <2022-02-01> {
  <did:web:ontotext.com> a schema:Organization; schema:name "Sirma AI (doing business as Ontotext)".
  vc:issuer did:web:ontotext.com;
  schema:date "2022-02-01";
  vc:proof <proof2>
}

I can save both of these responses and if need be, demonstrate them in the future: "See judge, that was the name they told me back then".

This is instead of taking the data once, attaching a ?hl= hashcode, and failing when the data changes in the future.

If you consider the signed company data as a sort of Verifiable Credential, then my scenario puts the burden of keeping historic data copies on the Holder.

That is, rather than using a hashcode to "freeze the data in time" and expecting the Issuer to keep every historic version so it can serve it again.
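
To make the Holder-side alternative concrete, here is a small Python sketch of an archive that keeps every dated statement it has received and answers "what did they say as of date X". SnapshotArchive is a hypothetical class; proofs are stored as opaque values and signature verification is deliberately left out.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class SnapshotArchive:
    """Holder-side store of dated statements about one subject (hypothetical)."""

    _dates: list = field(default_factory=list)
    _documents: list = field(default_factory=list)

    def save(self, date: str, document: dict) -> None:
        """Keep a copy; ISO 8601 dates sort correctly as plain strings."""
        position = bisect_right(self._dates, date)
        self._dates.insert(position, date)
        self._documents.insert(position, document)

    def as_of(self, date: str):
        """Return the latest statement issued on or before `date`, if any."""
        position = bisect_right(self._dates, date)
        return self._documents[position - 1] if position else None


archive = SnapshotArchive()
archive.save("2020-02-01", {"name": "Ontotext", "proof": "proof1"})
archive.save("2022-02-01", {"name": "Sirma AI (doing business as Ontotext)", "proof": "proof2"})

print(archive.as_of("2021-06-01")["name"])  # prints Ontotext
```

With this approach nothing breaks when the Issuer publishes an update; the Holder just gains one more signed snapshot to show later.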

nissimsan commented 2 years ago

We will pick this up when we dive deeper into GS1 modelling.

VladimirAlexiev commented 2 years ago

@nissimsan @OR13 But please note this issue is not limited to GLN or other GS1 identifiers.

Every time you have a JSONLD sub-object or a textual ID, you could have a URL. That URL can carry semantic data in a variety of places:

  1. in the same JSON position, or
  2. in another position (top level or another sub-object), or
  3. in an external payload obtained by dereferencing the URL.

How can JSON Schema handle this variety? All your current schemas favor 1 (embedded subjects), but that facilitates data copying, not data sharing.

And copied data is outdated data.
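
The three placements listed above can be sketched as a single lookup in Python. This is only an illustration under assumptions: describe and _find_node are hypothetical helpers, nodes are identified by an "@id" key, the urn:ex: identifiers are made up, and fetch stands in for whatever dereferencing mechanism a consumer would actually use.

```python
def _find_node(tree, node_id):
    """Depth-first search for an object with this @id that has other keys too."""
    if isinstance(tree, dict):
        if tree.get("@id") == node_id and len(tree) > 1:
            return tree
        children = tree.values()
    elif isinstance(tree, list):
        children = tree
    else:
        return None
    for child in children:
        found = _find_node(child, node_id)
        if found is not None:
            return found
    return None


def describe(value, document, fetch=None):
    """Say where the data about `value` lives and return it (hypothetical helper)."""
    # 1. Embedded: the value is itself an object with properties.
    if isinstance(value, dict) and len(value) > 1:
        return "embedded", value
    node_id = value["@id"] if isinstance(value, dict) else value
    # 2. Same document: a described node with that @id sits elsewhere.
    found = _find_node(document, node_id)
    if found is not None:
        return "same-document", found
    # 3. External: ask the caller-supplied fetch function to dereference the URL.
    if fetch is not None:
        return "external", fetch(node_id)
    return "unresolved", None


doc = {
    "@id": "urn:ex:event1",
    "readPoint": {"@id": "urn:ex:place2", "name": "Dock 4"},
    "bizLocation": "urn:ex:place1",
    "related": [{"@id": "urn:ex:place1", "name": "Warehouse"}],
}
print(describe(doc["readPoint"], doc)[0])    # prints embedded
print(describe(doc["bizLocation"], doc)[0])  # prints same-document
```

A schema that only validates case 1 would reject the bizLocation value here even though the document describes that place in full.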

(Now I see #280 where this is posted explicitly)

nissimsan commented 2 years ago

Noting that hashlinks have made their way into some schemas. For example: https://github.com/w3c-ccg/traceability-vocab/blob/main/docs/openapi/components/schemas/common/CommissionEvent.yml#L43

Pinging @mkhraisha who's been driving these additions.

brownoxford commented 1 year ago

Assigning to @mkhraisha for review until next week.

nissimsan commented 1 year ago

@mkhraisha, status on this?

nissimsan commented 1 year ago

@mkhraisha, status on this?

OR13 commented 1 year ago

I suggest pending close.

nissimsan commented 1 year ago

Marking pending close because the path to closure is unclear. But @mkhraisha, please take action if you can.