Enriched machine-readable CEFACT metadata for edi3 vocabulary #12

Open Fak3 opened 3 years ago

Fak3 commented 3 years ago

CEFACT RDM Business Information Elements (classes and properties) carry multiple metadata fields used for organizational purposes, for example a Dictionary Entry Name such as "Referenced_ Transport Means. Type. Code", or usage rules.

This ticket suggests a way to port CEFACT metadata to RDF / JSON-LD so that it can be included in the edi3 vocabulary in a machine-readable way. It extends the idea proposed previously in #8 by adding HTTP URLs to the CEFACT BIEs, so that a human can use a web browser to look up the additional metadata and description associated with a particular BIE. This metadata may help existing CEFACT RDM users embrace Linked Data concepts and aid interoperability of existing systems with the Linked Data world. It may be less useful for new implementors, who can safely ignore it.

In short, we could add RDF classes to represent CEFACT BIEs, associate metadata with them, and associate these BIEs with the RDF classes and properties of the edi3 vocabulary.

With the ideas of this ticket applied, a property definition in the edi3 vocabulary would look like this:

{
  "@id": "edi3:consignor",
  "@type": "rdf:Property",
  "rdfs:domain": "edi3:Consignment",
  "rdfs:range": "edi3:Party",
  "edi3:cefactElementMetadata": [
    {
      "@id": "cefact:UN01011054",
      "@type": "edi3:AssociationBIE", 
      "edi3:cefactDictEntryName": "Referenced_ Supply Chain_ Consignment. Consignor. Trade_ Party",
      "edi3:cefactBusinessProcess": "Buy-Ship-Pay"
    },
    {
      "@id": "cefact:UN01004212",
      "@type": "edi3:AssociationBIE", 
      "edi3:cefactDictEntryName": "Supply Chain_ Consignment. Consignor. Trade_ Party",
      "edi3:cefactBusinessProcess": "Buy-Ship-Pay"
    }
  ]
}

Classes for Business Information Elements

Add the following RDF classes to represent BIEs: edi3:AggregateBIE, edi3:BasicBIE, edi3:AssociationBIE, and a base class for them, edi3:CefactBIE. The metadata properties that may be used to describe instances of these classes are described below.
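
A minimal sketch of how these four classes might be declared, written as Python that builds the JSON-LD nodes. Expressing "base class" with rdfs:subClassOf is an assumption here; the ticket does not name the exact term, and the function name build_bie_classes is only illustrative.

```python
import json

BASE_CLASS = "edi3:CefactBIE"
SUBCLASSES = ["edi3:AggregateBIE", "edi3:BasicBIE", "edi3:AssociationBIE"]


def build_bie_classes():
    """Return JSON-LD nodes for the base BIE class and its three subclasses."""
    nodes = [{"@id": BASE_CLASS, "@type": "rdfs:Class"}]
    for class_id in SUBCLASSES:
        nodes.append({
            "@id": class_id,
            "@type": "rdfs:Class",
            # Assumption: the "base class" relation is rdfs:subClassOf.
            "rdfs:subClassOf": {"@id": BASE_CLASS},
        })
    return nodes


if __name__ == "__main__":
    print(json.dumps(build_bie_classes(), indent=2))
```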

Linking BIEs with edi3 concepts

As the example above demonstrates, I suggest adding an edi3:cefactElementMetadata property to associate BIEs with edi3 properties and classes. It has a range of edi3:CefactBIE:

{
  "@id": "edi3:cefactElementMetadata",
  "@type": "rdf:Property",
  "schema:domainIncludes": ["rdfs:Class", "rdf:Property"],
  "rdfs:range": "edi3:CefactBIE"
}

Machine-readable BIE graph

It was proposed in #8 to add a custom property edi3:cefactID. I suggest removing this property and minting an HTTP URL instead. This URL should become the primary identifier for BIE instances, so that the JSON-LD reserved "@id" property is used in the machine-readable representation, as the example at the beginning of this ticket demonstrates.

When this URL is dereferenced with a web browser, the HTML documentation page should be returned, similar to what we have now at https://edi3.org/vocab/. When it is dereferenced with the HTTP header Accept: application/ld+json, a machine-readable representation of the BIEs and their metadata should be returned, in flattened-graph JSON-LD form. This machine-readable data is not a vocabulary of RDFS concepts (in contrast to the edi3 vocabulary): the BIEs are instances of edi3:CefactBIE with metadata attached, and are not rdfs:Classes or properties themselves.
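
The content negotiation described here could be sketched as follows. This is a simplified illustration and not a full HTTP Accept parser: it only compares the q-values of text/html and application/ld+json (the registered JSON-LD media type), ignores wildcards such as */*, and defaults to HTML. The function names are hypothetical.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header value."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when it is not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    return entries


def choose_representation(accept_header):
    """Pick "jsonld" or "html" for a BIE URL; HTML is the fallback."""
    weights = {"application/ld+json": 0.0, "text/html": 0.0}
    for media_type, q in parse_accept(accept_header or ""):
        if media_type in weights:
            weights[media_type] = max(weights[media_type], q)
    if weights["application/ld+json"] > weights["text/html"]:
        return "jsonld"
    return "html"
```

A browser sending "text/html,application/xhtml+xml" would get the HTML page, while a client sending only "application/ld+json" would get the flattened graph.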

{
  "@context": {
      "linkedEdi3Class": {
        "@reverse": "edi3:cefactElementMetadata",
        "@type": "@id"
      },
      "linkedEdi3Property": {
        "@reverse": "edi3:cefactElementMetadata",
        "@type": "@id"
      }
  },
  "@graph": [  
    {
      "@id": "cefact:UN01011054",
      "@type": "edi3:AssociationBIE", 
      "edi3:cefactDictEntryName": "Referenced_ Supply Chain_ Consignment. Consignor. Trade_ Party",
      "edi3:cefactBusinessProcess": "Buy-Ship-Pay",
      "linkedEdi3Property": "edi3:consignor"
    },
    {
      "@id": "cefact:UN01004212",
      // ...
    }, 
    // ...
  ]
}

I previously proposed in #9 to create a CSV or JSON mapping from CEFACT BIEs to the edi3 vocabulary concepts, but here a more LD-native way of linking can be used: as the example above shows, "linkedEdi3Class" and "linkedEdi3Property" are reverse links of edi3:cefactElementMetadata (described earlier).

When CEFACT publishes a new version of the RDM xls file, it should be parsed to generate a new version of this machine-readable BIE graph, and the HTML version should be regenerated.

The HTTP namespace for the BIE primary identifiers should probably be separate from the namespace of the edi3 concepts vocabulary, to make the separation between the dynamic RDM concepts and the more persistent edi3 RDFS vocabulary explicit. In the examples of this ticket I have used the cefact namespace prefix. In a JSON-LD context it would be associated with a full HTTP URL: "cefact": "https://unece.org/cefact/informationElements#". Where exactly this URL should be based, and how the regeneration of the machine-readable and HTML representations of the BIE graph should be governed, is still to be discussed.
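
As an illustration of the reverse linking, the sketch below inverts edi3:cefactElementMetadata from vocabulary nodes into per-BIE nodes. It assumes the vocabulary is available as a list of Python dicts shaped like the consignor example, and that a node typed rdfs:Class gets "linkedEdi3Class" while anything else gets "linkedEdi3Property"; the function name build_bie_graph is hypothetical. Links are emitted as lists, which JSON-LD treats the same as a single value when there is one entry.

```python
def build_bie_graph(vocab_nodes):
    """Invert edi3:cefactElementMetadata into per-BIE reverse links."""
    bies = {}
    for node in vocab_nodes:
        # Assumption: only nodes typed rdfs:Class count as classes.
        is_class = node.get("@type") == "rdfs:Class"
        link_key = "linkedEdi3Class" if is_class else "linkedEdi3Property"
        for bie in node.get("edi3:cefactElementMetadata", []):
            # Copy the BIE metadata the first time this BIE is seen.
            entry = bies.setdefault(bie["@id"], dict(bie))
            links = entry.setdefault(link_key, [])
            if node["@id"] not in links:
                links.append(node["@id"])
    return list(bies.values())


if __name__ == "__main__":
    vocab = [{
        "@id": "edi3:consignor",
        "@type": "rdf:Property",
        "edi3:cefactElementMetadata": [
            {"@id": "cefact:UN01011054", "@type": "edi3:AssociationBIE"},
            {"@id": "cefact:UN01004212", "@type": "edi3:AssociationBIE"},
        ],
    }]
    for bie in build_bie_graph(vocab):
        print(bie["@id"], bie["linkedEdi3Property"])
```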
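
To make the prefix mapping concrete, here is a small sketch of how a compact identifier such as cefact:UN01011054 would expand under that context entry. It is a simplified stand-in for what a JSON-LD processor does, not a replacement for one; expand_curie is a hypothetical helper, and only the cefact prefix is mapped because the ticket gives no URL for the others.

```python
CONTEXT = {
    # Namespace URL as suggested in the ticket; its final location is undecided.
    "cefact": "https://unece.org/cefact/informationElements#",
}


def expand_curie(value, context=CONTEXT):
    """Expand "prefix:local" to a full URL if the prefix is in the context."""
    prefix, sep, local = value.partition(":")
    # Leave absolute URLs ("https://...") and unknown prefixes untouched.
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return value


if __name__ == "__main__":
    print(expand_curie("cefact:UN01011054"))
```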

BIE metadata properties

These are properties with rdfs:domain of edi3:CefactBIE, carrying actual metadata.

edi3:cefactDictEntryName

This is column D (4) in the xls file. It can serve as a human-readable identifier. In the vocabulary that we have now (#4) it is placed in "rdfs:label", which is a mistake; we will have to move it when we start the deduplication of BSP classes and properties described in #9.

{
  "@id": "edi3:cefactDictEntryName",
  "@type": "rdf:Property",
  "rdfs:domain": "edi3:CefactBIE",
  "rdfs:range": "xsd:string"
}

edi3:cefactBusinessProcess

This is column V (21) in the xls file. I am not sure how useful it is; we should consult with CEFACT experts. It has values like "In All Contexts", "Buy-Ship-Pay", "Supply Chain", "Trade".

edi3:cefactUsageRule

This is column R (18) in the xls file. It holds usage comments, e.g. "This business term is used in the standardised French postal address".
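
A rough sketch of how one spreadsheet row could be turned into a BIE node with these three properties. It addresses columns by the letters given above (D, R, V) and assumes each row has already been read into a list of strings, for example from a CSV export. The ticket does not say which columns hold the UN ID or the BIE kind, so both are passed in by the caller; column_index and row_to_bie are hypothetical names.

```python
def column_index(letter):
    """Convert a spreadsheet column letter ("A", "D", "AA") to a 0-based index."""
    index = 0
    for char in letter.upper():
        index = index * 26 + (ord(char) - ord("A") + 1)
    return index - 1


# Column letters as described in the ticket text.
COLUMN_TO_PROPERTY = {
    "D": "edi3:cefactDictEntryName",
    "R": "edi3:cefactUsageRule",
    "V": "edi3:cefactBusinessProcess",
}


def row_to_bie(bie_id, bie_type, row):
    """Build a BIE node from one row; empty cells are skipped."""
    node = {"@id": bie_id, "@type": bie_type}
    for letter, prop in COLUMN_TO_PROPERTY.items():
        idx = column_index(letter)
        value = row[idx].strip() if idx < len(row) else ""
        if value:
            node[prop] = value
    return node
```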

onthebreeze commented 3 years ago

I like it

Fak3 commented 3 years ago

As discussed in Slack, the proposal can be improved further:

So that the example vocabulary looks like this:

{
  "@id": "edi3:consignor",
  "@type": "rdf:Property",
  "rdfs:domain": "edi3:Consignment",
  "rdfs:range": "edi3:Party",
  "edi3:cefactElementMetadata": [
    {
      "@id": "cefact:Referenced_SupplyChain_Consignment.Consignor.Trade_Party",
      "@type": "edi3:AssociationBIE", 
      "edi3:cefactUNId": "cefact:UN01011054",
      "edi3:cefactBieDomainClass": "cefact:Referenced_SupplyChain_Consignment.Details",
      "edi3:cefactBusinessProcess": "Buy-Ship-Pay"
    },
    {
      "@id": "cefact:SupplyChain_Consignment.Consignor.Trade_Party",
      "@type": "edi3:AssociationBIE", 
      "edi3:cefactUNId": "cefact:UN01004212",
      "edi3:cefactBieDomainClass": "cefact:SupplyChain_Consignment.Details",
      "edi3:cefactBusinessProcess": "Buy-Ship-Pay"
    }
  ]
}
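
The identifiers in this revised example appear to be derived from the Dictionary Entry Name. The sketch below reproduces that pattern under two assumptions inferred only from the two examples: the "@id" is the DEN with spaces removed, and the domain class is the first dot-separated part of the DEN followed by ".Details". The helper names are hypothetical.

```python
def den_to_id(den, prefix="cefact"):
    """Derive a BIE @id from a Dictionary Entry Name by dropping spaces."""
    return f"{prefix}:{den.replace(' ', '')}"


def den_to_domain_class(den, prefix="cefact"):
    """Derive the domain class id from the object-class part of the DEN."""
    # Assumption: the object class is everything before the first ".".
    object_class = den.split(".")[0].replace(" ", "")
    return f"{prefix}:{object_class}.Details"


if __name__ == "__main__":
    den = "Referenced_ Supply Chain_ Consignment. Consignor. Trade_ Party"
    print(den_to_id(den))
    print(den_to_domain_class(den))
```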