Sveino / Inst4CIM-KG

Instance of CIM Knowledge Graph
Apache License 2.0

URL policy about MAS and BASE #98

Open VladimirAlexiev opened 1 month ago

VladimirAlexiev commented 1 month ago

I think figuring out instance URLs, Models as named graphs, and how instances could be served, is an important topic.

Problem

Instances use relative URLs but don't specify a BASE, which leads to "random" URLs depending on the tool used.

BASE=MAS

One way is to set BASE based on the MAS, e.g. (in Turtle):

@base <http://www.Statnett.no/IGM/Nordic44_CGM#>.
<#e2f56599-a78e-494f-8db3-c0b0bdab1d70> a cim:OperationalLimitSet .

Causes instance URL <http://www.Statnett.no/IGM/Nordic44_CGM#e2f56599-a78e-494f-8db3-c0b0bdab1d70>
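A minimal sketch of how such a relative reference resolves, using Python's `urllib.parse.urljoin` as a stand-in for the RFC 3986 resolution that an RDF parser performs (the variable names are illustrative only). It suggests one detail worth checking in any tool: the base's trailing `#` is dropped during resolution, so the `#` form results when the relative reference itself starts with `#` (which is what `rdf:ID` implies in RDF/XML), while a bare relative reference replaces the last path segment instead.

```python
from urllib.parse import urljoin

# Base taken from the MAS, as in the Turtle example above.
BASE = "http://www.Statnett.no/IGM/Nordic44_CGM#"
LOCAL_ID = "e2f56599-a78e-494f-8db3-c0b0bdab1d70"

# Fragment-only reference: keeps the base path and sets the fragment.
hash_form = urljoin(BASE, "#" + LOCAL_ID)

# Bare relative reference: the base fragment is ignored and the last
# path segment ("Nordic44_CGM") is replaced.
bare_form = urljoin(BASE, LOCAL_ID)

print(hash_form)
print(bare_form)
```

If the hash form is the intended instance URL, the relative reference should carry the `#` itself rather than rely on the base ending in `#`.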

But there are questions (leading to what you may call a URL Policy):

BASE=MAS=model

Maybe the model URN and the MAS should be the same? Don't they represent one and the same thing, namely that set of triples? After consideration, that is not the case:

Use URNs for Instances

Another way is to reformat instance URIs to be urn:uuid: just like the model URN.

Sveino commented 3 weeks ago

If we can replace rdf:ID="_f1769b90-9aeb-11e5-91da-b8763fd99c5f" with rdf:ID="urn:uuid:f1769b90-9aeb-11e5-91da-b8763fd99c5f", there is no issue for validation. It is a bit more of a challenge to use rdf:about.
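The rewrite from an underscore-prefixed identifier to a `urn:uuid:` URN can be sketched as follows. This is an illustration only: the function name `rdf_id_to_urn` is hypothetical, and it assumes the identifier is a valid UUID once the leading "_" is removed.

```python
import uuid


def rdf_id_to_urn(rdf_id: str) -> str:
    """Turn a CIM XML style identifier such as "_f1769b90-..." into a urn:uuid: URN."""
    # The leading underscore is only a technical prefix; drop it if present.
    bare = rdf_id[1:] if rdf_id.startswith("_") else rdf_id
    # uuid.UUID raises ValueError for anything that is not a valid UUID,
    # so malformed identifiers are rejected instead of silently rewritten.
    return uuid.UUID(bare).urn


print(rdf_id_to_urn("_f1769b90-9aeb-11e5-91da-b8763fd99c5f"))
```

Identifiers that are not UUIDs (for example EIC codes) would need a separate rule; this sketch does not cover them.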

The point of using the MAS URL is just that we do not need a lookup table for creating the URL. ENTSO-E does have some power, but defining the URL for the returned RDF data will be very difficult. However, we (ENTSO-E) have required the domain:

  • The www.model4powersystem.eu domain intends at referencing the power system models, or Individual Grid Model (IGM) and Common Grid Model (CGM) as stated in the network code.

We have also decided to use https rather than http. What is not clear is whether ENTSO-E will host this or whether it is hosted by each individual entity. Most likely there will be a transition in which some advanced TSOs host their own. We can then replace it with their URL or redirect.

griddigit-ci commented 3 weeks ago

I guess this is another good discussion point. We need to see where we need a base and where we do not; right now instance data does not have a base, while RDFS and SHACL do.

VladimirAlexiev commented 3 weeks ago

I thought we already agreed to use BASE=MAS instead of urn:uuid

VladimirAlexiev commented 3 weeks ago

Answering considerations that @Sveino raised in https://github.com/Sveino/Inst4CIM-KG/issues/94#issuecomment-2437524390:

rdf:ID vs rdf:about... to tell the receiver that you should already have the object

But there's no such semantics in RDF (in fact you can't create a node in RDF, you create only triples). And you can't control a standard tool to emit one or the other (most use only rdf:about).

or we go for the way we would like it to be for JSON-LD

We'll use the same instance URNs in XML and JSON, right? I recommend URLs that some day might resolve, whereas URNs will never resolve.

with a base that is http://model4powersystem.eu/Statnett

Ok, but you need to change the MAS in the specific files. And let's use https not http.

does the base need to be the same for all the related instance files, or could/should it be a named graph

The base determines instance node URLs. The same PSR should have the same URL, regardless of how many models or profiles it appears in. Named graphs group triples (turning them into quads) for model management purposes: add/delete a model, patch (through a DifferenceModel), validate.
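The distinction can be sketched with plain tuples. This is a toy illustration, not a triple store: the `Quad` type, the graph URNs, the class and the name value are all hypothetical placeholders. The same PSR URL appears in two named graphs, and deleting one model removes only that graph's quads.

```python
from typing import List, NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # named graph = the model the triple belongs to


# One PSR, one URL, regardless of which model mentions it.
PSR = "https://energy.referencedata.eu/Statnett/99ae9f41-0a91-4d21-a483-7398c160da96"

# Hypothetical model URNs used as named graph identifiers.
MODEL_A = "urn:uuid:00000000-0000-0000-0000-000000000001"
MODEL_B = "urn:uuid:00000000-0000-0000-0000-000000000002"

store: List[Quad] = [
    Quad(PSR, "rdf:type", "cim:ACLineSegment", MODEL_A),
    Quad(PSR, "cim:IdentifiedObject.name", "Example line", MODEL_A),
    Quad(PSR, "rdf:type", "cim:ACLineSegment", MODEL_B),
]


def delete_model(quads: List[Quad], graph: str) -> List[Quad]:
    """Model management by graph: drop every quad that belongs to one model."""
    return [q for q in quads if q.graph != graph]


remaining = delete_model(store, MODEL_A)
print(len(store), len(remaining))
print({q.subject for q in remaining})
```

The node URL is unaffected by the deletion; only the grouping changes, which is why the base and the named graph can be chosen independently.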

(MAS is replaced with) <dcat:isVersionOf rdf:resource="https://energy.referencedata.eu/Model/Statnett-EQOP"/>

I dislike two things here:

dcterms:references is referring to the instance dataset that is a named graph. This could be: <dcterms:references rdf:resource="http://model4powersystem.eu/Statnett-EQCO/urn:uuid:99ae9f41-0a91-4d21-a483-7398c160da96"/>

I dislike two things here:

Sveino commented 3 weeks ago

We are now starting to have very much the same dialog on multiple issues. I do not know how to explain the transition of the CIM syntax update. Let me try with this diagram:

image

We'll use the same instance URNs in XML and JSON, right? Not between CIM JSON-LD and CIM XML (2016), but between CIM JSON-LD and standard RDF/XML.

The current header information md:Model is not sufficient for the future. Rather than develop our own, we would like to reuse DCAT. We are therefore missing information in the current CIM XML based on IEC 61970-552:2016. This leads us back to the strategy decision in https://github.com/Sveino/Inst4CIM-KG/issues/116

The comment regarding urn:uuid was related to "_". In CIM JSON-LD we will most likely support both resolvable and non-resolvable identifiers. For non-resolvable we just use urn:uuid:; for resolvable we include the base URL.

Whether we use https://model4powersystem.eu/Statnett/ , https://energy.referencedata.eu/Statnett/ or https://statnett.model4powersystem.eu/ depends a bit on how we can deal with redirects and access rights. I do not see that we need to make that decision now, and this should not necessarily be a topic here. However, with regard to the JSON-LD instance data specification, we need to understand how the combination of resolvable and non-resolvable identifiers will work. Remember that we might not be able to have all TSOs ready at the same time. Is this a possible approach:

{
  "@context": {
    "@base": "https://energy.referencedata.eu/Model/Statnett-EQOP/",
    "dcterms": "http://purl.org/dc/terms/"
  },
  "@graph": [
    {
      "@id": "urn:uuid:99ae9f41-0a91-4d21-a483-7398c160da96",
      "dcterms:identifier": "99ae9f41-0a91-4d21-a483-7398c160da96",
      "dcterms:description": "Example description of a non-resolvable resource",
      "dcterms:resolvableURL": {
        "@id": "https://energy.referencedata.eu/Model/Statnett-EQOP/urn:uuid:99ae9f41-0a91-4d21-a483-7398c160da96"
      }
    }
  ]
}

For CIM XML conversion I am in principle OK if we use BASE=MAS or urn:uuid. But the current MAS is not particularly good....

I implemented cim-trig.pl, which produces http://www.Statnett.no/IGM/Nordic44_CGM#_e2f56599-a78e-494f-8db3-c0b0bdab1d70 (I left the underscore as is). We have said that "_" is a technical addition. It should be replaced by urn:uuid or urn:eic if we decide to use that in the URL.

(MAS is replaced with) This was a simplification that I regretted when I wrote it. Each instance file, including the DifferenceSet, is a version of the abstract reference defined in dcat:isVersionOf. This is explained here: https://www.w3.org/TR/vocab-dcat-3/#ex-version-chain-and-hierarchy

dcterms:references I do not understand the comments on this. dcterms:references shall refer to the model/dataset that this dataset/graph depends on for full validation. It should not be self-referring.

VladimirAlexiev commented 3 weeks ago

As discussed, I think it should be like this:

{
  "@context": {
    "@base": "https://energy.referencedata.eu/Statnett/",
    "dcterms": "http://purl.org/dc/terms/"
  },
  "@graph": [
    {
      "@id": "99ae9f41-0a91-4d21-a483-7398c160da96", // URL resolved against BASE
      "dcterms:identifier": "99ae9f41-0a91-4d21-a483-7398c160da96", // string not URL!
      "dcterms:description": "Example description of a resolvable resource"
    }
  ]
}
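How resolvable and non-resolvable identifiers could coexist in one document can be sketched as below. This is a simplified illustration and not a JSON-LD processor: it assumes the base is declared with the JSON-LD `@base` keyword in `@context`, uses `urllib.parse.urljoin` for the resolution step, and the helper name `absolute_ids` is hypothetical. A relative `@id` is resolved against the base, while a `urn:uuid:` identifier is already absolute and is left untouched.

```python
import json
from urllib.parse import urljoin

doc = json.loads("""
{
  "@context": {
    "@base": "https://energy.referencedata.eu/Statnett/",
    "dcterms": "http://purl.org/dc/terms/"
  },
  "@graph": [
    {
      "@id": "99ae9f41-0a91-4d21-a483-7398c160da96",
      "dcterms:identifier": "99ae9f41-0a91-4d21-a483-7398c160da96"
    },
    {
      "@id": "urn:uuid:f1769b90-9aeb-11e5-91da-b8763fd99c5f",
      "dcterms:identifier": "f1769b90-9aeb-11e5-91da-b8763fd99c5f"
    }
  ]
}
""")


def absolute_ids(document: dict) -> list:
    """Resolve every node @id in @graph against the declared @base, if any."""
    base = document["@context"].get("@base", "")
    # urljoin returns absolute references (such as urn:uuid:...) unchanged.
    return [urljoin(base, node["@id"]) for node in document["@graph"]]


for node_id in absolute_ids(doc):
    print(node_id)
```

Under this reading, a TSO that is not yet ready to serve its data can keep `urn:uuid:` identifiers in the same file format, and the `dcterms:identifier` string stays the same in both cases.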