3lbits / CIM4NoUtility

CIM for the Norwegian Power Utility
Creative Commons Attribution Share Alike 4.0 International

Converting CIMXML datatypes to CIMJSON-LD datatypes #278

Closed Sveino closed 1 year ago

Sveino commented 1 year ago

The general rule is that the instance file must conform to the profile. If the profile constrains the unit and multiplier, the instance file should not deviate from it. If the profile allows both kW and MW, the instance file must declare which of the two it uses (and only those two are allowed). The agreed serialization/syntax should not prevent this possibility.

We need to document how we shall convert the following CIMXML datatypes to CIMJSON-LD. This includes (but is not necessarily limited to) the following:

1. Primitives

Primitives are defined in the CIM profile as:

<rdf:Description rdf:about="#Equipment.normallyInService">
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#DatatypeProperty" />
    <rdfs:label xml:lang="en">normallyInService</rdfs:label>
    <rdfs:domain rdf:resource="#Equipment"/>
    <rdfs:range rdf:resource = "http://www.w3.org/2001/XMLSchema#boolean" />
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#FunctionalProperty" />
    <skos:definition  xml:lang="en">Specifies the availability of the equipment under normal operating conditions. True means the equipment is available for topology processing, which determines if the equipment is energized or not. False means that the equipment is treated by network applications as if it is not in the model.</skos:definition>
</rdf:Description>

The declaration of primitives is covered by the JSON language, as explained in Type-specific keywords.

2. CIMDatatype

CIMDatatypes are defined in the CIM profile as follows. There are multiple CIMDatatypes, but they all follow the same pattern as cim:Length.

<rdf:Description rdf:about = "#UnitSymbol.m">
    <skos:definition  xml:lang="en">Length in metres.</skos:definition>
    <eq:isenum>True</eq:isenum>
    <rdfs:label xml:lang = "en">m </rdfs:label>
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#NamedIndividual" />
    <rdfs:domain rdf:resource = "#UnitSymbol"/>
</rdf:Description>
<rdf:Description rdf:about="#UnitSymbol">
    <rdfs:label xml:lang="en">UnitSymbol</rdfs:label>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
    <eq:Package>Package_CoreEquipmentProfile</eq:Package>
    <rdfs:subClassOf rdf:resource="#Enumeration"/>
    <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="#UnitSymbol.m"/>
    </owl:oneOf >
</rdf:Description>
<rdf:Description rdf:about="#Length">
    <rdfs:label xml:lang="en">Length</rdfs:label>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
    <skos:definition  xml:lang="en">Unit of length. It shall be a positive value or zero.</skos:definition>
    <eq:isCIMDatatype>True</eq:isCIMDatatype>
    <eq:Package>Package_CoreEquipmentProfile</eq:Package>
</rdf:Description>
<rdf:Description rdf:about="#Length.value">
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#DatatypeProperty" />
    <rdfs:label xml:lang="en">value</rdfs:label>
    <rdfs:domain rdf:resource="#Length"/>
    <rdfs:range rdf:resource = "http://www.w3.org/2001/XMLSchema#float" />
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#FunctionalProperty" />
</rdf:Description>
<rdf:Description rdf:about="#Length.unit">
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#DatatypeProperty" />
    <rdfs:label xml:lang="en">unit</rdfs:label>
    <rdfs:domain rdf:resource="#Length"/>
    <rdfs:range rdf:resource = "#UnitSymbol" />
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#FunctionalProperty" />
    <rdf:value>m</rdf:value>
    <eq:isFixed>True </eq:isFixed>
</rdf:Description>
<rdf:Description rdf:about="#Length.multiplier">
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#DatatypeProperty" />
    <rdfs:label xml:lang="en">multiplier</rdfs:label>
    <rdfs:domain rdf:resource="#Length"/>
    <rdfs:range rdf:resource = "#UnitMultiplier" />
    <rdf:type rdf:resource = "http://www.w3.org/2002/07/owl#FunctionalProperty" />
    <rdf:value>k</rdf:value>
    <eq:isFixed>True </eq:isFixed>
</rdf:Description>

3. Compound

Compounds are defined in the CIM profile as:

    "cim:Asset.lifecycleDate": {
        "cim:LifecycleDate.manufacturedDate": "2021-05-30T23:00:00Z", 
        "cim:LifecycleDate.purchaseDate": "2022-03-30T23:00:00Z"
    },

4. Enumerator

An enumerator in a datatype is handled in the same way as any other enumerator.

Example

The solution is very much related to the profile definition described in the upcoming version of IEC 61970-501.

This is related to: https://github.com/CIMug-org/WG13InformationModel/issues/12

<?xml version='1.0' encoding='UTF-8'?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cim="http://iec.ch/TC57/CIM100#"
    xmlns:md="http://iec.ch/TC57/61970-552/ModelDescription/1#"
    xmlns:eu="http://iec.ch/TC57/CIM100-European#">

    <cim:ACLineSegment rdf:ID="_9d58e5bb-834c-4faa-928c-7da0bb1497d9">
        <cim:IdentifiedObject.mRID>9d58e5bb-834c-4faa-928c-7da0bb1497d9</cim:IdentifiedObject.mRID>
        <cim:IdentifiedObject.description>400V Telemarkstien 2 ACLineSegment 1</cim:IdentifiedObject.description>
        <cim:IdentifiedObject.name>04 TELEMA2 ACLS1</cim:IdentifiedObject.name>
        <cim:Equipment.aggregate>false</cim:Equipment.aggregate>
        <cim:Equipment.normallyInService>true</cim:Equipment.normallyInService>
        <cim:Conductor.length>50</cim:Conductor.length>
        <cim:ACLineSegment.bch>0</cim:ACLineSegment.bch>
        <cim:ACLineSegment.gch>0</cim:ACLineSegment.gch>
        <cim:ACLineSegment.r>0.015999999945951</cim:ACLineSegment.r>
        <cim:ACLineSegment.x>0.003769911221053</cim:ACLineSegment.x>
        <cim:PowerSystemResource.AssetDatasheet rdf:resource="#_d25f35ca-69fb-4c06-905b-67ec532e5f14" />
        <cim:Equipment.EquipmentContainer rdf:resource="#_64e21d95-8b55-b941-a923-b74087410a66" />
        <cim:ConductingEquipment.BaseVoltage rdf:resource="#_9598e4a0-67e5-4ad7-879c-c85a1f63159c" />
    </cim:ACLineSegment>
</rdf:RDF>
{
    "@context": {
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "cim": "http://iec.ch/TC57/CIM100#",
        "md": "http://iec.ch/TC57/61970-552/ModelDescription/1#",
        "eu": "http://iec.ch/TC57/CIM100-European#"
    },

    "@graph": [
        {
            "@id": "urn:uuid:9d58e5bb-834c-4faa-928c-7da0bb1497d9",
            "@type": "cim:ACLineSegment",
            "cim:IdentifiedObject.description": "400V Telemarkstien 2 ACLineSegment 1",
            "cim:IdentifiedObject.name": "04 TELEMA2 ACLS1",
            "cim:Equipment.aggregate": false,
            "cim:Equipment.normallyInService": true,
            "cim:Conductor.length": {
                "cim:Length.value": 50.0
            },
            "cim:ACLineSegment.bch": {
                "cim:Susceptance.value": 0.0
            },
            "cim:ACLineSegment.gch": {
                "cim:Conductance.value": 0.0
            },
            "cim:ACLineSegment.r": {
                "cim:Resistance.value": 0.015999999945951
            },
            "cim:ACLineSegment.x": {
                "cim:Reactance.value": 0.003769911221053
            },
            "cim:PowerSystemResource.AssetDatasheet": {
                "@id": "urn:uuid:d25f35ca-69fb-4c06-905b-67ec532e5f14"
            },
            "cim:Equipment.EquipmentContainer": {
                "@id": "urn:uuid:64e21d95-8b55-b941-a923-b74087410a66"
            },
            "cim:ConductingEquipment.BaseVoltage": {
                "@id": "urn:uuid:9598e4a0-67e5-4ad7-879c-c85a1f63159c"
            }
        }
    ]
}
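
The mapping between the two files above can be sketched in a few lines of Python. This is an illustration only, not the project's conversion script: the `CIM_DATATYPES` and `BOOLEANS` lookup tables and the `convert` and `to_urn` helpers are hypothetical names, and in practice the lookups would be derived from the profile. The sketch also assumes every element is in the `cim` namespace.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CIM = "http://iec.ch/TC57/CIM100#"

# Hypothetical lookups; in a real converter these would come from the profile.
CIM_DATATYPES = {
    "Conductor.length": "Length",
    "ACLineSegment.bch": "Susceptance",
    "ACLineSegment.gch": "Conductance",
    "ACLineSegment.r": "Resistance",
    "ACLineSegment.x": "Reactance",
}
BOOLEANS = {"Equipment.aggregate", "Equipment.normallyInService"}


def to_urn(ref):
    """Turn an rdf:ID ("_uuid") or rdf:resource ("#_uuid") into urn:uuid:..."""
    return "urn:uuid:" + ref.lstrip("#_")


def convert(xml_text):
    """Convert a CIMXML document (as a string) into a JSON-LD style dict."""
    root = ET.fromstring(xml_text)
    graph = []
    for element in root:
        class_name = element.tag.split("}")[1]
        node = {
            "@id": to_urn(element.get(f"{{{RDF}}}ID")),
            "@type": f"cim:{class_name}",
        }
        for prop in element:
            name = prop.tag.split("}")[1]
            key = f"cim:{name}"
            resource = prop.get(f"{{{RDF}}}resource")
            if resource is not None:
                # Association: reference the target by its identifier.
                node[key] = {"@id": to_urn(resource)}
            elif name in CIM_DATATYPES:
                # CIMDatatype: wrap the number in "<Datatype>.value".
                node[key] = {f"cim:{CIM_DATATYPES[name]}.value": float(prop.text)}
            elif name in BOOLEANS:
                node[key] = prop.text == "true"
            else:
                node[key] = prop.text
        graph.append(node)
    return {"@context": {"cim": CIM}, "@graph": graph}
```

Calling `convert` on the CIMXML text and passing the result to `json.dumps` gives a document of the same shape as the JSON-LD example, with only the `cim` prefix in the context.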
ThomasRanvikEriksen commented 1 year ago

Couldn't we have done something like this?

"cim:Conductor.length": {"@value": 50.0, "@type": "cim:Length", "unit": "m", "multiplier": "k"}

Sveino commented 1 year ago

Couldn't we have done something like this?

"cim:Conductor.length": {"@value": 50.0, "@type": "cim:Length", "unit": "m", "multiplier": "k"}

By adding "unit" and "multiplier" in the instance file, we indicate that they can differ from what the profile states. We should therefore only include them if the profile does not constrain the attribute. For the header, where we do not have any profile and are dealing with a complex datatype, we can add multiple items to the same property. "@type": "cim:Length" should be implied by "cim:Length.value", but I do not know whether "@value": 50.0 gives the necessary reference to "cim:Length.value". It might be possible to define this if we use JSON-LD to describe the profile.
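
A minimal Python sketch of that rule, assuming the set of multipliers allowed by the profile is known to the serializer. The function name `serialize_quantity` and its parameters are hypothetical and only illustrate the idea: the multiplier is written out only when the profile leaves a choice.

```python
def serialize_quantity(datatype, value, multiplier, allowed_multipliers):
    """Build the JSON-LD object for one CIMDatatype value.

    allowed_multipliers is the (assumed) set of multipliers the profile
    permits for this attribute, e.g. {"k"} or {"k", "M"}.
    """
    if multiplier not in allowed_multipliers:
        raise ValueError(f"multiplier {multiplier!r} is not allowed by the profile")
    node = {f"cim:{datatype}.value": value}
    if len(allowed_multipliers) > 1:
        # The profile leaves a choice, so the instance must state which one it uses.
        node[f"cim:{datatype}.multiplier"] = {"@id": f"cim:UnitMultiplier.{multiplier}"}
    return node
```

With a fixed multiplier, `serialize_quantity("Length", 50.0, "k", {"k"})` yields only the value. With `{"k", "M"}` allowed, the multiplier is included as a reference to the enumeration value.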

Sveino commented 1 year ago

We would like to associate the cim:Conductor.length value with the QUDT meter (http://qudt.org/vocab/unit/M), but in CIM we have unit and multiplier that together give it. This means we cannot use SKOS:exactMatch.

ThomasRanvikEriksen commented 1 year ago
{
    "@context": {
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "cim": "http://iec.ch/TC57/CIM100#",
        "md": "http://iec.ch/TC57/61970-552/ModelDescription/1#",
        "eu": "http://iec.ch/TC57/CIM100-European#",
        "dcterms": "http://purl.org/dc/terms/",
        "dcat": "http://www.w3.org/ns/dcat#",
        "prov": "http://www.w3.org/ns/prov#",
        "xsd": "http://www.w3.org/2001/XMLSchema#"
    },
    "@graph": [
        {
            "@id": "urn:uuid:9d58e5bb-834c-4faa-928c-7da0bb1497d9",
            "@type": "cim:ACLineSegment",
            "cim:IdentifiedObject.mRID": "9d58e5bb-834c-4faa-928c-7da0bb1497d9",
            "cim:IdentifiedObject.description": "400V Telemarkstien 2 ACLineSegment 1",
            "cim:IdentifiedObject.name": "04 TELEMA2 ACLS1",
            "cim:Equipment.aggregate": false,
            "cim:Equipment.normallyInService": true,
            "cim:Conductor.length": {
                "cim:Length.value": 50.0
            },
            "cim:ACLineSegment.bch": {
                "cim:Susceptance.value": 0.0
            },
            "cim:ACLineSegment.gch": {
                "cim:Conductance.value": 0.0
            },
            "cim:ACLineSegment.r": {
                "cim:Resistance.value": 0.015999999945951
            },
            "cim:ACLineSegment.x": {
                "cim:Reactance.value": 0.003769911221053
            },
            "cim:PowerSystemResource.AssetDatasheet": {
                "@id": "urn:uuid:d25f35ca-69fb-4c06-905b-67ec532e5f14"
            },
            "cim:Equipment.EquipmentContainer": {
                "@id": "urn:uuid:64e21d95-8b55-b941-a923-b74087410a66"
            },
            "cim:ConductingEquipment.BaseVoltage": {
                "@id": "urn:uuid:9598e4a0-67e5-4ad7-879c-c85a1f63159c"
            }
        }
    ]
}
VladimirAlexiev commented 1 year ago

@sveino Please always convert your examples to Turtle because it's easier to read, and can expose various errors.

There are some errors in your definitions:

In addition, the choice of xsd:float for numbers means that they can be freely converted to scientific notation and are subject to inexact arithmetic. Are you happy when your numbers get converted to e.g. this:

    cim:ACLineSegment.r        1.5999999945951E-2 ;
    cim:ACLineSegment.x        3.769911221053E-3 ;

If the profile constrains the unit and multiplier, the instance file should not deviate from it.

In that case do you want instance data to repeat these fixed unit and multiplier, or to omit them?

QUDT meter (http://qudt.org/vocab/unit/M), but in CIM we have unit and multiplier that together give it. This means we cannot use SKOS:exactMatch.

True. Unfortunately CIM has taken upon itself the definition of UoM (rather than reusing established standards), and its approach to UoM is not quite good.

@ThomasRanvikEriksen

"cim:Conductor.length": {"@value": 50.0, "@type": "cim:Length", "unit": "m", "multiplier": "k"}

"@value" is used only with datatype properties (i.e. datatyped values), you cannot use it on an object class.

With the current approach of repeating the unit symbol and multiplier for every instance, you'd have to do something like the following, which I think is not very easy to use. (Writing just 50.0 for the value means the same as the typed literal; either way the value will be inexact.)

"cim:Conductor.length": {
  "@type": "cim:Length",
  "cim:Length.value":      {"@type": "xsd:float", "@value": "50.0"},
  "cim:Length.unit":       {"@id": "cim:UnitSymbol.m"},
  "cim:Length.multiplier": {"@id": "cim:UnitMultiplier.k"}
}
ThomasRanvikEriksen commented 1 year ago

@VladimirAlexiev Thank you for your expertise. I actually prefer your last approach even if it is harder to use. But we need to discuss this in the group. I'm not a big fan of enforcing a profile to a UnitMultiplier

Sveino commented 1 year ago

@VladimirAlexiev

In addition, the choice of xsd:float for numbers means that they can be freely converted to scientific notation and are subject to inexact arithmetic. Are you happy when your numbers get converted to e.g. this:

    cim:ACLineSegment.r        1.5999999945951E-2 ;
    cim:ACLineSegment.x        3.769911221053E-3 ;

In CIMXML we allow both notations. We are planning to allow both in CIMJSON-LD as well.
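
As a small illustration in Python (a sketch, not part of any CIM tooling): the two lexical forms denote the same number, while storing the value as a 32-bit xsd:float changes it slightly. The helper name `as_xsd_float` is hypothetical.

```python
import struct


def as_xsd_float(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, as xsd:float stores it."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


plain = float("0.015999999945951")
scientific = float("1.5999999945951E-2")

# Both notations parse to the same number.
same_number = plain == scientific

# The literal carries more digits than 32 bits can hold, so the stored value drifts.
stored = as_xsd_float(plain)
drift = abs(stored - plain)
```

Here `same_number` is true and `drift` is small but non-zero, so the choice of notation is harmless while the choice of xsd:float versus a wider type is not.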

If the profile constrains the unit and multiplier, the instance file should not deviate from it.

In that case do you want instance data to repeat these fixed unit and multiplier, or to omit them?

I would like to omit them in the instance file. The knowledge graph should get the information through reasoning. If, however, the profile allows multiple options, e.g. MW and kW, we would need to include it.

QUDT meter (http://qudt.org/vocab/unit/M), but in CIM we have unit and multiplier that together give it. This means we cannot use SKOS:exactMatch.

True. Unfortunately CIM has taken upon itself the definition of UoM (rather than reusing established standards), and its approach to UoM is not quite good.

It could be argued that CIM did this first :-) However, if we did not have the current profile and wanted to use QUDT, what should the profile look like to support MW when QUDT only supports W and kW directly?

Based on Example 1 in the Semantic Sensor Network Ontology:

   sosa:hasResult [
      a qudt-1-1:QuantityValue ;
      qudt-1-1:unit qudt-unit-1-1:DegreeCelsius ;
      qudt-1-1:numericValue "22.4"^^xsd:double ]

   cim:Conductor.length [
      a qudt:QuantityValue ;
      qudt:unit qudt-unit-1-1:meter ;
      qudt:numericValue "50"^^xsd:float ]
Sveino commented 1 year ago

I think we should consider making the following changes in the profile:

<rdf:Description rdf:about="#Length">
    <rdfs:label xml:lang="en">Length</rdfs:label>
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#DatatypeProperty" />
    <skos:definition xml:lang="en">Unit of length. It shall be a positive value or zero.</skos:definition>
    <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#float" />
    <rdf:type rdf:resource="http://qudt.org/1.1/vocab/unit#Meter" />
    <eq:isCIMDatatype>true</eq:isCIMDatatype>
    <eq:Package>Package_CoreEquipmentProfile</eq:Package>
</rdf:Description>
VladimirAlexiev commented 1 year ago

@Sveino:

Can you please explain what you are trying to achieve with #Length? Define a data property that will hold "length in meters"?

Sveino commented 1 year ago

@VladimirAlexiev Our UML information model and profile should include the relevant type declarations (#Length, #ActivePower, etc.).

This is so that we can communicate well with the domain experts. This information will be exported to the vocabulary described in RDF-Plus. The exported vocabulary, together with the RDF instance data, provides the necessary knowledge graph. The profile/validation will be described in SHACL.

Issue 1: With the current profile, the instance file needs to look like:

             "cim:Conductor.length": {
                "cim:Length.value": 50.0
            },

The preference is to have a simpler exchange that is closer to CIMXML:

             "cim:Conductor.length": 50.0,

Issue 2: With the current profile we cannot map well to QUDT. The profile should provide the following information:

ThomasRanvikEriksen commented 1 year ago

@Sveino @VladimirAlexiev

Hello. JSON does not support decimal, and this might cause issues, for example with Money. The normal approach is then to use a string instead of a float. So should we use a string instead of a float when we have a decimal?

ThomasRanvikEriksen commented 1 year ago

Another question is how to represent currency (NOK, Dollar, etc.)?

Sveino commented 1 year ago

Inspired by the following vocabularies:

The following vocabulary should support the requirement:

@prefix : <http://ucaiug.org/ns/CIM/Wire/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix qudt-unit: <http://qudt.org/vocab/unit#> .
@base <http://ucaiug.org/ns/CIM/Wire> .

<http://ucaiug.org/ns/CIM/Wire> 
    rdf:type owl:Ontology ;
    dcterms:title "CIM Wire Vocabulary"@en .
###  http://ucaiug.org/ns/CIM/Wire/Conductor
:Conductor rdf:type owl:Class ;
        rdfs:label "Conductor"@en ;
        rdfs:subClassOf :ConductingEquipment ;
        skos:definition "Combination of conducting material with consistent electrical characteristics, building a single electrical system, used to carry current between points in the power system.  "@en .
###  http://ucaiug.org/ns/CIM/Wire/Conductor.length
:Conductor.length rdf:type owl:ObjectProperty,
            owl:FunctionalProperty ;
        rdfs:label "length"@en ;
        rdfs:domain :Conductor ;
        rdfs:range [ 
        xsd:float ;
        qudt-unit:M
    ]; 
        skos:definition "Segment length for calculating line section capabilities. Must be positive value or zero."@en .

It is clear that we will manage to create a vocabulary that meets the requirement, so that we can have: "cim:Conductor.length": 50.0,

@VladimirAlexiev please confirm that the vocabulary above will work. @ThomasRanvikEriksen I am sure we would need to fix this in the vocabulary, so you can update the conversion script and the examples.

Sveino commented 1 year ago

Hello. JSON does not support decimal, and this might cause issues, for example with Money. The normal approach is then to use a string instead of a float. So should we use a string instead of a float when we have a decimal?

We will in the vocabulary define the attribute as xsd:decimal (rdfs:range) and follow the JSON recommendation to use string notation.
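
A minimal Python sketch of the string notation, using only the standard library. The key `ex:amount` and the helper names `dump_decimal` and `load_decimal` are hypothetical placeholders; the point is only that a decimal survives the round trip exactly when carried as a string.

```python
import json
from decimal import Decimal


def dump_decimal(key: str, amount: Decimal) -> str:
    """Serialize a decimal as a JSON string so no digits are lost or rounded."""
    return json.dumps({key: str(amount)})


def load_decimal(key: str, text: str) -> Decimal:
    """Read the string back into an exact Decimal."""
    return Decimal(json.loads(text)[key])


text = dump_decimal("ex:amount", Decimal("20.10"))
restored = load_decimal("ex:amount", text)
```

The restored value compares equal to the original and keeps its trailing zero, whereas binary floats cannot represent many decimal fractions exactly (for instance, `0.1 + 0.2 == 0.3` is false in Python).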

Sveino commented 1 year ago

Another question is how to represent currency (NOK, Dollar, etc.)?

In all new models we are not using Money, but Currency and Decimal, so all amounts must be in the same currency:

In this case Currency is exchanged as an enumerator: "nc:PowerBidSchedule.currency": cim:Currency.NOK

We should, however, consider using the EU Commission data set. In CIMXML we cannot exchange blank nodes, but in JSON-LD we will allow it:

"cim:GeneratingUnit.startUpCost": {
    "cim:Money.value": "20.0",
    "cim:Money.unit": {"@id": "cim:Currency.NOK"}
},

This would be like the alternative we were looking at for length, but in this case we would not redefine the currency.

VladimirAlexiev commented 1 year ago

@Sveino This is wrong:

        rdfs:range [ 
        xsd:float ;
        qudt-unit:M
    ]; 
Sveino commented 1 year ago

@Sveino This is wrong:

        rdfs:range [ 
      xsd:float ;
      qudt-unit:M
  ]; 

@VladimirAlexiev What would be correct? The ObjectProperty is float with the unit of meter.

VladimirAlexiev commented 1 year ago

It is wrong

Sveino commented 1 year ago

@VladimirAlexiev Is this correct?

@prefix : <http://ucaiug.org/ns/CIM/Wire/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix qudt-unit: <http://qudt.org/vocab/unit#> .
@prefix qudt: <http://qudt.org/schema/qudt#> .
@base <http://ucaiug.org/ns/CIM/Wire> .

<http://ucaiug.org/ns/CIM/Wire> 
    rdf:type owl:Ontology ;
    dcterms:title "CIM Wire Vocabulary"@en .

###  http://ucaiug.org/ns/CIM/Wire/Conductor
:Conductor rdf:type owl:Class ;
        rdfs:label "Conductor"@en ;
        rdfs:subClassOf :ConductingEquipment ;
        skos:definition "Combination of conducting material with consistent electrical characteristics, building a single electrical system, used to carry current between points in the power system."@en .

###  http://ucaiug.org/ns/CIM/Wire/Conductor.length
:Conductor.length rdf:type owl:ObjectProperty,
            owl:FunctionalProperty ;
        rdfs:label "Length"@en ;
        rdfs:domain :Conductor ;
        rdfs:range :LengthValue ;
        skos:definition "Segment length for calculating line section capabilities. Must be positive value or zero."@en ;
        qudt:unit qudt-unit:M .

###  http://ucaiug.org/ns/CIM/Domain/LengthValue
:LengthValue rdf:type owl:Class,
        qudt:QuantityValue ;
        rdfs:label "Length value"@en ;
        rdfs:comment "A quantity value that represents a length."@en ;
        rdfs:subClassOf qudt:Quantity ;
        qudt:quantityKind qudt:Length ;
        rdfs:range xsd:float ;
        skos:definition "A quantity value that represents a length."@en .
VladimirAlexiev commented 1 year ago

:LengthValue (a owl:Class) cannot have rdfs:range

Sveino commented 1 year ago

So how do we define that the expected value is xsd:float? I see that QUDT is using: sh:datatype xsd:float ; I hope you understand what we would like to achieve.