Sveino / Inst4CIM-KG

Instance of CIM Knowledge Graph
Apache License 2.0

number representation in JSONLD #120

Open VladimirAlexiev opened 3 weeks ago

VladimirAlexiev commented 3 weeks ago

(split off from https://github.com/Sveino/Inst4CIM-KG/issues/49)

in JSONLD it must be 1.23 and not "1.23" for a float.

I wrote in https://github.com/3lbits/CIM4NoUtility/issues/278 about the dangers of JSON numbers. JSON doesn't define the precision or datatype of its numbers, which you can verify in the JSON-LD Playground:

https://json-ld.org/playground/

1) Convert this to N-Quads:

{
  "@context": {
    "cim": "http://iec.ch/TC57/CIM100#"
  },
  "@graph": [{
    "@id": "urn:uuid:9d58e5bb-834c-4faa-928c-7da0bb1497d9",
    "cim:Equipment.normallyInService": true,
    "cim:Reactance.value": [0.123, 1.00000000000]
    }]}

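The result is "0.123"^^xsd:double and "1"^^xsd:integer. So depending on the specific value, the same property gets a different datatype: a JSON number with no fractional part is converted to xsd:integer, anything else to xsd:double.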
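A minimal Python sketch of the default mapping seen in step 1, following my reading of the JSON-LD 1.1 rules for untyped native values. The function name `default_datatype` is hypothetical, and the sketch only picks the datatype; it does not produce the lexical form a real JSON-LD processor would emit.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"


def default_datatype(value):
    """Return the XSD datatype IRI an untyped native JSON value is given.

    Hypothetical helper: mirrors the rule "no non-zero fractional part
    means integer, otherwise double" for values without an explicit @type.
    """
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        # 1.00000000000 parses to 1.0, which has no fractional part
        if value.is_integer() and abs(value) < 1e21:
            return XSD + "integer"
        return XSD + "double"
    raise TypeError(f"not a native JSON scalar: {value!r}")


values = json.loads("[0.123, 1.00000000000, true]")
datatypes = [default_datatype(v) for v in values]
for v, dt in zip(values, datatypes):
    print(v, dt)
```

The trailing zeros in 1.00000000000 are lost as soon as the JSON text is parsed, so the writer's intent ("this is a decimal value") cannot survive without an explicit datatype.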

2) Only if we add specific datatypes (in the context):

{
  "@context": {
    "cim": "http://iec.ch/TC57/CIM100#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "cim:Equipment.normallyInService": {"@type": "xsd:boolean"},
    "cim:Reactance.value": {"@type": "xsd:float"}
  },
  "@graph": [{
    "@id": "urn:uuid:9d58e5bb-834c-4faa-928c-7da0bb1497d9",
    "cim:Equipment.normallyInService": true,
    "cim:Reactance.value": [0.12345678901234567890, 1.00000000000]
    }]}

We get the datatypes we want: "1.234567890123457E-1"^^xsd:float and "1"^^xsd:float. But 1.234567890123457E-1 has 16 significant digits, far more than a float (single precision, about 7 significant digits) can hold. Internally the value was processed as a double and only labelled with the float datatype on output.
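To see how much of that literal a real single-precision float keeps, here is a small sketch using only the Python standard library. The helper name `to_float32` is hypothetical; it rounds a Python float (an IEEE 754 double) to the nearest single-precision value via `struct`.

```python
import json
import struct


def to_float32(x: float) -> float:
    """Round a double to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# A JSON parser reads the literal as a double
as_double = json.loads("0.12345678901234567890")
# What an actual xsd:float could store
as_float32 = to_float32(as_double)

print(repr(as_double))        # digits carried by the double
print(repr(as_float32))       # float32 value, widened back to double
print(f"{as_float32:.9g}")    # 9 significant digits identify a float32
print(abs(as_double - as_float32))  # precision lost by a true float
```

A processor that honoured xsd:float strictly would emit something close to the third line, not the 16-digit value shown above.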

3) More importantly, you cannot control whether to enclose numbers in quotes on output. I made tests with some tools (I'll commit the details to git):

| tool | reactance | normallyInService |
|---|---|---|
| GraphDB | "0.123" xsd:float | "true" xsd:boolean |
| Jena riot | "0.123" xsd:float | "true" xsd:boolean |
| ttl2json | "0.123" xsd:float | true |
| Virtuoso context | 0.1230000033974648 | true |
| Virtuoso plain | 0.1230000033974648 | true |

Note: in all cases we didn't specify a context to use (because it's not ready). If we do, then more tools may output values in quotes.


What is the harm if JSON numbers are enclosed in quotes?

Sveino commented 3 weeks ago

Does this example not show that we "must" allow for both?

My understanding is: The JSON-LD standard specifies that datatype declarations can be used to define the type of literal values, such as integers, dates, or custom data types, by adding a "@type" key within the context or within each value. This is optional in JSON-LD, as JSON-LD tries to be flexible with datatypes to accommodate varied data structures and maximize compatibility with JSON.

From this we have the following options:

  1. Requiring Datatype Declarations
  2. Not Allowing Datatype Declarations
  3. Allowing Both With and Without Datatypes

As we have the context as part of the profile, I do not see that we have to require datatype declarations (1). However, interoperability is fundamental, which points towards requiring them. Personally, I would recommend (3): allowing both.
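If option (3) is chosen, every consumer has to accept quoted and unquoted values. A minimal sketch of such a tolerant reader in Python follows. It is an assumption-laden illustration, not a JSON-LD processor: it expects compacted documents with a top-level `@graph` and the `cim:` prefix used in the examples above, and the function name `read_numbers` is hypothetical. Passing `parse_float=Decimal` to `json.loads` keeps the digits exactly as written instead of rounding them to a double.

```python
import json
from decimal import Decimal


def read_numbers(doc_text, prop):
    """Collect the values of `prop` as Decimal, whether quoted or not."""
    # Keep the lexical form of unquoted numbers instead of converting to float
    doc = json.loads(doc_text, parse_float=Decimal, parse_int=Decimal)
    out = []
    for node in doc["@graph"]:
        raw = node.get(prop, [])
        items = raw if isinstance(raw, list) else [raw]
        for item in items:
            if isinstance(item, dict):  # value object: {"@value": ..., "@type": ...}
                item = item["@value"]
            out.append(Decimal(item))   # accepts both Decimal and str
    return out


PROP = "cim:Reactance.value"
quoted = """{"@graph": [{"@id": "urn:uuid:9d58e5bb-834c-4faa-928c-7da0bb1497d9",
  "cim:Reactance.value": ["0.123", {"@value": "0.123", "@type": "xsd:float"}]}]}"""
unquoted = """{"@graph": [{"@id": "urn:uuid:9d58e5bb-834c-4faa-928c-7da0bb1497d9",
  "cim:Reactance.value": 0.123}]}"""

print(read_numbers(quoted, PROP))
print(read_numbers(unquoted, PROP))
```

The cost of allowing both is that this normalisation step moves into every reader; requiring one form would move it into the writers instead.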

Virtuoso's behaviour is generally not accepted. However, I think the CIM standard defines a tolerance that this deviation would fall within.
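A quick way to check the size of Virtuoso's deviation is a relative-tolerance comparison. The tolerance the CIM standard defines is not quoted in this thread, so the threshold below is an assumption: the half-ULP relative bound of single precision, 2^-24.

```python
import math

# Assumed tolerance: worst-case relative rounding error of a float32.
# This is NOT the value from the CIM standard, which is not quoted here.
FLOAT32_REL_TOL = 2.0 ** -24

expected = 0.123               # value written in the source data
virtuoso = 0.1230000033974648  # value reported for Virtuoso in the table above

rel_err = abs(virtuoso - expected) / abs(expected)
within = math.isclose(virtuoso, expected, rel_tol=FLOAT32_REL_TOL)

print(f"relative error: {rel_err:.3e}")
print(f"within float32 rounding tolerance: {within}")
```

Under this assumed threshold the Virtuoso value compares as equal, which fits a float32 that was widened to double on output; an exact string or `==` comparison would still fail.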