This section was provoked by pondering the difference between `cim:String` and `profcim:StringFixedLanguage`.

AFAIK, CIM does not allow (and has not considered?) multilinguality (`rdf:LangString`, but that doesn't count).
E.g. `cim:IdentifiedObject.name` doesn't allow multiple values.

I think it would be better to allow multiple values but impose a `sh:uniqueLang` constraint (`skos:prefLabel` has the same restriction). In that way CIM data could accommodate multilinguality.

E.g. looking at some random properties:
- `cim:IdentifiedObject.mRID`: always `string`
- `cim:IdentifiedObject.description`: `string` or `langString`
- `cim:IdentifiedObject.name`: `string` or `langString`
- `nc:AssessedElementWithContingency.mRID`: always `string`
- `nc:AssessedElement.normalTargetRemainingAvailableMarginJustification`: `string` or `langString`
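
To make the `sh:uniqueLang` idea above concrete, here is a minimal sketch in plain Python. It is not SHACL and not part of any CIM tooling: the `(text, lang)` tuple model, the function name `duplicate_langs`, and the sample names are all assumptions made for illustration. It only shows the intended rule: several values are fine as long as no language tag is used twice.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# A literal is modelled as (lexical form, language tag or None).
# This tuple shape is an assumption made for the sketch.
Literal = Tuple[str, Optional[str]]


def duplicate_langs(values: Iterable[Literal]) -> List[str]:
    """Return the language tags used by more than one value, sorted.

    Mirrors the intent of sh:uniqueLang: untagged strings are ignored,
    and tags are compared case-insensitively (a choice of this sketch).
    """
    counts = Counter(lang.lower() for _, lang in values if lang)
    return sorted(tag for tag, n in counts.items() if n > 1)


# Hypothetical values of cim:IdentifiedObject.name for one object.
ok_names = [("Substation North", "en"), ("Umspannwerk Nord", "de")]
bad_names = [("Substation North", "en"), ("North Substation", "EN")]

print(duplicate_langs(ok_names))   # no language is repeated
print(duplicate_langs(bad_names))  # "en" is used by two values
```

Note that a rule like this says nothing about untagged strings, so a separate cardinality check would still be needed if at most one plain string is wanted.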
Unfortunately, `cim:String` is used even for props that should not allow `langString`, i.e. no distinction is made between these two cases:
- Names/descriptions could be `string` or `langString`
- But identifiers should only be `string`

So for the time being I think CIM implicitly forbids the use of `langString`: if you cannot have multiple `uniqueLang` values, there's not much use for lang tags. Also, allowing lang tags may cause some disturbance in some receiving systems.

So I'll map `cim:String` to `xsd:string` `rdf:PlainLiteral`
The EU eProcurement Ontology allows multilingual data, and used `rdfs:Literal`. But that datatype is way too broad, so I raised an issue: https://github.com/OP-TED/ted-rdf-mapping/issues/407

The datatype hierarchy is like this: `rdfs:Literal > rdf:PlainLiteral > (xsd:string, rdf:langString)`. What a text field needs to be mapped to depends on its nature:
- `xsd:string` is appropriate for codes that are never translated to multiple langs
- `rdf:langString` is appropriate for texts that are always translated to multiple langs (if not now, then in the future): so a lang tag is required
- `rdf:PlainLiteral` is appropriate for texts that may but don't have to be translated, i.e. lang tag is not required. It is defined at https://w3.org/TR/rdf-plain-literal , and means `string` or `langString`.

If you want `cim:String` to allow langStrings, then we should map it to `rdf:PlainLiteral`.
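
The three mapping options above differ only in whether a lang tag is forbidden, required, or optional. A small Python sketch of that rule follows; the function `conforms` and the string constants are hypothetical names chosen for this illustration, and the check looks only at the presence of a lang tag, not at full RDF datatype semantics.

```python
from typing import Optional

XSD_STRING = "xsd:string"
RDF_LANG_STRING = "rdf:langString"
RDF_PLAIN_LITERAL = "rdf:PlainLiteral"


def conforms(lang: Optional[str], declared: str) -> bool:
    """Check a text value's lang tag against the declared datatype.

    lang is the value's language tag, or None for an untagged string.
    """
    has_lang = bool(lang)
    if declared == XSD_STRING:
        return not has_lang   # codes/identifiers: lang tag forbidden
    if declared == RDF_LANG_STRING:
        return has_lang       # translated texts: lang tag required
    if declared == RDF_PLAIN_LITERAL:
        return True           # either form is accepted
    raise ValueError(f"unsupported datatype in this sketch: {declared}")


# An mRID-like identifier must stay untagged under xsd:string.
print(conforms(None, XSD_STRING))
# A description tagged "en" is rejected by xsd:string ...
print(conforms("en", XSD_STRING))
# ... but accepted once the property is mapped to rdf:PlainLiteral.
print(conforms("en", RDF_PLAIN_LITERAL))
```

Under this reading, mapping `cim:String` to `rdf:PlainLiteral` is the only one of the three options that accepts both existing untagged data and future tagged data.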