NatLibFi / Skosmos

Thesaurus and controlled vocabulary browser using SKOS and SPARQL

Allow non-standard language tags for labels #1085

Open nikokaoja opened 3 years ago

nikokaoja commented 3 years ago

I have several controlled terms that have labels (skos:altLabel) in languages that are not standard, i.e., their language tags do not conform to language tag standards such as RFC 3066.

Is there a way to add non-standard languages to SKOSMOS manually so that they are displayed properly?

osma commented 3 years ago

Can you give an example of such a non-standard language tag? Are these something specific to your use case or potentially something that other Skosmos users may encounter as well?

There is one precedent for this: in 2016, support was added (see #437) for the custom language tag zxx-x-taxon which is used for scientific names for species and other taxons, e.g. "Felis domesticus"@zxx-x-taxon or "Oryza sativa"@zxx-x-taxon. These are not really in a natural language (it's not Latin!) and this tag was already used by some other communities to express scientific names.

In practice, this tag has been added to extra-msgids.twig to make it translatable using the normal gettext mechanisms, and the language specific labels (e.g. "Scientific name" for English) are specified in the translation files.

nikokaoja commented 3 years ago

@osma

These are specific to my use case. The tags do not represent any natural language; instead, each tag represents a particular standard that contains specific naming conventions.

Here is an example, the term is wind_speed which has a specific altLabel HorWdSpd according to IEC standard:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix wep: <http://data.windenergy.dtu.dk/ontologies/wind-energy-parameters/> .

wep:wind_speed a skos:Concept;
  skos:prefLabel "wind_speed"@en;
  skos:altLabel "HorWdSpd"@IEC61400_25_2;
  skos:definition "Speed is the magnitude of velocity. Wind is defined as a two-dimensional (horizontal) air velocity vector, with no vertical component. (Vertical motion in the atmosphere has the standard name upward_air_velocity.) The wind speed is the magnitude of the wind velocity."@en;
  wep:prefUnit "m s-1";
  skos:exactMatch <http://mmisw.org/ont/secoora/parameter/wind_speed>, <https://mmisw.org/ont/cf/parameter/wind_speed>;
  skos:broader wep:AtmosphericParameters;
  skos:inScheme wep: .

Following your suggestion for handling zxx-x-taxon, should I simply add:

{% trans "IEC61400_25_2" %}

to the file extra-msgids.twig so SKOSMOS can resolve it from Jena Fuseki db?
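As a side note on the tag itself: the Turtle grammar defines a language tag (LANGTAG) as letters followed by optional hyphen-separated alphanumeric subtags, so a tag containing underscores, such as IEC61400_25_2, falls outside it and a strict parser would be expected to reject it. The following is a minimal sketch of that syntax check only; it does not validate subtags against the language registry, and the function name is made up for illustration.

```python
import re

# Turtle 1.1 LANGTAG production, without the leading "@":
#   [a-zA-Z]+ ('-' [a-zA-Z0-9]+)*
LANGTAG = re.compile(r"[a-zA-Z]+(-[a-zA-Z0-9]+)*")


def is_turtle_langtag(tag: str) -> bool:
    """Return True if the tag matches the Turtle LANGTAG grammar (syntax only)."""
    return LANGTAG.fullmatch(tag) is not None


for tag in ("en", "zxx-x-taxon", "IEC61400_25_2"):
    print(tag, is_turtle_langtag(tag))
```

The custom tag zxx-x-taxon passes because it uses a private-use subtag after "x", whereas the underscores in IEC61400_25_2 make it fail.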

osma commented 3 years ago

Thanks for the clarification.

Adding the tag to extra-msgids.twig is not really necessary. That file is just a way to tell the gettext command line tools about translatable strings that cannot otherwise be automatically extracted from the PHP and Twig codebase, so that they get added to the .pot template file for translations. The important part is that you add the actual translations (msgid/msgstr pairs) to the .po files for individual languages (at least the ones you use as UI languages) and then recompile the .mo files. See the Translation page in the wiki for more details on how this works.
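To make the msgid/msgstr pairs concrete, here is a small sketch that formats such entries as they would appear in a .po file. The helper name and the English label for the IEC tag are invented for illustration; only the "Scientific name" label comes from this thread.

```python
def po_entry(msgid: str, msgstr: str) -> str:
    """Format one gettext catalogue entry (a msgid/msgstr pair)."""

    def quote(text: str) -> str:
        # PO strings are double-quoted; escape backslashes and quotes.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return f"msgid {quote(msgid)}\nmsgstr {quote(msgstr)}\n"


# Existing precedent from this thread
print(po_entry("zxx-x-taxon", "Scientific name"))
# Hypothetical label for the custom tag
print(po_entry("IEC61400_25_2", "IEC 61400-25-2 name"))
```

Entries like these would be added to the .po file of each UI language and then compiled with the gettext tool msgfmt, for example `msgfmt -o messages.mo messages.po` (the file names here are placeholders).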

However, it occurred to me that you could consider HorWdSpd a notation instead of a label. Notations are often used for technical codes - for example, I've seen them used for chemical codes in agricultural vocabularies (in addition to the obvious use in library classifications such as DDC and UDC). See the SKOS Primer section on notations for some details.

You could define IEC61400_25_2 as an RDF data type that you attach to the literal values, like this:

wep:wind_speed a skos:Concept;
    skos:notation "HorWdSpd"^^wep:IEC61400_25_2 .

# define the data type
wep:IEC61400_25_2 a rdfs:Datatype ;
    rdfs:label "IEC61400"@en .

In my opinion this would be a cleaner way to represent codes that are not in a natural language in a SKOS vocabulary, but it's up to you how to model your data. Skosmos has some support for notations (they are searchable), but the data type on a notation is currently ignored and is not shown in the UI. (Feature request and PR welcome!)
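If there are many such codes, the remodelling could be scripted. The sketch below turns (concept, code, tag) rows into skos:notation statements with a typed literal, in the same shape as the example above. The function name and the row layout are assumptions, as is the choice to mint the datatype in the wep: namespace under the tag's name.

```python
def notation_triple(concept: str, code: str, datatype: str) -> str:
    """Render a skos:notation statement with a typed literal as one Turtle line."""
    # Escape backslashes and double quotes inside the literal.
    escaped = code.replace("\\", "\\\\").replace('"', '\\"')
    return f'{concept} skos:notation "{escaped}"^^{datatype} .'


# (concept, code, non-standard tag) rows taken from the altLabel modelling
rows = [("wep:wind_speed", "HorWdSpd", "IEC61400_25_2")]

for concept, code, tag in rows:
    # Assumption: the datatype is named after the tag, in the wep: namespace.
    print(notation_triple(concept, code, f"wep:{tag}"))
```

The output still needs the @prefix declarations and the rdfs:Datatype definition shown above to form a complete Turtle file.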

nikokaoja commented 3 years ago

@osma many thanks for the feedback.

So far I have implemented this as a custom RDF property, which is nicely displayed in SKOSMOS: http://data.windenergy.dtu.dk/controlled-terminology/wind-energy-parameters/partial_failure_mode_factor

However, your suggestion of creating a data type and using it to annotate the literal values of skos:notation is quite neat! I will therefore post a feature request for it, since I think many engineering domains would benefit from having it.