ncbo / bioportal_web_ui

A Rails application for biological ontologies
http://bioportal.bioontology.org/

Add a natural language selector to the ontology submission form #311

Closed jvendetti closed 3 months ago

jvendetti commented 4 months ago

Give users the ability to select the language of the content of their ontology when they add or edit ontology submissions. The available options should match the ISO 639-1 standard.

Example options for the select box:

['English', 'en'],
['French', 'fr']

User selections will be recorded in the naturalLanguage attribute on OntologySubmission.
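
As a rough illustration of the select box options described above, here is a minimal Ruby sketch. The constant `ISO_639_1_LANGUAGES` and the methods `natural_language_options` and `valid_natural_language?` are hypothetical names, not part of the bioportal_web_ui code base, and the language list is a small subset of ISO 639-1 chosen for the example.

```ruby
# Hypothetical helper code for illustration; not taken from bioportal_web_ui.
# A small subset of ISO 639-1 codes, keyed by two-letter code.
ISO_639_1_LANGUAGES = {
  'en' => 'English',
  'fr' => 'French',
  'de' => 'German',
  'es' => 'Spanish'
}.freeze

# Returns [label, value] pairs sorted by label, which is the shape
# that the Rails `options_for_select` helper accepts.
def natural_language_options(languages = ISO_639_1_LANGUAGES)
  languages.map { |code, name| [name, code] }.sort_by(&:first)
end

# Checks that a submitted value is one of the known two-letter codes
# (case-insensitive), before it is stored in naturalLanguage.
def valid_natural_language?(code, languages = ISO_639_1_LANGUAGES)
  languages.key?(code.to_s.downcase)
end
```

In a Rails view, the result could be passed to `options_for_select(natural_language_options, selected_code)`, where `selected_code` stands for whatever value the submission currently holds.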

This is a necessary enhancement to accommodate https://github.com/ncbo/bioportal-project/issues/304.

jonquet commented 4 months ago

You can see here the restrictions for the naturalLanguage property including URIs (let's think semantic web!) for the values:

https://github.com/ontoportal-lirmm/ontologies_linked_data/blob/master/config/schemes/ontology_submission.ym

jvendetti commented 3 months ago

Hi Clement - I'm aware of AgroPortal's choice to use Lexvo.org URIs for language representation. I broached this topic here at the lab, and also with the team at Berkeley. There was resistance to adopting Lexvo, mainly because others felt that it doesn't naturally align with a variety of standards that use string-based representations, e.g.:

JSON-LD: "@language": "en"

RDF: "Hello World"@en

SHACL: sh:languageIn ( "en" )

ShEx: ex:name rdf:langString @en ;

jonquet commented 3 months ago

Well, not exactly. Here you're giving me examples of syntax for expressing that a label is in a given language (well, its actual meaning is "this is the label used in this language to identify this thing").

On my side, I am talking about identifying a language in order to assign it as the value of an RDF property. It's not the same. Basically we are contrasting the following triple:

ONTO-URI omv:naturalLanguage "en"

with the triple

ONTO-URI omv:naturalLanguage http://lexvo.org/id/iso639-1/en

The latter triple uses a URI to identify the language, which means a machine can use it too.

We are building a semantic web application, and semantics are built out of URIs, not conventional ISO standard codes.

You can still counter-argue that in OMV, omv:naturalLanguage is indeed defined as a DataProperty and not an ObjectProperty. But we have not paid much attention to the logical coherence of the metadata in BioPortal (in fact, the property you will have to use to stay consistent with BioPortal will certainly be metadata:omvnaturallanguage), which in the end is a totally different property from the OMV one and merely makes some kind of reference to OMV. Looking ahead, OMV has been "adopted" by MOD, and MOD recommends using dct:language, which does not define a formal range (only a suggestion via "range includes").

So I still believe we should use a URI for as many metadata values as possible (license, ontology type, syntax, representation language, etc.).

If you don't like Lexvo, consider the LOC URIs e.g., http://id.loc.gov/vocabulary/iso639-1/en
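
To make the two representations concrete, the following Ruby sketch converts between a two-letter ISO 639-1 string and a URI under either of the base URLs mentioned in this thread. The constant and method names (`language_uri`, `language_code`) are hypothetical, and the sketch assumes only that both vocabularies append the lowercase two-letter code to a fixed base, as the example URIs above do.

```ruby
# Hypothetical helpers for illustration only.
# Base URLs are taken from the example URIs quoted in this thread.
LEXVO_ISO639_1_BASE = 'http://lexvo.org/id/iso639-1/'
LOC_ISO639_1_BASE   = 'http://id.loc.gov/vocabulary/iso639-1/'

# Builds a language URI from a two-letter code, e.g. "en".
def language_uri(code, base = LEXVO_ISO639_1_BASE)
  unless code.to_s.match?(/\A[a-z]{2}\z/i)
    raise ArgumentError, "not a two-letter code: #{code.inspect}"
  end
  "#{base}#{code.to_s.downcase}"
end

# Recovers the two-letter code from a Lexvo or LOC URI.
# Returns nil when the URI does not match either base.
def language_code(uri)
  [LEXVO_ISO639_1_BASE, LOC_ISO639_1_BASE].each do |base|
    next unless uri.start_with?(base)
    code = uri.delete_prefix(base)
    return code if code.match?(/\A[a-z]{2}\z/)
  end
  nil
end
```

Under that assumption the mapping is mechanical in both directions, so the choice between string and URI mainly affects what is stored and exposed as the property value.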

jvendetti commented 3 months ago

@cmungall @matthewhorridge @martinjoconnor

Would any of you care to comment on the topic of whether to represent ontology language content in BioPortal as Lexvo URIs vs two-letter ISO 639-1 strings?

Should this be a separate issue in another location?

I'm trying my best to be accommodating, while at the same time moving BioPortal forward on this without a lot of delay.