ncbo / bioportal-project

Serves to consolidate (in Zenhub) all public issues in BioPortal
BSD 2-Clause "Simplified" License

Add ontology language #304

Open caufieldjh opened 4 months ago

caufieldjh commented 4 months ago

A sub-issue of #296.

I have curated language tags for all BioPortal ontologies as of today (Feb 12 2024); I just need a place to include them in each ontology entry.

Out of 1107 ontologies (may be a few duplicates in there): 1092 are in English (but some are bilingual), 11 in French, 5 in Chinese, 3 in Japanese, 3 in Spanish, 2 in Persian, and 1 in German.

jvendetti commented 4 months ago

@jonquet and/or @syphax-bouazzouni - do I understand correctly that in AgroPortal you chose to store ontology language in the naturalLanguage metadata attribute?

syphax-bouazzouni commented 4 months ago

Yes, exactly, you can see the values here: https://data.agroportal.lirmm.fr/submissions?display=naturalLanguage&disaply_links=false&display_context=false&apikey=1de0a270-29c5-4dda-b043-7c3580628cd5

jvendetti commented 4 months ago

Thank you @syphax-bouazzouni , though I'm wondering why this attribute was chosen instead of using the hasRepresentationLanguage attribute from MOD?

jonquet commented 4 months ago

Hello @jvendetti. A bit of history first: we built MOD (then v1.4) and the AgroPortal metadata model in parallel in 2017. Whenever a property already existed in BioPortal, we did not change it, to stay backward compatible. And we pushed MOD 1.4 to rely as much as we could on OMV, as BioPortal did in the past. Later, in MOD2, we decided (with a larger group of people, and envisaging better commitment and engagement of communities, RDA, EOSC, OntoPortal, etc.) that the properties from OMV would be dropped and readopted within the MOD namespace, and sometimes renamed slightly (to reflect the fact that MOD is not only for ontologies but for semantic artefacts in general).

So, to encode the natural language of the ontology, OMV originally proposed the omv:naturalLanguage property (OMV def: The language of the content of the ontology, i.e. English, French, etc.). This is still the one we use today.

Note that OMV also proposed the omv:hasOntologyLanguage property (OMV def: The ontology language), which was already there in BioPortal to encode the "format" (I don't really like this word, and neither did OMV, which used two properties: one for the representation language and one for the syntax; but BioPortal used only one in Natasha's time). Later, in 2020, omv:hasOntologyLanguage was adopted by MOD2 and renamed mod:hasRepresentationLanguage. So this property is the one we recommend using (MOD def: A representation language that is used to create an ontology (e.g., OWL, RDF-S, SKOS).), even if BioPortal/AgroPortal still use the old one.

(I will reply on the notion of "compliance with MOD" in your other issue: https://github.com/ncbo/ontologies_linked_data/issues/192)

jvendetti commented 4 months ago

OK, thank you @jonquet for the historical perspective. This clears things up for me. I suppose for the sake of expediency, we should proceed with storing the language tags that @caufieldjh curated in the naturalLanguage attribute on the OntologySubmission class. My presumption is that we'd need to do that programmatically on the latest "ready" submission object for each ontology.
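As a rough illustration of the programmatic update described above, here is a minimal Python sketch that only prepares the request for one ontology and sends nothing. The base URL, the submission path, the use of PATCH, and the shape of the `naturalLanguage` value are assumptions to check against the actual API; `build_language_patch`, `EXAMPLE_ONT`, and the submission id are hypothetical names used for illustration.

```python
import json

# Assumption: BioPortal REST base URL and submission path; verify before use.
API_BASE = "https://data.bioontology.org"


def build_language_patch(acronym, submission_id, codes):
    """Return (url, json_body) for a PATCH setting naturalLanguage.

    Nothing is sent over the network; this only prepares the request.
    """
    url = f"{API_BASE}/ontologies/{acronym}/submissions/{submission_id}"
    # De-duplicate and sort so repeated runs produce identical payloads.
    body = json.dumps({"naturalLanguage": sorted(set(codes))})
    return url, body


# Hypothetical curated entry: a bilingual ontology whose latest "ready"
# submission has id 3.
url, body = build_language_patch("EXAMPLE_ONT", 3, ["fr", "en", "en"])
print(url)
print(body)
```

Keeping the request construction separate from the sending step makes it possible to review the full list of planned changes before anything is written to production.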

A possible issue with this attribute sticking - we don't expose a way in the Rails application for users to choose the natural language of their ontology when they manually create new submissions.

jonquet commented 4 months ago

Indeed, this is a property we added to the submission model. On your second point: yes, if you add a property, the submission form needs a corresponding field for it.

The UI developed in https://github.com/ontoportal-lirmm/bioportal_web_ui (until v2.4) should be useful:

[Screenshot, 2024-02-22 at 14:06:59]

Note that we no longer recommend iso639-3 (too many values, and no one uses them). We recommend iso639-1 values, still from Lexvo, and rely on them in the multilingual support code.
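To make the iso639-1 recommendation concrete, here is a small Python sketch that maps the seven languages counted earlier in this thread to their two-letter ISO 639-1 codes and builds a Lexvo-style URI for each. The URI prefix is an assumption about how Lexvo identifies iso639-1 languages and should be checked against the values AgroPortal actually stores; `lexvo_uri` is a hypothetical helper.

```python
# ISO 639-1 codes for the languages mentioned in this thread.
ISO639_1 = {
    "English": "en",
    "French": "fr",
    "Chinese": "zh",
    "Japanese": "ja",
    "Spanish": "es",
    "Persian": "fa",
    "German": "de",
}

# Assumption: Lexvo URI pattern for ISO 639-1 languages; verify before use.
LEXVO_PREFIX = "http://lexvo.org/id/iso639-1/"


def lexvo_uri(language_name):
    """Return the assumed Lexvo URI for a language name listed in ISO639_1."""
    # A KeyError here means the curated tag needs a new entry in the table.
    return LEXVO_PREFIX + ISO639_1[language_name]


for name in ISO639_1:
    print(name, lexvo_uri(name))
```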

jvendetti commented 2 months ago

@caufieldjh - all of the necessary modifications have been implemented / released to production to accommodate addition of the language tags that you curated. I think it would be easiest for me to do this programmatically on my end since you need to have admin privileges on ontologies in order to edit their latest submissions. Is this agreeable to you?

caufieldjh commented 2 months ago

Sure, I think that should be OK.