OBOFoundry / OBOFoundry.github.io

Metadata and website for the Open Bio Ontologies Foundry Ontology Registry
http://obofoundry.org

Should we recommend specifying language tag? #479

Open mcourtot opened 7 years ago

mcourtot commented 7 years ago

As per https://groups.google.com/forum/#!topic/obo-discuss/_x1MpwAjHQw, from Peter Midford:

Entering a string for a definition or a synonym in Protege, I'm confronted with choosing a type (xsd:string) or a language (en), but it seems only one is allowed. Looking around in NBO, which I'm updating, it looks like type wins over language. So, my question is whether specifying type or language is the better practice in the OBO community.

nlharris commented 4 years ago

Relates to #325 and #437

nlharris commented 3 years ago

can someone answer @mcourtot's question?

alanruttenberg commented 3 years ago

I'd vote yes. It's the standard, and doing it removes one thing in the way of adding translations. Alan

jamesaoverton commented 3 years ago

I agree with @alanruttenberg that a language tag is better than xsd:string for labels, definitions, synonyms, etc.

In practise, I see more xsd:strings than language tags but we should push to use language tags.

matentzn commented 3 years ago

I agree. I think xsd:string is essentially redundant - there is no good practical reason to annotate strings with xsd:string. I also believe language tags are the way to go here. Just telling the tooling that will be quite a challenge.

yongqunh commented 3 years ago

I agree as well. A language tag is better than xsd:string.

nlharris commented 3 years ago

does this recommendation still need to be added somewhere?

matentzn commented 3 years ago

I added the Operations Committee tag to just put this up for vote.

I think it's straightforward to vote that we want to use language tags over xsd:string. However, the big question mark is what we want to recommend when comparing "nothing" to @en - there will be a lot of screams of agony if we require all English language labels, definitions etc to get an @en tag. But maybe that's the way to go to break the dominance of the English language in a truly global world! I would vote for it, and I would volunteer helping the Foundry ontologies migrate. However, there are voices (I am sure @cmungall is one of them) that would say that "@en" on all literals will confuse the users :D But even here - we could say: use @en everywhere, and if your users are confused, export a version of your ontology without language tags. So, two votes:

Suggestion: Recommend to use language tags instead of xsd:string, and add the recommendation to the "common format" principle. This will require changes to the obo2owl format parser and some work on the curation side. We won't require language tags across the board (gene names, people's names, xrefs etc, thanks @alanruttenberg ), but ROBOT report will produce a warning if a class in an ontology has a label, synonym or definition that does not have a language tag.

cmungall commented 3 years ago

voices (I am sure @cmungall is one of them) that would say that "@en" on all literals will confuse the users

I am all for not confusing users, but I am not sure how this would confuse users, most of whom interact via OLS etc

All seems reasonable on the surface. I think the challenge is with the tooling, not policy. Provide people tools and they will do the right thing.

The main tooling need is in the obo2owl code. If standard SPARQL updates are provided in ODK/ROBOT then it will be easier for maintainers to migrate. But ideally this would be in the OWL API conversion code. That way there is no confusion from having the edit version be different OWL than the release version. I don't think adding this to the OWL API is so hard but someone needs to manage the migration process.

I also think you need to give clear guidance on how to migrate. Many ontologies may use Latin terms. Doing a replace-all of string to @en will yield incorrect results. Unless we consider that a Latin term is acceptable in formal English-speaking contexts? Or maybe we should require two labels? Or one label plus an exact synonym?

There are methods to infer whether a term is English or Latin, but this is work we would be putting on ontology developers, many of whom have to balance limited resources against actual requests from curators rather than formal ontologists.
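
To make the migration concern above concrete, here is a minimal Python sketch of a targeted retagging step, as opposed to a blind replace-all. It is not the obo2owl, ROBOT, or OWL API behaviour; the function name `retag_line`, its parameters, and the `keep_untagged` exclusion set are hypothetical, and it assumes the ontology has been serialised as N-Triples (one triple per line). Only plain or xsd:string literals on the listed properties are touched, and values the maintainer has listed (for example Latin names) are left alone.

```python
import re

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Matches: subject, predicate IRI, a quoted literal, an optional
# ^^<xsd:string> datatype, and the closing dot. Literals that already
# carry a language tag or another datatype do not match.
_TRIPLE = re.compile(
    r'^(?P<s>\S+)\s+<(?P<p>[^>]+)>\s+'
    r'(?P<lit>"(?:[^"\\]|\\.)*")'
    r'(?:\^\^<' + re.escape(XSD_STRING) + r'>)?'
    r'\s*\.\s*$'
)


def retag_line(line, properties=(RDFS_LABEL,), lang="en",
               keep_untagged=frozenset()):
    """Return the N-Triples line with a language tag added, or unchanged.

    `keep_untagged` holds lexical values (without quotes) that must not
    be tagged, e.g. Latin terms picked out by the maintainer.
    """
    m = _TRIPLE.match(line)
    if m is None or m.group("p") not in properties:
        return line  # not a plain/xsd:string literal on a chosen property
    if m.group("lit")[1:-1] in keep_untagged:
        return line  # explicitly excluded by the caller
    return f'{m.group("s")} <{m.group("p")}> {m.group("lit")}@{lang} .'
```

The exclusion set is the point of the design: deciding which labels are not English stays a curation decision, and the script only applies it.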

pbuttigieg commented 3 years ago

In some of our UN work, language tags are very desirable for obvious reasons, and more interoperability efforts are also asking for multilingual support. Supportive of the language tag.

alanruttenberg commented 3 years ago

I'm for the language tag when appropriate. That's the standard. It isn't appropriate for things that aren't language specific, such as the short name of a gene or protein. I'm not sure it is appropriate for names of people. It isn't appropriate for a value that is an IRI. It may be something that can be added by ROBOT so as not to confuse the poor biologists.

matentzn commented 3 years ago

Ok the vote is now ready: a simple yes or no question: https://github.com/OBOFoundry/OBOFoundry.github.io/issues/479#issuecomment-857467300

cthoyt commented 3 years ago

Ok the vote is now ready: a simple yes or no question: #479 (comment)

Not sure if this is planned to become more common but I like the idea that votes take place via github issues / comments.

matentzn commented 3 years ago

Open action items:

nlharris commented 2 years ago

Looks like the vote so far is 3 yes and 2 "thumbs up" (not mentioned as a voting option, but I think we can assume those are also yeses).

matentzn commented 2 years ago

Ok, the outcome of the vote here is that we start recommending language tags in place of DT(string) for labels.

Next steps:

nataled commented 2 years ago

Not a pushback on the outcome of the vote, but on the process. One of the early decisions regarding new or changed principles is that they are discussed and voted on in an Operations call before wording is added by the EWG. I see that there was discussion of this during a call, but it's not clear to me that a vote was taken during a call. Has that happened?

matentzn commented 2 years ago

Sure, we can raise this one more time at the OFOC! Makes sense. I don't remember exactly what happened with respect to the discussion there, so best to just finalise the decision next Tuesday! Thank you @nataled

ddooley commented 2 years ago

Is there any guidance about what language tags are good, and which might be malformed? We're looking into permitted language variants over in https://github.com/FoodOntology/joint-food-ontology-wg/issues/25

alanruttenberg commented 2 years ago

https://www.w3.org/International/questions/qa-choosing-language-tags It cites https://www.rfc-editor.org/rfc/rfc5646.txt as the most current RFC.
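
As a rough illustration of what "malformed" could mean in practice, here is a minimal Python sketch of a shape check for language tags. It is an assumption-laden simplification, not the RFC 5646 grammar: the name `looks_well_formed` is hypothetical, it accepts only `language[-script][-region][-variant...]`, it is deliberately narrower than the RFC (no extensions, private-use or grandfathered tags, and only 2-3 letter primary subtags), and it does not look anything up in the IANA subtag registry, so a tag that passes may still be unregistered.

```python
import re

# Simplified shape of a language tag; see the caveats in the text above.
_LANGTAG = re.compile(
    r'[A-Za-z]{2,3}'                     # primary language, e.g. "en", "de"
    r'(?:-[A-Za-z]{4})?'                 # optional script, e.g. "Hant"
    r'(?:-(?:[A-Za-z]{2}|[0-9]{3}))?'    # optional region, e.g. "GB", "419"
    r'(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*'  # optional variants
)


def looks_well_formed(tag: str) -> bool:
    """True if `tag` has the simplified language-script-region-variant shape."""
    return _LANGTAG.fullmatch(tag) is not None
```

For example, `looks_well_formed("en-GB")` and `looks_well_formed("zh-Hant-TW")` return `True`, while `looks_well_formed("en_US")` returns `False` because subtags are separated by hyphens, not underscores.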

matentzn commented 1 year ago

@nataled Action items:

https://github.com/OBOFoundry/OBOFoundry.github.io/issues/479#issuecomment-1040038633

hoganwr commented 1 year ago

A couple thoughts, and hopefully I am not stirring a hornet's nest:

  1. Should we be specific about which annotation properties this applies to? rdfs:label, skos:prefLabel, certain things from OMO?
  2. In some cases as Alan R. points out, xsd:string is absolutely the correct type. For example when annotating non-IRI identifiers such as RxCui on classes. Those identifiers are strings of numerals (and I would argue not numbers, but that's not important right now).

nataled commented 1 year ago

@hoganwr based on https://github.com/OBOFoundry/OBOFoundry.github.io/issues/479#issuecomment-1040038633 it should be applied to label only (at least for now).

@matentzn I'm going to need some text along with specific instructions to be provided to users. Best if these instructions include directions for both OWL and OBO formats, but if you don't know the latter I can probably figure it out using an OWL-to-OBO converter.

I should mention that I'm becoming increasingly concerned that we are overloading the principles with directives that are quite ancillary to the principle at hand. This language tag thing, for example, while referring to format, is not really related to the format principle, which is about the overall artifact format (OBO, OWL, JSON, etc) and not about specific fields. I'm thinking we need to separate principles from specific details. I plan on raising this issue in an OFOC call.

hoganwr commented 1 year ago

great point re: principles vs. specifics. As a fellow editorial WG member, this is just one more time where the match between a principle and a particular directive like this one is not obvious. I support discussion of splitting, what it means, and how to implement it.

matentzn commented 1 year ago

There will be some iterations on this on the PR @nataled but you can start with this:

- For rdfs:label and IAO:0000115 annotation assertions, we discourage the use of datatype declarations such as `xsd:string`. It is important to note that `xsd:string` is essentially redundant in OWL/RDF, so "assay" and "assay"^^xsd:string should be the exact same thing. However, a lot of tooling may be confused by the difference, so the xsd:string datatype assertion SHOULD be omitted in general for all annotations, and MUST be omitted for rdfs:label and IAO:0000115.
- To designate rdfs:label and IAO:0000115 annotations in a language other than English, a [valid RDF language tag](https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal) MUST be specified, for example, "Krankheit"@de.
- rdfs:label and IAO:0000115 annotation assertions for English content MAY be annotated with an English language tag. If the ontology chooses not to use language tags, a protege:defaultLanguage assertion MUST be added as an ontology annotation.
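
As a rough illustration of how the three draft rules above might be checked mechanically, here is a minimal Python sketch. Everything in it is hypothetical: the function name `check_annotation`, its parameters and the message strings are made up for the example, and it is not ROBOT's report logic. It reads the third rule as "an untagged label or definition requires a protege:defaultLanguage on the ontology", which is one interpretation of the draft wording, and it cannot tell whether an untagged string is actually English.

```python
# Properties for which the draft says xsd:string MUST be omitted.
RESTRICTED = {"rdfs:label", "IAO:0000115"}


def check_annotation(prop, datatype=None, lang=None,
                     has_default_language=False):
    """Return a list of problem messages for one annotation assertion.

    prop                 -- property CURIE, e.g. "rdfs:label"
    datatype             -- explicit datatype CURIE, or None if absent
    lang                 -- language tag, or None if absent
    has_default_language -- whether the ontology declares
                            protege:defaultLanguage
    """
    problems = []
    if datatype == "xsd:string":
        if prop in RESTRICTED:
            problems.append("error: xsd:string MUST be omitted")
        else:
            problems.append("warning: xsd:string SHOULD be omitted")
    if prop in RESTRICTED and lang is None and not has_default_language:
        problems.append(
            "error: no language tag and no protege:defaultLanguage")
    return problems
```

Under these assumptions, `check_annotation("rdfs:label", lang="de")` returns an empty list, while an untagged definition in an ontology without a default language declaration is reported as an error.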

alanruttenberg commented 1 year ago

@matentzn I'm confused. The votes and discussion suggest use of language tags, but the text you suggest effectively says not to use them for English.

matentzn commented 1 year ago

@alanruttenberg Good point, I forgot to add a note about that. I made it a bit less restrictive now, and added a third bullet on how to deal with English.