information-artifact-ontology / ontology-metadata

OBO Metadata Ontology
Creative Commons Zero v1.0 Universal

Request for has_acronym #135

Closed csbjohnson closed 8 months ago

csbjohnson commented 1 year ago

IRI

No response

Label

Acronym (has_acronym)

Definition of the property

I'd like to request the creation of has_acronym, to be able to specify when an acronym is being used.

Would truly appreciate your time and assistance. Thank you!

Best, Claudia Sánchez-Beato Johnson

Parent property

No response

What is the range of the property in question?

xsd:string

Examples of use

DOID:0050214 Name: Lambert-Eaton myasthenic syndrome

Synonyms label: Eaton-Lambert syndrome [EXACT], Lambert-Eaton syndrome [EXACT], LEMS [EXACT]

Eaton-Lambert syndrome [EXACT] and Lambert-Eaton syndrome [EXACT] are synonyms: different words for the same thing. LEMS, however, is not a different word; it is an abbreviation of the name already given for the DOID term. Representing an acronym as a synonym is therefore not accurate, and it would be truly beneficial to be able to classify it with its own label.

Motivation to add

To represent acronyms by their own label rather than as synonyms, since an acronym is not a different word but an abbreviation of the same term.

ORCID, ROR or Wikidata identifier of the contributor

/

OMO Checklist

alanruttenberg commented 1 year ago

This would be a great addition, IMO. But what is the intended domain of the relation? Seems to me that it should be related as an annotation to the fully spelled out label. There's a question in my mind as to whether there should be a direct relation as an alternative label of the term. But to the extent that an acronym could be of a label that is any of a variety of types of synonyms, I don't think we want to replicate the various synonym properties specialized to acronyms.

matentzn commented 1 year ago

Thank you for the request!

Generally, we consider acronyms, abbreviations, etc "synonym types".

There are several such synonym types in OMO already, and as discussed here, we should definitely add acronym to the list, and I am happy to do it.

However, you would not proceed to say:

DOID:0050214 "has acronym" LEMS

If you use this pattern, you would say (pseudocode, given a new synonym type OMO:123 "acronym"):

DOID:0050214 "has exact synonym" LEMS [oio:SynonymType=OMO:123]
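A minimal sketch of the pattern above, in Python with plain data structures rather than a real OWL library. The `Synonym` class and `split_by_type` helper are hypothetical names for illustration, and `OMO:123` is only the placeholder ID from the pseudocode. It suggests how the synonym type annotation could still let a consumer pull acronyms apart from other synonyms:

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder ID copied from the pseudocode above; not a real OMO identifier.
ACRONYM_TYPE = "OMO:123"


@dataclass(frozen=True)
class Synonym:
    """One synonym assertion plus its optional synonym type annotation."""
    term: str                           # CURIE of the annotated term
    predicate: str                      # e.g. "hasExactSynonym"
    text: str                           # the synonym string itself
    synonym_type: Optional[str] = None  # annotation on the synonym assertion


synonyms = [
    Synonym("DOID:0050214", "hasExactSynonym", "Eaton-Lambert syndrome"),
    Synonym("DOID:0050214", "hasExactSynonym", "Lambert-Eaton syndrome"),
    Synonym("DOID:0050214", "hasExactSynonym", "LEMS", ACRONYM_TYPE),
]


def split_by_type(items, synonym_type):
    """Return (matching, other) synonym texts for the given synonym type."""
    matching = [s.text for s in items if s.synonym_type == synonym_type]
    other = [s.text for s in items if s.synonym_type != synonym_type]
    return matching, other


acronyms, spelled_out = split_by_type(synonyms, ACRONYM_TYPE)
```

Here `acronyms` holds only the strings tagged with the acronym type, while all three strings remain exact synonyms for search purposes.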

Find some examples here:

https://api.triplydb.com/s/FIf2uoYo9

Another closely related issue is https://github.com/information-artifact-ontology/ontology-metadata/issues/122

Would you be ok with using this pattern as well?

csbjohnson commented 1 year ago

Thank you for the pattern provided. Encoding it that way would still not allow a separation of true synonyms from acronyms. I'd like to keep my request for "has_acronym synonym", as I believe it is a direct and more accessible way to separate true synonyms from acronyms. Without it, it wouldn't be user friendly: acronyms would lack visibility and be more complicated to query.

This addition would serve great value to the community.

Thank you for your time.

Best, Claudia Marie

matentzn commented 1 year ago

> would still not allow a separation of true synonyms from acronyms

How so? They are clearly separated, albeit in a bit of a cumbersome manner..

> it wouldn't be user friendly

This is true.. But I don't know then how exactly I should play my role here as a shepherd for OMO.

This is the dilemma:

  1. There is a pattern in OMO for synonym types
  2. A user (you) does not like the pattern (for a good reason!) and suggests to implement another parallel pattern (it is, after all, a pattern, as it would open the floodgates for "has abbreviation", "has layperson synonym" etc).

So either I violate my community-entrusted role of evolving a coherent OMO with a single way of doing things, or I violate my dedication to you, the user, both of which are equally important to me!

So unless you can convince me that an acronym is actually not a synonym at all, I am afraid you will have to find someone in OBO to muscle your request past me (which is definitely possible!). Or else tell me how to solve my conundrum. :( Sorry, sorry.

matentzn commented 11 months ago

@csbjohnson - did you have any further thoughts on what I was saying? Disagreement with something specific? Ideas on how to move forward? I really don't like when community members such as yourself take the time to reach out to an ontology and make a request (which is really great), just to be rebuffed by technical people like myself for formal reasons. I just need a good reason not to stick to the past design decision!

csbjohnson commented 11 months ago

Hi @matentzn, thank you for your time. I believe that this implementation is valuable.

Please see following definitions of acronym and synonym:

-Acronym: An acronym is a word or name formed as an abbreviation from the initial components in a phrase or a word, usually individual letters (as in NATO or laser) and sometimes syllables (as in Benelux).

-Synonym: A synonym is a word or phrase that means exactly or nearly the same as another word or phrase in the same language. Some lexicographers claim that no synonyms have exactly the same meaning (in all contexts or social levels of language) because etymology, orthography, phonic qualities, ambiguous meanings, usage, and so on make them unique. Different words that are similar in meaning usually differ for a reason: feline is more formal than cat; long and extended are only synonyms in one usage and not in others (for example, a long arm is not the same as an extended arm).

-Examples in which the requested has_acronym addition would prevent inaccuracies:

cmungall commented 11 months ago

We should really have this in the FAQ. For better or worse, OMO/oboInOwl takes a liberal interpretation of "nearly the same as". Our "broad synonym" properties and so on don't make any sense with a stricter reading of "synonym".

However, these properties have been standard for some 25 years or so in ontologies like GO, DO, Uberon, etc

matentzn commented 11 months ago

@csbjohnson thanks for the details.

@cmungall (and others), what is your opinion? Is an acronym a synonym (thereby falling under the purview of the axiom annotating pattern) or is an acronym sufficiently different conceptually from a synonym to justify a separate property?

bpeters42 commented 11 months ago

I am probably not understanding something. Why not introduce 'has acronym' as a sub-property of 'alternative term', and sibling of the existing 'has broad synonym' etc.? For those not caring about the distinction, just using any type of 'alternative term' relationship would work? Is the problem because we can't do children of annotation properties? I dimly recall that but am not up to speed.

matentzn commented 11 months ago

@bpeters42 I thought I made this clear above: because there are then two different ways to say that something is an acronym, see for example this comment: https://github.com/information-artifact-ontology/ontology-metadata/issues/135#issuecomment-1595796162

bpeters42 commented 11 months ago

Sorry Nico; I guess I understand better now. At least enough to know that this is not an issue that I have a strong enough opinion on to want to decide the outcome. In our application, we typically allow all alternative labels equally and often introduce custom label preferences that come with the specific context in which terms are used. I guess I am trying to say that universally agreed classifications into exact / broad / acronym / synonym / abbreviation / nicknames etc. might not be feasible. I also suspect that I am not helpful, so please ignore me!

graybeal commented 11 months ago

> @csbjohnson thanks for the details.
>
> @cmungall (and others), what is your opinion? Is an acronym a synonym (thereby falling under the purview of the axiom annotating pattern) or is an acronym sufficiently different conceptually from a synonym to justify a separate property?

One anecdotal response: An acronym is not a synonym (and vice-versa). They follow different grammatical, syntactical, and conceptual rules. When someone says "I need a synonym for X", I think very few people will answer with an acronym. (Of course, things change, maybe younger people would!)

Both acronyms and synonyms may be suitable labels for something, but so might codes or icons—it doesn't make them synonyms.

matentzn commented 11 months ago

Alright, thank you all for the discussion. While I do not exactly agree with the line drawn by you all between synonym and acronym (IMO they are both literals used to refer to a conceptual entity, regardless of whether the word "acronym" is perceived as a "synonym" to the term "synonym" or not :-)), I do see the practical value in separating synonyms from acronyms, for example during QC time (as @csbjohnson points out in various examples, they can overlap significantly and we have numerous examples to support that assumption - and we want to check that we do not assign the same exact synonym to multiple terms).
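To make the QC point concrete, here is a rough Python sketch of a duplicate check that can skip acronyms. The `EX:` terms and their strings are made-up toy data, `OMO:123` is the placeholder acronym type used earlier in this thread, and `duplicated_exact_synonyms` is a hypothetical helper, not an existing tool:

```python
from collections import defaultdict

ACRONYM_TYPE = "OMO:123"  # placeholder ID from earlier in the thread

# (term, exact synonym text, synonym type or None); toy data for illustration
rows = [
    ("EX:1", "LEMS", ACRONYM_TYPE),
    ("EX:2", "LEMS", ACRONYM_TYPE),
    ("EX:1", "shared name", None),
    ("EX:3", "shared name", None),
    ("EX:3", "unique name", None),
]


def duplicated_exact_synonyms(rows, ignore_types=frozenset()):
    """Map each exact synonym used by more than one term to those terms.

    Synonyms whose type is in ignore_types are left out of the check.
    """
    terms_by_text = defaultdict(set)
    for term, text, synonym_type in rows:
        if synonym_type in ignore_types:
            continue
        terms_by_text[text.lower()].add(term)
    return {
        text: sorted(terms)
        for text, terms in terms_by_text.items()
        if len(terms) > 1
    }


all_clashes = duplicated_exact_synonyms(rows)
non_acronym_clashes = duplicated_exact_synonyms(rows, {ACRONYM_TYPE})
```

With acronyms ignored, only the clash on the spelled-out name is reported. Whether the check is driven by a synonym type or by a separate property, the QC step needs some way to tell the two kinds of string apart.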

I have made a PR: https://github.com/information-artifact-ontology/ontology-metadata/pull/138

Please provide your feedback, and any ORCIDs you want me to add as "contributors".

cmungall commented 11 months ago

[writing this fairly quickly; these same arguments have been rehashed again and again, apologies for typos/repetition]

As always with any ontology concept, we all like to focus on the string used to describe the concept, rather than the concept itself, and this applies to a metadata ontology as much as a domain ontology.

I fully accept that "synonym" was a bad primary label to choose for the concept under discussion here. It leads people to overly focus on how that string is used in their community rather than the concept itself. I suggest for purposes here we focus on what concepts OMO needs, how they should be organized, and how they should be used by applications.

The oboInOwl synonym predicates are for relating a domain concept to a string that is used by humans as a name, where the relationship is either exact, similar but narrower in some contexts, similar but broader in some contexts, or otherwise related. Ontology tools SHOULD use all synonyms when implementing search (and of course they MAY use other predicates), and they SHOULD also use the synonym predicate in ranking search results and in providing information to the user on why something matched. Ontology tools SHOULD consider all synonyms in applications like NER, and MAY use the predicate to rank results. You can find further guidelines on places like the uberon wiki.
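As a rough illustration of that guidance, the Python sketch below searches over labels and all four oboInOwl synonym predicates, then ranks hits by predicate and reports which predicate matched. The rank values, the `EX:0000001` term, and the `search` helper are assumptions made up for this example, not part of any specification:

```python
# Hypothetical ranking: lower numbers sort first. Only the predicate names
# come from oboInOwl; the weights are an assumption for this sketch.
PREDICATE_RANK = {
    "label": 0,
    "hasExactSynonym": 1,
    "hasNarrowSynonym": 2,
    "hasBroadSynonym": 2,
    "hasRelatedSynonym": 3,
}

# (term, predicate, string); EX:0000001 is a made-up term for illustration
NAMES = [
    ("DOID:0050214", "label", "Lambert-Eaton myasthenic syndrome"),
    ("DOID:0050214", "hasExactSynonym", "LEMS"),
    ("EX:0000001", "hasRelatedSynonym", "LEMS"),
]


def search(query, names):
    """Return (term, matched predicate) pairs, best-ranked predicate first."""
    wanted = query.casefold()
    hits = [
        (PREDICATE_RANK[predicate], term, predicate)
        for term, predicate, text in names
        if text.casefold() == wanted
    ]
    return [(term, predicate) for _, term, predicate in sorted(hits)]


results = search("lems", NAMES)
```

The matched predicate is returned with each hit so a tool can tell the user why a term matched.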

Metadata about how the string is constructed (is it an acronym, or a portmanteau of an acronym and a spelled-out term?) is an orthogonal concern.

Like any system, there are edge cases here. It could be argued that HGNC symbols are more like identifiers than synonyms, and indeed sometimes we see novel constructions like HGNC:BRCA.

The existing system has worked for decades. If we want to be ontologically fussy and come up with a complicated alternative then people proposing this need to do more than object on minor terminological grounds and propose an alternative system together with a description of how this will be rolled out and implemented in major software systems in a way that doesn't cause users to get incomplete results.

A valid way to do this is by having an open ended set of APs that inherit from a common parent, as @bpeters42 suggests. Note this kind of system is already used in some ontologies that don't follow oboInOwl, with APs inheriting from "alternative term". But if we open this gate, do we also introduce other APs that function as oboInOwl synonyms? E.g. "has symbol"? "has gene symbol"? "has code"? Or is acronym a special one-off?

If the system involves an open-ended lattice of APs connected by SubAnnotationPropertyOf, then the guidelines should explain how tool implementers should obtain this and use it to dynamically drive behavior in a robust way. I assume the intent is that the applications should do an initial lookup of OMO, obtain all the transitive subproperties of the root synonym/alternativeLabel AP, and use this to drive behavior of the tool (search, NER, etc).
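The lookup step described above could be sketched as follows in Python. The edge list is a hypothetical hierarchy assembled from property names mentioned in this thread, not the actual content of OMO or OBI, and `transitive_subproperties` is an illustrative helper:

```python
from collections import defaultdict, deque

# (child, parent) pairs; a hypothetical AP hierarchy for illustration only
EDGES = [
    ("has acronym", "alternative term"),
    ("IEDB alternative term", "alternative term"),
    ("has symbol", "alternative term"),
    ("has gene symbol", "has symbol"),
    ("comment", "annotation"),  # unrelated branch, should not be returned
]


def transitive_subproperties(root, edges):
    """Collect every direct or indirect subproperty of root."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)

    found = set()
    queue = deque([root])
    while queue:
        for child in children[queue.popleft()]:
            # The membership check also guards against cycles in the input.
            if child != root and child not in found:
                found.add(child)
                queue.append(child)
    return found


name_properties = transitive_subproperties("alternative term", EDGES)
```

A tool would then treat every property in `name_properties` as a source of names for search or NER, which is the behavior that would need to be documented and coordinated with developers.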

This is not unreasonable, but if this is the plan, someone needs to document it, coordinate with developers and ensure it is rolled out consistently.

As a data point, note that OBI and other IAO-based ontologies have employed this system for some time. In OBI you can see an AP hierarchy:

'alternative term'

But this system was never documented and there was no coordination with tool developers. As a result, if someone uses one of these subproperties for an alternative term, it is ignored by software for purposes of search, including the 3 main portal providers. This is despite this system being in use for well over a decade.

(funnily enough, many of these "alternative terms" are also acronyms)

In contrast, the existing system used by GO, Uberon, and many other ontologies cleanly separates concerns, has a simple implementation that is largely adhered to by most software. As far as I can tell there are no practical issues and it's purely a nomenclature choice, but just read "synonym" as "alternative term" and everything is fine.

See also:

bpeters42 commented 11 months ago

I agree with pretty much all that Chris wrote. But I wanted to add one bit of background for why the sub-properties like 'IEDB alternative term' were created. That came about from having discussions about labels between different projects that contribute to OBI, which are much more varied than the typical GO derived communities. To reduce the need for those discussions, we allowed everyone to use their own 'alternative label'. This meant every project can replace the OBI rdfs:label that they didn't like with whatever their community prefers. In contrast, the OBI rdfs:label has to be precise, unique and distinguishable across all of OBI, which often means it is long and clunky. We wanted to make these alternative terms available for anyone wanting to do text mining, but we did not want to argue about whether something is broad, narrow, synonym, acronym or whatever. This OBI practical solution has worked very well over a decade for its intended purpose. It does not separate the concerns that Chris mentions, but it does separate the concern I mentioned in my original comment that different people will have very different ideas on what labels are good / narrow / whatever, and that those discussions are often not productive.

cthoyt commented 11 months ago

I'm with Nico's original post - I think we should continue considering acronyms as a type of synonym (since in practice we want to use these the same way) and use the recent work in OMO to add new standard synonym types to mediate this. Having tons of properties to look for this stuff instead of in a single place with a well-defined data model will make it less easy and enjoyable to use ontologies.

balhoff commented 11 months ago

I agree with @cthoyt. I don't support adding an acronym property; I do support standardizing synonym type identifiers since these are currently a mess of ontology-local hash IRIs.

dosumis commented 11 months ago

Strongly agree with Chris' points with one exception - symbols.

For cell types, we have very strong (I would even say critical) use-cases for official symbols. Long labels are needed for disambiguation purposes, but no-one in the community uses them (similar issue: FlyBase has both gene symbols and official full names). Where consensus emerges on particular symbols as standard, we need to reflect this so that the tools we build have a single, official, compact way to refer to cell types that reflects dominant community usage. This is particularly important for atlases, where, for real-estate reasons, overlay annotation needs short symbols that users can understand.

We are following this approach in FBbt, PCL, and are starting to follow it in CL (currently using an IAO ID). In order to support default assumptions about indexing OBO ontologies for search, we follow an SOP that all symbols should also be exact synonyms - this also allows us to add supporting references.

cmungall commented 11 months ago

I can see the rationale here. I don't love the duplication. I assume the exact synonyms are added at release time. This means that some queries that work on the release will not work on the edit version. Of course we have cases like this already, but I think this kind of duplication is best minimized.

Here is a radical suggestion: use the official symbols as the primary label (rdfs:label). As you say, this is what the community uses. For us fussy ontologists in the minority, include the full expanded label using the 'obo foundry unique label' (as is already policy) or a tagged exact synonym.

This is already the convention in most database conversions to obo/owl (see https://github.com/biopragmatics/obo-db-ingest/, neo, ...), e.g.

    hgnc:10848 a owl:Class ;
        rdfs:label "SHH" ;
        ns2:IAO_0000115 "sonic hedgehog signaling molecule" ;
        ns1:hasDbXref "ccds:CCDS5942",
            "ensembl:ENSG00000164690",
            "merops:C46.002",
            "ncbigene:6469",
            "omim:600725",
            "orphanet:118703",
            "refseq:NM_000193",
            "ucsc:uc003wmk.2",
            "vega:OTTHUMG00000151349" ;
        ns1:hasExactSynonym "HHG1",
            "MCOPCB5",
            "SMMCI",
            "TPT",
            "TPTPS",
            "sonic hedgehog (Drosophila) homolog" .

PRO of course uses fully spelled out labels for (most?) proteins but perhaps it would be an idea to use labels like "Human SHH protein", relegating the fully spelled out name to a specially designated synonym

I realize that while this may be conventional for genes it's less conventional for cell types. But if a symbol reflects dominant community usage, and there are some stable nomenclature guidelines, why not? Especially for PCL-level granular cell types that don't have direct classical counterparts.

balhoff commented 8 months ago

In discussion at OFOC meeting 2023-10-31 including @csbjohnson and @lschriml, there was consensus that we should create a new 'acronym' synonym type (not a new property).

cthoyt commented 8 months ago

@balhoff was there discussion of if we can include this as a subproperty of "abbreviation"?

cthoyt commented 8 months ago

Either way, I can take care of making this new property.

balhoff commented 8 months ago

> @balhoff was there discussion of if we can include this as a subproperty of "abbreviation"?

No, this wasn't discussed. It's kind of a weird situation, since these things aren't really properties (i.e. relationships) but are just put under OWLAnnotationProperty out of convenience (visible in Protege, but no logical commitment).