geneontology / amigo

AmiGO is the public interface for the Gene Ontology.
http://amigo.geneontology.org

Synonym types are not displayed #290

Closed elserj closed 6 years ago

elserj commented 8 years ago

Original ticket in planteome/planteome-ontology-browsers#17.

Basically, we have lots of synonyms and use "types" to differentiate them, but the new AmiGO2 doesn't display the type info. For comparison:

- (old style) http://plantontology.org/amigo/go.cgi?view=details&search_constraint=terms&depth=0&query=PO:0009001
- (new) http://dev.planteome.org/amigo/term/PO:0009001

Maybe the issue is that it is picking up the "alt id" type, but labeling them all as that rather than separating them out?

kltm commented 8 years ago

@elserj, I've read this ticket two separate ways. The first is that it looks like the alt_ids and "synonyms" are not separated. They actually are, but the "synonyms" list does not have an alternate header, so it looks like they are one and the same. I'm going to reverse the order there to make that distinction more clear. Let me know if that is not what you're looking for.

For my second reading (which was my initial one but now seems less likely), I wanted to address the fact that no distinctions are currently made between synonym types (just to have some documentation for it in the tracker). This was actually a conscious decision made early on in the GOlr design (although I'm having trouble finding any relevant conversation in the tracker; from bits and pieces, the conversation seems to have happened in very late 2011/early 2012). The essence, as I recall, was that since this is mostly about users just finding things, instead of having an arbitrary set of n fields that are all different synonym types for a particular use profile ("exact synonym", "narrow synonym", etc.), they can all be lumped into a single, much more easily searched and handled bin: "synonym". An arbitrary number of synonym types actually adds a lot of overhead, as you must then track which fields are the synonym fields, unify them when necessary, etc. A possible workaround would be to add a special-use synonym JSON field that would keep track of the types for display purposes but otherwise be ignored, much like the other JSON fields currently in use in GOlr. However, this was a marginal enough use case at the time that we shelved the issue.
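To make the workaround above concrete, here is a minimal Python sketch of a term document that keeps one flat, searchable "synonym" bin alongside a display-only JSON field. The field name `synonym_meta_json` and both helper functions are assumptions made for illustration; they are not taken from the GOlr schema.

```python
import json


def build_term_document(term_id, label, scoped_synonyms):
    """Build a Solr-style document dict for one ontology term.

    scoped_synonyms is a list of (scope, text) pairs, e.g. ("narrow", "tree").
    """
    return {
        "id": term_id,
        "label": label,
        # Single searchable bin: every synonym, whatever its scope.
        "synonym": [text for _scope, text in scoped_synonyms],
        # Hypothetical display-only field: an opaque JSON string that the
        # search layer ignores and the term page decodes when rendering.
        "synonym_meta_json": json.dumps(
            [{"scope": scope, "val": text} for scope, text in scoped_synonyms],
            ensure_ascii=False,
        ),
    }


def synonyms_by_scope(doc):
    """Decode the display-only blob and group synonym texts by scope."""
    grouped = {}
    for entry in json.loads(doc["synonym_meta_json"]):
        grouped.setdefault(entry["scope"], []).append(entry["val"])
    return grouped


doc = build_term_document(
    "PO:0000003",
    "whole plant",
    [("narrow", "tree"), ("broad", "genet"), ("related", "colony")],
)
print(doc["synonym"])
print(synonyms_by_scope(doc))
```

Searching only ever touches the flat `synonym` list, so no layer has to know how many scopes exist; the scope information is recovered solely when a page wants to display it.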

kltm commented 8 years ago

By popular request, tagging @cmungall and @hdietze (as if your inboxes weren't full enough).

Also, after a discussion with @cmungall about the history of grouping alt_ids in with synonyms, I'm just going to split that out into a separate field.

cooperl09 commented 8 years ago

I think the revised display with the Alt ids separated out from the synonym field is better, but this does not address the original issue of the synonym types.
Original ticket in Planteome/planteome-ontology-browsers#17.

kltm commented 8 years ago

@cooperl09 That is correct, depending on the formulation of the issue. As it currently stands, there is no architectural support for different synonym types in GOlr, only differentiating between all synonyms as a group and alternate ids. If the use case is that users actually need to know that a particular synonym is "narrow", "exact", or something else, we should add a new ticket for that. That said, I would be interested in knowing what a user would need that differentiation for--most of our users seem to be interested in just getting to the term, rather than its synonym flavor. In the case of early GOlr design (https://github.com/geneontology/amigo/issues/290#issuecomment-173022133), it was decided that the distinction was not particularly useful and that A2 should not necessarily follow the verbose display of A1.

austinmeier commented 8 years ago

Would it be possible to split each synonym type out into separate fields? Treat the "has_broad_synonym" / "has_narrow_synonym", et cetera, relationships the same way "has_alternate_id" was treated, to create a new field for each?

cooperl09 commented 8 years ago

@kltm re: "That said, I would be interested in knowing what a user would need that differentiation for--most of our users seem to be interested in just getting to the term, rather than its synonym flavor." In the Plant Ontology, we follow the GO guidelines in assigning the synonyms: http://geneontology.org/page/ontology-structure

For the PO the types of synonyms are listed here: http://wiki.plantontology.org/index.php/PO_Developers_Guide#Synonyms

These are important distinctions for our users, because some terms, such as whole plant (PO:0000003), have synonyms of several different scopes:

Synonyms

- related: clonal colony, colony
- narrow: bush, frutex, frutices, gametophyte, herb, liana, prothalli, prothallium, prothallus, seedling, shrub, sporophyte, suffrutex, suffrutices, tree, vine, woody clump
- exact: planta entera (Spanish), 植物体全体 (Japanese)
- broad: genet, ramet

These are useful for data curation to help decide the correct term to use.

cmungall commented 8 years ago

Minor terminology point: for historical reasons we call these 'scopes'. 'Types' are open-ended and largely orthogonal to scopes.

This document should apply to planteome too: https://github.com/obophenotype/uberon/wiki/Using-uberon-for-text-mining (ignore the TM part, the documentation on synonyms is valid regardless of use case)

cmungall commented 8 years ago

"That said, I would be interested in knowing what a user would need that differentiation for"

Ontologists, annotators, and users interested in the ontology as an end in itself use the Planteome AmiGO instance more than their counterparts in GO currently do.

kltm commented 8 years ago

@austinmeier Technically possible, but not a good choice--this causes both a metadata problem (extra fields and mechanisms to track which fields are synonyms in different layers) and it potentially codes in a quirk from OBO format. The eventual solution to this would be a metadata blob as mentioned here https://github.com/geneontology/amigo/issues/290#issuecomment-173022133.

@cooperl09 / @cmungall Okay, thank you for the explanations. I'm not arguing against (the solution would be a blob field as above), but I'm still wondering about this use case--I want to make sure I've absorbed the desired outcome to ensure that the necessary design changes don't need to be revisited later on.

The idea here, then, is to capture more of an ontology-browser aspect in the AmiGO term pages? In the case of GO, if somebody is trying to figure out what term they want to use for an annotation, the synonyms help direct the user to the right location for getting that information. In the case of PO, if a user has searched for "tree" and ends up at PO:0000003, there is additional information for them in that they got to PO:0000003 via "tree", which is noted as "narrow"? Or is there a case here where, because they find that "tree" is "narrow", they may not want to be using PO:0000003?

cmungall commented 8 years ago

In theory the text definition should be sufficient for the biocurator.

However, it's good to show it for biocurators. The main reason is that it can be confusing to see a narrow or broad synonym and not have it indicated as such. It could lead to a lack of confidence that they have the right ID. Someone who is working purely in (say) moss may be bamboozled by seeing "tree" as a syn for PO:0000003; they may think they have come to completely the wrong place.

The ontology developers have put a lot of thought into providing full metadata for the synonym, including the fact that it is not exact (as well as source of the synonym, type, etc). If the biocurator sees this in a fully transparent fashion then what was previously confusing will hopefully click for them. Eventually we'll have taxonomic metadata in the blob too, as we do in uberon (your tree example is good here). This will justify some contextually dubious syns even further.

kltm commented 8 years ago

@cmungall Thank you for the context. Keeping the metadata blob more abstract, so that the various handlers can be triggered as needed, will be something to keep in mind.

cooperl09 commented 8 years ago

@kltm I agree that in the case of the narrow synonyms, it may not be as important, in that the class is inclusive. But the "BROAD" scope synonyms are terms that are used in the literature and can be used to describe more than one class in the ontology.

Here are a couple of examples:

id: PO:0025593 name: shoot-borne internode root

id: PO:0000043 name: crown root

id: PO:0025002 name: basal root

id: PO:0025050 name: tuber interfascicular region

id: PO:0025052 name: tuber pith

id: PO:0025061 name: tuber perimedullary zone

kltm commented 8 years ago

@cooperl09 Ah, okay--that makes a lot more sense. For my future self coming back to this: essentially (actually literally in OBOland), there's a tag on the term that says "oh yeah, the thing that you searched for that got you here? that's ambiguous and you may want to check out other classes with it". The downside of having this extra information in the JSON blob is that one wouldn't be able to specifically filter on it. Although that may be a problem better handled in another field.
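A rough Python sketch of the ambiguity described above: index every "broad" synonym and report the ones carried by more than one class, which are the cases where a user may want to check out the other classes. The function name, the `EX:` identifiers, and the synonym strings are all made up for illustration; none of them come from PO or GO.

```python
from collections import defaultdict


def shared_broad_synonyms(terms):
    """Return broad synonyms that point at more than one term.

    terms maps a term id to a list of (scope, text) pairs. The result maps
    each case-folded broad synonym to the sorted ids of the terms sharing it.
    """
    index = defaultdict(set)
    for term_id, synonyms in terms.items():
        for scope, text in synonyms:
            if scope == "broad":
                index[text.casefold()].add(term_id)
    return {text: sorted(ids) for text, ids in index.items() if len(ids) > 1}


# Made-up terms and synonyms, for illustration only.
example_terms = {
    "EX:0000001": [("broad", "widget"), ("exact", "left widget")],
    "EX:0000002": [("broad", "Widget")],
    "EX:0000003": [("narrow", "widget")],
}
print(shared_broad_synonyms(example_terms))
```

An index like this depends on the scope being available as structured data; if the scope lives only inside an opaque display blob, this kind of filtering has to happen elsewhere, which is the trade-off noted above.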

cooperl09 commented 8 years ago

Yes, that is exactly the issue with the broad synonyms and possibly ones in the "Related" scope as well. It's hard for me to comment on this approach as I am not sure what the "JSON blob" is. We may be able to modify the synonyms from our end to show the scope, but that is not the ideal situation moving forward.

kltm commented 8 years ago

@cooperl09 Yes, that would not be a good long-term strategy, but it is probably the only one available immediately.

cooperl09 commented 8 years ago

Ok, thanks, keep us posted if you come up with a solution. Just wanted to note that the categories or scopes of synonyms are also present in the Protege version we are using, so it is not strictly an OE hangover.

kltm commented 8 years ago

@cooperl09 The solution will be basically as outlined here: https://github.com/geneontology/amigo/issues/290#issuecomment-173022133. It will involve changes to the schema, loader code, and client code. The exact JSON structure is TBD, but likely very minimal. While the types of synonyms are not OBO-specific, the special four(?) in OBO are, which is one of the reasons we want to keep it general.

paolaroncaglia commented 8 years ago

I just came across the same issue (synonym scope not displayed in AmiGO) and saw this ticket. Just wanted to confirm, from the GO point of view, what Chris said - that being able to differentiate synonyms with different scope is important, and that GO editors spend time on getting the scope right. It’s not an insignificant loss not to have this functionality in AmiGO.

cmungall commented 8 years ago

TODOs:

  1. [X] Define blob JSON format: DONE https://github.com/geneontology/obographs
  2. [ ] Populate this blob when loading ontology_class documents - https://github.com/owlcollab/owltools/issues/160
  3. [ ] Write perl TT handler that can traverse the JSON object and render as HTML (or node or client-js)
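As an illustration of step 3, here is a minimal sketch written in Python rather than as a Perl TT handler. It assumes the blob holds a single obographs-style node whose `meta.synonyms` entries each carry a `pred` (such as `hasNarrowSynonym`) and a `val`; the function name, the label table, and the HTML layout are hypothetical.

```python
import html
import json

# Display labels for the four scope predicates; anything else is shown as
# "unspecified" rather than guessed at.
SCOPE_LABELS = {
    "hasExactSynonym": "exact",
    "hasNarrowSynonym": "narrow",
    "hasBroadSynonym": "broad",
    "hasRelatedSynonym": "related",
}


def render_synonyms(blob):
    """Render the synonyms of one JSON-encoded node as an HTML list."""
    node = json.loads(blob)
    synonyms = node.get("meta", {}).get("synonyms", [])
    items = []
    for syn in synonyms:
        scope = SCOPE_LABELS.get(syn.get("pred"), "unspecified")
        # Escape the text so synonym content cannot inject markup.
        items.append(
            "<li>{}: {}</li>".format(html.escape(scope), html.escape(syn.get("val", "")))
        )
    if not items:
        return ""
    return "<ul>" + "".join(items) + "</ul>"


example_blob = json.dumps(
    {
        "id": "PO:0000003",
        "meta": {
            "synonyms": [
                {"pred": "hasNarrowSynonym", "val": "tree"},
                {"pred": "hasBroadSynonym", "val": "genet"},
            ]
        },
    }
)
print(render_synonyms(example_blob))
```

Because the handler only reads `pred` and `val`, further metadata in the blob (xrefs, synonym type, and so on) can be added later without changing the renderer.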

cmungall commented 6 years ago

This ticket is being subsumed into #473