FlyBase / drosophila-phenotype-ontology

The home of the Drosophila phenotype ontology
Creative Commons Attribution 4.0 International

obsoletion of GO:0008219 'cell death' #162

Open Clare72 opened 1 year ago

Clare72 commented 1 year ago

GO are planning to obsolete their 'cell death' term, which is the basis of the 'abnormal cell death' phenotype and its increased/decreased children (https://github.com/geneontology/go-ontology/issues/24680). Basically, they only want to represent programmed cell death in GO and not cell death by any possible means.

This phenotype is used for A LOT of annotations, so I guess it is not really feasible to review them all and only keep the ones that involve programmed cell death!

Other options:

Either of these might affect the position of our phenotype in the upheno hierarchy. Other phenotype ontologies are affected too, so this will be discussed further on future upheno calls - agenda

@gm119 @arzuozturk @hattrill

hattrill commented 1 year ago

Yikes! I guess it has been used in the sense of "these cells are dead, man" rather than increased/decreased necrosis/apoptosis. I would like to look at a sample of our papers and see what the general usage is.

gm119 commented 1 year ago

here is the relevant doc from the phenotype manual:

"Mutant tissue/clones show increased/decreased cell death, but not totally absent/lethal (evidence can be cell morphological as well as experiments like TUNEL staining)" - use one of increased cell death, decreased cell death

(if they were vague and just said the amount of cell death is different from normal, then that is when the parent 'abnormal cell death' would be used instead).

and here is current definition of 'abnormal cell death' term in case it helps "Phenotype that is a change in the amount of cell death in a whole animal or in some specific organ tissue or clone of cells compared to wild-type. This may be due to effects on the regulation of cell death (GO:0010941) or in cell death (GO:0008219) pathways themselves. "
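A minimal sketch of that curation rule in Python, purely for illustration: the function name and the `direction` argument are made up here and are not part of any FlyBase tool.

```python
def pick_cell_death_term(direction):
    """Map the reported direction of change to a phenotypic class label.

    direction: "increased", "decreased", or None when the paper only says
    that the amount of cell death differs from normal.
    """
    if direction == "increased":
        return "increased cell death"
    if direction == "decreased":
        return "decreased cell death"
    if direction is None:
        # Vague report: fall back to the parent term.
        return "abnormal cell death"
    raise ValueError(f"unexpected direction: {direction!r}")


print(pick_cell_death_term("increased"))
print(pick_cell_death_term(None))
```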

re this option:

> Do something strange like switch to the abnormalBiologicalProcessInLocation pattern and use GO "death" in FBbt "cell".

to me, that is pretty much exactly describing what the term means, i.e. it is supposed to be used when there is an abnormal amount of cell death in a tissue, so it doesn't seem strange (!), but I get that it might be weird ontology-wise

ValWood commented 1 year ago

This is part of the problem. People have been using the GO cell death terms to make GO annotations to phenotypes. Also, groups seem to be using the GO cell death term in their phenotype ontologies (which is fine for programmed cell death).

But your phenotype annotations should still be sound, you just need to make different logical defs? In FYPO we use

'has part' some ('lethal (sensu genetics)' and ('characteristic of' some 'fungal cell'))

to define cell lethality. Does that help?

hattrill commented 1 year ago

Hi @ValWood it's more that the GO term has been used in the logical def & def of "abnormal cell death", not that GO cell death terms have been used to make GO annotations to phenotypes.

There are a lot of obsoletions happening in GO at the moment. I will discuss with FB how these obsoletions are likely to impact on our other annotation processes. In addition, we have in the past replaced some of our fly-specific ontology terms in annotation with ones from the GO, as part of good practice in not duplicating terms that already exist. While it used to be fairly straightforward to re-annotate when obsoletions were made in GO, I think that it is perhaps getting harder as we are getting very interlinked "under the hood" without the infrastructure to "see" it.

matentzn commented 1 year ago

I agree - mass obsoletions with, but in particular without, replacements are extremely costly and deter us from spending our already limited resources on greater issues. I think there should be an additional burden on an obsoletion decision that recognises the huge mutual interdependence that comes with promoting and facilitating reuse. Let's be clear: every GO term that is obsoleted costs the taxpayer thousands of dollars for all the curation efforts that need to be updated.

ValWood commented 1 year ago

Hi Nico,

Are you referring to GO reannotation or reannotation of phenotypes? Phenotypes that are defined using the GO terms should still be correct, but an alternative way to logically define the phenotype needs to be identified where the concept is within scope. Presumably PATO can be extended to include terms which are out of scope for GO? PATO already has terms to cover concepts like viability?

From the GO perspective, sure there is a cost in annotation, but we are only obsoleting GO terms to improve the ontology and the annotation so that the biology is correctly represented.

First, it is worth pointing out, as somebody who has routinely curated both phenotypes and GO terms daily for almost 2 decades, and has been involved in building curation systems to enable the curation of both GO and phenotypes by the community (hundreds of individuals), that phenotypes are much, much easier for most curators to describe and curate than GO terms because they are observations: you say what you see.

Curating and modelling GO is more difficult because GO is (and has been since its inception) for describing evolved/normal/non-pathogenic processes. However, we are mostly modelling complex biological processes with incomplete knowledge, like the blind man and the elephant. A major consequence of this is that experiments can be incorrect, misinterpreted or over-hyped. Often the experimenter's interpretation of what we are looking at improves over time. Most problems arise because phenotypes usually precede mechanistic detail. Often researchers will imply that a gene product is involved in a process, or regulating a process, rather than just 'required for it' to occur, and this can result in the curation of very indirect upstream effects as part of a process, and the instantiation of terms that do not represent normal biology.

These unavoidable problems result in curators requesting GO terms which do not align with how a pathway would be logically represented: single steps from different processes get conflated into a single term (the modification terms are an example of this). Over time we have accumulated a lot of GO terms that should not be in the Gene Ontology and a lot of annotations that are misleading.

Since the model species databases have mostly annotated most genes (although not most publications!), it is clear that annotation review will form an increasingly important component of our work as outdated information and modelling are corrected. We describe this in both of the recent consortium publications. Of course all curators (including me) will complain about the need to revisit legacy annotations, but all curators of functional data also know that revising legacy incorrect curation to align with current knowledge is one of the most important parts of the biocuration task.  Also, it does not take so long; an experienced curator can usually quickly judge an annotation as legacy based on more recent curation, assess the 'author intent' from the abstract of a paper with hindsight, or quickly home in on the section of a paper describing a biological concept to make a more informed current judgement about a legacy annotation.

Historical terms like “cell death”, without qualifying that the cell death must be evolved, should not have been added to GO. There are also many examples where "regulation" terms have been added to GO purely to build (for example) FYPO logical definitions when it was not clear that a specific process was regulated at all, basically outsourcing to GO the problem of logically defining a term, even where the term is out of scope or non-existent.

These ontology and annotation changes are not inconsequential. They improve analysis and change the interpretation of future experiments. They also, importantly, improve annotation transfer to thousands of unannotated species. More recently, the legacy “phenotype-type” annotations in GO have been confounding ML and AI prediction efforts, where many of the predictions lead back to these problematic terms, which do not provide meaningful annotations in the GO sense (modification, ageing, cell death).

OBSOLETION

GO terms are never obsoleted without reason; usually they should not have been added. GO terms are obsoleted when they i) are out of scope for GO (i.e. orthogonal to other concepts which do not align with modelling processes as ordered assemblies of molecular functions, or describing phenotypes as opposed to processes) or ii) have been used inconsistently to represent different concepts, so that annotations need to be reviewed and either removed (if they represent some indirect observation or readout) or annotated to the correct replacement term. The obsoletions are required to ensure that biology is described consistently and accurately. Usually the replacements for the annotations to these terms need to be dealt with differently (a big red flag that there was a problem), therefore there is no suggested term replacement, or only 'consider' terms.

Of course this is an inconvenient problem, but it is absolutely necessary to move towards a better representation of biology, to support curators to annotate more consistently, and to make future curation easier. Fixing these issues now will cost LESS money than kicking the can down the road and letting more annotations accumulate.

hattrill commented 1 year ago

Polite request: Please could we restrict this to a discussion on FlyBase's issue with cell death in phenotype annotation?

hattrill commented 1 year ago

Hi @Clare72 @gm119

Having a look at our phenotype annotations, we have 976 papers with increased/decreased/abnormal cell death (3457 lines in the pheno file). I have had a quick spot check of 10 and I can make the call to 'apoptosis' being the cell death that they are looking at pretty quickly by looking at the abstract - there is one where I need to do some reading around.

If we want to make the pheno term more specific "XXX programmed cell death", I think that if we (about 4 of us, that is) could split the job we could review it pretty quickly.

I am open to that option - do you want to discuss it on Wednesday in the group meeting slot?
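For what it's worth, splitting the review could be sketched roughly as below. This is only an illustration: it assumes each pheno-file line starts with a paper identifier followed by a tab, which may not match the real file, and the function names and sample lines are made up.

```python
from collections import defaultdict


def papers_by_id(lines):
    """Group annotation lines by paper id (ASSUMED to be the first tab-separated field)."""
    papers = defaultdict(list)
    for line in lines:
        paper_id, _, rest = line.rstrip("\n").partition("\t")
        papers[paper_id].append(rest)
    return dict(papers)


def split_among(paper_ids, n_curators):
    """Deal paper ids round-robin so that batch sizes differ by at most one."""
    ordered = sorted(paper_ids)
    return [ordered[i::n_curators] for i in range(n_curators)]


# Made-up sample lines, not real FlyBase data.
sample = [
    "paperA\tincreased cell death",
    "paperA\tabnormal cell death",
    "paperB\tdecreased cell death",
]
papers = papers_by_id(sample)
batches = split_among(papers, 2)
print(batches)
```

With 976 papers and 4 reviewers, the same round-robin split gives 244 papers each.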

ValWood commented 1 year ago

Right, I thought this was the GO tracker!

hattrill commented 1 year ago

> Right, I thought this was the GO tracker!

That's ok, Val. I do agree with the reason for obsoletion in the GO, just a little worried about how these inter-dependencies are starting to bite in other areas (e.g. GO to Uberon and taxon violations).

Clare72 commented 1 year ago

> do you want to discuss it on Wednesday in the group meeting slot

Sure - if it seems that the vast majority of annotations are programmed cell death, maybe we can switch the GO term and modify the definition. I don't really like narrowing definitions like this, but it could end up being the lesser evil in this case.

hattrill commented 1 year ago

Yes, I see what you mean with the issue in narrowing the def. The impact on apoptosis may well be a secondary consequence, so narrowing may be misleading in some cases. And how do we capture the cases where we don't know?

gm119 commented 1 year ago

yes, let's discuss on Wednesday. I am not sure we'll always be able to say that the phenotype is programmed cell death - for some papers, they literally say something like 'cell death in the imaginal disc is increased compared to controls' with no other observation (that's the thing about phenotype curation, it's just recording the observed phenotype, without always knowing what the reason behind the observed phenotype is). This is particularly true of genetic interaction screens where they are trying to identify members of the same pathway etc. - there is often less detail about the phenotype for those screens.

It would be a shame if we were no longer able to capture this type of phenotype - I think it's still useful for users to be able to group alleles that have a common phenotype of a change in amount of cell death, even if it's not known what type of cell death is occurring.

Having said that, I don't know enough about the standard assays like acridine orange staining etc. to know whether or not there is an easy mapping between the type of assay used and the type of cell death that has occurred. If there is a simple mapping, then perhaps making the definition of the phenotypic class more specific to the type of cell death being affected is a possibility - but another factor to take into account is that we want this phenotypic class term to be simple to use, without the curators having to look up the details of various kinds of cell death etc. each time they need to use it.

ValWood commented 1 year ago

> it's more that the GO term has been used in the logical def & def of "abnormal cell death", not that GO cell death terms have been used to make GO annotations to phenotypes.

So in this case, if the intended meaning of the term was any sort of cell death, evolved or pathological/abnormal, isn't it better to use a different logical definition to encompass both types of death? This is the crux of the issue for why this term does not belong in GO. In FYPO we do not use the GO term cell death to define "any sort of cell lethality". We use "lethal" from PATO,

i.e. 'has part' some ('lethal (sensu genetics)' and ('characteristic of' some 'fungal cell'))

Although it is possible that PATO also uses the GO cell death behind the scenes, in which case this would need to be fixed too.

ValWood commented 1 year ago

Presumably you also want to curate phenotypes for all sorts of non-evolved cell death - i.e. cell death in response to cytotoxicity, various stresses, disease etc. Some of this will be programmed, but a lot of it won't be...

Clare72 commented 1 year ago

[Image: Screenshot 2023-01-30 at 10 33 23]

The current GO hierarchy allows various cell death phenotypes to be nicely grouped together. Changing the logical definition will break this, unless we change ALL cell death phenotypes to a different pattern.
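A toy illustration of the grouping point, assuming a heavily simplified is_a hierarchy. The edges below are labels only and are not a copy of the real GO graph, and the phenotype names other than 'abnormal cell death' are hypothetical stand-ins for terms defined against each process.

```python
# Simplified is_a edges (child -> parent); illustrative only, not the real GO graph.
IS_A = {
    "apoptotic process": "programmed cell death",
    "programmed cell death": "cell death",
    "necrotic cell death": "cell death",
}

# Hypothetical phenotype terms, each logically defined against one process.
PHENOTYPE_PROCESS = {
    "abnormal cell death": "cell death",
    "abnormal programmed cell death": "programmed cell death",
    "abnormal apoptosis": "apoptotic process",
    "abnormal necrosis": "necrotic cell death",
}


def root_of(term, is_a):
    """Follow is_a edges upwards until a term with no parent is reached."""
    while term in is_a:
        term = is_a[term]
    return term


def groups_by_root(phenotype_process, is_a):
    """Group phenotypes by the top-level ancestor of their defining process."""
    groups = {}
    for phenotype, process in phenotype_process.items():
        groups.setdefault(root_of(process, is_a), []).append(phenotype)
    return {root: sorted(members) for root, members in groups.items()}


def obsolete(term, is_a):
    """Remove a term and every edge touching it, with no replacement parent."""
    return {c: p for c, p in is_a.items() if term not in (c, p)}


before = groups_by_root(PHENOTYPE_PROCESS, IS_A)

# After obsoletion, 'abnormal cell death' needs a different pattern, so it
# drops out of the process-based grouping altogether.
remaining = {ph: pr for ph, pr in PHENOTYPE_PROCESS.items() if pr != "cell death"}
after = groups_by_root(remaining, obsolete("cell death", IS_A))

print(before)
print(after)
```

In this sketch all four phenotypes sit under one root before the obsoletion; afterwards the programmed cell death phenotypes and the necrosis phenotype land in separate groups, and the parent phenotype is no longer grouped with either.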