pombase / canto

The PomBase community curation tool
https://curation.pombase.org

How should we be capturing 'disease'? #2099

Closed ValWood closed 3 years ago

ValWood commented 5 years ago

This is currently added as an extension to virulence/pathogenicity-causing pathogen-host interaction phenotypes. However, this means capturing the same thing over again. It seems that we should capture this as a different annotation type...

I think we have discussed this before, but I can't remember the outcome @CuzickA ?

@jseager7 could this be added to the next agenda?

jseager7 commented 5 years ago

The plan was to automatically fill the disease names based on a curated list, and only have the user supply a disease if it differs from the expected disease. See https://github.com/pombase/canto/issues/1975.
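To make that plan concrete, here is a minimal sketch of the lookup, not Canto's actual code. The `EXPECTED_DISEASE` table, its placeholder entry, and the `resolve_disease` function are all hypothetical names, and keying the list on a (pathogen, host) pair is an assumption.

```python
from typing import Optional

# Hypothetical curated list mapping (pathogen, host) to the expected disease.
# The entry below is a placeholder, not real curated data.
EXPECTED_DISEASE = {
    ("pathogen A", "host X"): "disease 1",
}


def resolve_disease(
    pathogen: str, host: str, user_supplied: Optional[str] = None
) -> Optional[str]:
    """Return the curator's disease if one was given, else the curated default."""
    if user_supplied:
        # The curator only supplies a name when it differs from the default.
        return user_supplied
    # No override: fall back to the curated list (None if the pair is unknown).
    return EXPECTED_DISEASE.get((pathogen, host))
```

With this shape, the curator would only type a disease name for the unusual cases, and pairs missing from the list would simply get no disease.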

I'm not sure if we ever decided exactly how the user would supply the disease name in the case where it differs from the expected disease. Normally that would be done with the 'disease_caused' annotation extension but we decided to disable that because of the aforementioned repetition in curation.

As for the repetition of novel disease names (ones that aren't in our list), it's kind of inevitable if we associate the disease with a particular interaction, because some interactions will cause disease and some won't – we could maybe avoid that if we view phenotype outcomes like 'abolished pathogenicity' to imply 'no disease', meaning the phenotype can be used to infer disease presence or absence. That way we can pin the disease to the metagenotype and let the phenotype annotations sort out whether there's disease or not. That could be tricky to implement though.
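A rough sketch of that inference, under some loud assumptions: the disease is stored once on the metagenotype, `NO_DISEASE_PHENOTYPES` is a hypothetical set of terms taken to imply 'no disease' (only 'abolished pathogenicity' comes from this thread), and every other phenotype is treated as 'disease present', which is probably too simple for real data.

```python
from typing import Optional

# Assumption: phenotype terms taken to imply that no disease develops.
# Only 'abolished pathogenicity' is mentioned in the discussion.
NO_DISEASE_PHENOTYPES = {"abolished pathogenicity"}


def inferred_disease(
    metagenotype_disease: Optional[str], phenotype: str
) -> Optional[str]:
    """Disease implied by one phenotype annotation, or None for 'no disease'."""
    if phenotype in NO_DISEASE_PHENOTYPES:
        # The phenotype outcome overrides the disease pinned to the metagenotype.
        return None
    # Simplifying assumption: any other phenotype means the disease is present.
    return metagenotype_disease
```

The tricky part in practice would be deciding which phenotype terms belong in that set.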

jseager7 commented 5 years ago

Note also that because of changes in #1975, the disease annotation extension is currently disabled on the PHI-Canto servers (well it should be disabled, anyway). Feel free to let me know if you ever need it re-enabled so data doesn't get left out. I could restrict it to admin-only as a 'middle-ground' solution.

jseager7 commented 4 years ago

I've re-enabled the disease annotation extension following some further discussion. The plan is to use that for now, until either the disease ontology is ready, or until we can decide on a way to capture diseases as a separate annotation type.

@ValWood If it's annoying having to record the same disease repeatedly, one option might be to record the disease only once for each (host, pathogen, tissue type) triple. Then later on in the pipeline (or maybe even in Canto itself) we could use some basic inference rules to apply that same disease to every interaction that matches the triple. If there's no recorded disease for a triple, then we don't do anything.

Unfortunately, the above approach could run into problems if you happen to curate two or more diseases for a triple: then we have no way of knowing which is the 'ordinary' disease and which is the unusual one, meaning we don't know which disease should apply to the other un-annotated phenotypes. I can't really think of any solution to this when using annotation extensions, besides just filling in all the disease annotations manually when you can't automatically populate them.
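For illustration, the triple-based rule and its ambiguity problem could look something like the sketch below. This is not an existing Canto or pipeline function: `propagate_diseases` and the dictionary keys (`host`, `pathogen`, `tissue`, `disease`) are hypothetical, and leaving ambiguous triples untouched is just one possible choice.

```python
from collections import defaultdict


def propagate_diseases(interactions: list[dict]) -> list[dict]:
    """Fill in missing diseases from the one disease recorded for the same triple."""
    # Collect every disease recorded for each (host, pathogen, tissue) triple.
    recorded = defaultdict(set)
    for item in interactions:
        if item.get("disease"):
            key = (item["host"], item["pathogen"], item["tissue"])
            recorded[key].add(item["disease"])

    result = []
    for item in interactions:
        filled = dict(item)  # copy so the input is not modified
        key = (item["host"], item["pathogen"], item["tissue"])
        diseases = recorded.get(key, set())
        # Propagate only when exactly one disease is known for the triple.
        # With zero diseases we do nothing; with two or more we cannot tell
        # which is the 'ordinary' one, so the annotation is left for manual
        # curation.
        if not filled.get("disease") and len(diseases) == 1:
            filled["disease"] = next(iter(diseases))
        result.append(filled)
    return result
```

Interactions that already carry a disease are never overwritten, so a manually curated unusual disease would survive the propagation step.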

jseager7 commented 4 years ago

It's possible that the disease that manifests could also depend on experimental conditions, like which chemical substances are present in the environment. That could complicate things further.

ValWood commented 4 years ago

Let's see how we go. I'm sure there is a better, more consistent way to annotate/infer disease. Hopefully it will become clearer later. It doesn't bother me to capture it multiple times (but I won't be doing the bulk of the curation)... and I guess it's only for terms which are disease symptoms.

jseager7 commented 3 years ago

Closed in favour of #2390.