dictyBase / Migration

Entrypoint for dictybase overhaul project

GO: Smart filter IEAs and possibly others #107

Closed pfey03 closed 5 years ago

pfey03 commented 6 years ago

After the GO migration, once we finally have up-to-date information in the database, I will look into our annotations. We should definitely filter IEAs, and maybe others. PomBase filters a lot and their users seem to like it; see the recent comment from Val in this issue: https://github.com/geneontology/go-annotation/issues/1879

pfey03 commented 6 years ago
[Image: tagb annotations]
pfey03 commented 6 years ago

The images above and below are examples of what I would like to remove with smart filtering of IEAs. I chose P2GO snapshots because they nicely separate manual annotations from IEAs.

1) Image above: When an aspect has good experimental annotations, we should keep only those IEAs that are non-redundant. In the example above, BP has good experimental annotations. The IEAs add 'proteolysis' and 'transmembrane transport', which we could leave, but only if they are non-redundant; see next.

2) Filter IEAs that are redundant with each other. In the image above there is a lot of redundancy in the electronic MF annotations. There are two 'ATP binding' annotations (red box) and a parent of that term, 'nucleotide binding' (blue box), so we should retain only ONE 'ATP binding'. The same goes for 'peptidase activity', 'serine-type peptidase activity', and 'serine-type endopeptidase activity': we should retain only the last, most specific term. Thus the smart rule needs to know the GO hierarchy to know which terms lie on the same path.

3) Filter IEAs that are redundant with EXP. There is no example in the figure above, but there is one in the image below. There is one experimental CC annotation, 'IDA extracellular region'. The IEA CC annotations contain the same term twice (red boxes), and these two IEAs should be filtered.
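
The three rules above could be sketched roughly as follows. This is only an illustration under several assumptions, not the dictyBase implementation: the `Annotation` type, the `TOY_PARENTS` table, and the `smart_filter` function are hypothetical names; the hierarchy is a hand-written toy keyed by term name rather than a real GO load; every non-IEA annotation is treated as manual; and an IEA is also dropped when it is an ancestor of a manual term, which goes slightly beyond the "same term" case in rule 3.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    term: str      # GO term name (a real pipeline would use GO IDs)
    aspect: str    # "MF", "BP", or "CC"
    evidence: str  # evidence code, e.g. "IEA" or "IDA"


# Hypothetical toy hierarchy: child term -> direct parent terms.
TOY_PARENTS = {
    "ATP binding": {"nucleotide binding"},
    "serine-type endopeptidase activity": {"serine-type peptidase activity"},
    "serine-type peptidase activity": {"peptidase activity"},
}


def ancestors(term, parents=TOY_PARENTS):
    """Return every term reachable by walking up from `term`."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def smart_filter(annotations, parents=TOY_PARENTS):
    """Keep all manual annotations plus only the non-redundant IEAs."""
    # Assumption: anything that is not IEA counts as manual.
    manual = [a for a in annotations if a.evidence != "IEA"]
    result = list(manual)
    for aspect in ("MF", "BP", "CC"):
        manual_terms = {a.term for a in manual if a.aspect == aspect}
        # Collapse exact duplicate IEAs, preserving first-seen order.
        iea_terms = []
        for a in annotations:
            if a.evidence == "IEA" and a.aspect == aspect and a.term not in iea_terms:
                iea_terms.append(a.term)
        # A term is "covered" if it is a manual term, or an ancestor of
        # any manual or IEA term in the same aspect.
        covered = set(manual_terms)
        for term in manual_terms | set(iea_terms):
            covered |= ancestors(term, parents)
        result.extend(
            Annotation(t, aspect, "IEA") for t in iea_terms if t not in covered
        )
    return result


# Illustrative input loosely modeled on the terms mentioned in the rules.
example = [
    Annotation("ATP binding", "MF", "IEA"),
    Annotation("ATP binding", "MF", "IEA"),
    Annotation("nucleotide binding", "MF", "IEA"),
    Annotation("peptidase activity", "MF", "IEA"),
    Annotation("serine-type peptidase activity", "MF", "IEA"),
    Annotation("serine-type endopeptidase activity", "MF", "IEA"),
    Annotation("proteolysis", "BP", "IEA"),
    Annotation("transmembrane transport", "BP", "IEA"),
    Annotation("extracellular region", "CC", "IDA"),
    Annotation("extracellular region", "CC", "IEA"),
    Annotation("extracellular region", "CC", "IEA"),
]

filtered = smart_filter(example)
for annotation in filtered:
    print(annotation.aspect, annotation.evidence, annotation.term)
```

With this toy input, the MF IEAs collapse to 'ATP binding' and 'serine-type endopeptidase activity', the two BP IEAs stay, and both CC IEAs are dropped in favor of the IDA annotation. A real version would need the full GO graph, including relations other than is_a, which this sketch ignores.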

Note that the GOC term 'signal transduction' is highlighted green in P2GO because it was inferred from a manual MF annotation, 'signal transducer activity'. That term will be obsoleted and I deleted the annotation, so the inferred IEA will be deleted in the next update and is marked automatically.

pfey03 commented 6 years ago
[Image: cf45-1 annotations]
pfey03 commented 5 years ago

Replaced by this issue: https://github.com/dictyBase/Migration/issues/111