geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.
BSD 3-Clause "New" or "Revised" License

TreeGrafter issue PANTHER:PTHR23069:SF10 PAINT PANTHER:PTN000552695 #5044

Open ValWood opened 6 months ago

ValWood commented 6 months ago

UniProtKB:O14114 | abo1 | involved_in | GO:0045944 positive regulation of transcription by RNA polymerase II | ECO:0000318 IBA | GO_REF:0000033 | PANTHER:PTN000552695 | 284812 Schizosaccharomyces pombe (strain 972 / ATCC 24843) | GO_Central

UniProtKB:O14114 | abo1 | involved_in | GO:0045944 positive regulation of transcription by RNA polymerase II | ECO:0007826 IEA | GO_REF:0000118 | PANTHER:PTHR23069:SF10 | 284812 Schizosaccharomyces pombe (strain 972 / ATCC 24843) | TreeGrafter

This is "transcription initiation-coupled chromatin remodeling".

also abo2

pgaudet commented 1 month ago

Hi @ValWood, as far as I can tell this is now gone. In fact, we are now filtering all TreeGrafter annotations from the 143 species manually annotated in PAINT. Can this (and other TreeGrafter issues) be closed?

Thanks, Pascale
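For readers following the thread, the filtering rule described above can be sketched roughly as follows. This is an illustration only, not the actual GO pipeline code: the `Annotation` class, the `parse_row` and `keep` helpers, and the `PAINT_CURATED_TAXA` set are hypothetical names, and the set holds only the one taxon ID quoted in this issue (the full list of 143 species is not given here). Rows are assumed to be pipe-delimited with nine fields, as in the two rows quoted in the original report.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the list of PAINT-curated species. The thread
# mentions 143 species but does not list them; 284812 (S. pombe) is the
# only taxon ID that appears in this issue.
PAINT_CURATED_TAXA = {"284812"}


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    symbol: str
    qualifier: str
    go_id: str
    evidence: str
    reference: str
    with_from: str
    taxon_id: str
    assigned_by: str


def parse_row(row: str) -> Annotation:
    """Parse one pipe-delimited annotation row as quoted in this issue."""
    fields = [field.strip() for field in row.split("|")]
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return Annotation(
        gene_product=fields[0],
        symbol=fields[1],
        qualifier=fields[2],
        go_id=fields[3].split()[0],      # drop the term label
        evidence=fields[4].split()[-1],  # "ECO:0000318 IBA" -> "IBA"
        reference=fields[5],
        with_from=fields[6],
        taxon_id=fields[7].split()[0],   # drop the species name
        assigned_by=fields[8],
    )


def keep(annotation: Annotation) -> bool:
    """Drop TreeGrafter annotations for species already curated in PAINT."""
    return not (
        annotation.assigned_by == "TreeGrafter"
        and annotation.taxon_id in PAINT_CURATED_TAXA
    )


ROWS = [
    "UniProtKB:O14114 | abo1 | involved_in | GO:0045944 positive regulation "
    "of transcription by RNA polymerase II | ECO:0000318 IBA | GO_REF:0000033 "
    "| PANTHER:PTN000552695 | 284812 Schizosaccharomyces pombe "
    "(strain 972 / ATCC 24843) | GO_Central",
    "UniProtKB:O14114 | abo1 | involved_in | GO:0045944 positive regulation "
    "of transcription by RNA polymerase II | ECO:0007826 IEA | GO_REF:0000118 "
    "| PANTHER:PTHR23069:SF10 | 284812 Schizosaccharomyces pombe "
    "(strain 972 / ATCC 24843) | TreeGrafter",
]

kept = [annotation for annotation in map(parse_row, ROWS) if keep(annotation)]
```

With the two rows quoted in this issue, only the GO_Central IBA row survives. Under this rule a TreeGrafter row for a taxon outside the curated set passes through unchanged, because the filter is keyed on species rather than on the correctness of the source annotation.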

ValWood commented 1 month ago

Hi @pgaudet, it is gone from pombe (but that would happen anyway because I filtered them). But I wasn't really reporting them because of pombe. I was reporting them for the benefit of other species annotated by TreeGrafter.

The problem is that these annotations are still transferred to all of the under-annotated species. For example, this one is on https://www.uniprot.org/uniprotkb/A0A0W4ZHJ6/entry, but now it is not so easy to spot the problematic ones because we no longer see them on the curated models.

I admit that this particular annotation isn't the biggest problem, because it should be removed at source and so should disappear over time. But a lot of the errors I reported were due to incorrect annotation transfer by TreeGrafter. Do we know that these issues were fixed before we close the tickets?

The purpose of TreeGrafter is to provide accurate annotation for the unannotated species, so it should be the same quality as the PAINT annotation; otherwise it defeats the objective.

pgaudet commented 1 month ago

It looks like we have a problem with data being out of date - this IBD was removed by Marc last November, but for some reason this is what InterPro picked up.

@huaiyumi @dustine32 Can we discuss this on a future PAINT call?

ValWood commented 1 month ago

I should explain my concern. I think @thomaspd mentioned that we would/could recommend that the function prediction projects (e.g. CAFA) use PAINT/TreeGrafter for benchmarking. However, the TreeGrafter error rate was pretty high for this. I reported over 100 issues in tickets and a Google sheet. As far as I know, the only ones that will be fixed are those which now have "do not annotate" or taxon constraints, but this will only be a small subset. Mostly they were correct in PAINT.