pombase / pombase-chado

PomBase code for accessing Chado
MIT License

QC: filter *EXACT* duplicate annotation from UniprotKB #193

Closed pombase-admin closed 6 years ago

pombase-admin commented 11 years ago

If an annotation from UniProtKB is exactly the same as a PomBase one (same publication and evidence code), please filter out the UniProt annotation.

Kim, next time you grab the GOA file, could you let me know: i) How many EXP annotations we get from GOA? (IMP/IDA/IGI) ii) How many remain after filtering? (This will be possible from the GO GAF once the "assigned by" field is corrected, so we can do this.)

We could exclude the UniProt EXP annotations altogether if we had them fully covered. I know I only used to pick up 100 or so after filtering (probably fewer now), so if we made sure these papers were covered we could just have a log file of UniProt annotations that are not covered by our annotations, and decide whether they are appropriate to include. I'm suggesting this because it will be a diminishing set: I don't think UniProt curators do pombe papers anymore, as they have lots of uncared-for fungi to look after.

v

Original comment by: ValWood
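For illustration, the exact-duplicate filter and the two counts requested above could be sketched roughly as follows. This is not the pombase-chado implementation. It assumes the standard GAF 2.x column order (gene ID in column 2, GO term in column 5, reference in column 6, evidence code in column 7, assigned-by in column 15). The function names and the source labels `"UniProt"` and `"PomBase"` are assumptions and should be adjusted to whatever the assigned-by column actually contains.

```python
# Evidence codes named in the comment above as "EXP" for this purpose.
EXPERIMENTAL_CODES = {"IMP", "IDA", "IGI"}


def parse_gaf(lines):
    """Yield one dict per GAF data row, skipping '!' header lines and blanks."""
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        yield {
            "gene": cols[1],          # column 2: DB Object ID
            "term": cols[4],          # column 5: GO ID
            "reference": cols[5],     # column 6: DB:Reference
            "evidence": cols[6],      # column 7: evidence code
            "assigned_by": cols[14],  # column 15: Assigned By
        }


def count_experimental(annotations, source):
    """Count annotations from `source` that carry an IMP/IDA/IGI code."""
    return sum(
        1
        for a in annotations
        if a["assigned_by"] == source and a["evidence"] in EXPERIMENTAL_CODES
    )


def filter_exact_duplicates(annotations, external="UniProt", local="PomBase"):
    """Drop `external` annotations that exactly match a `local` one.

    "Exact" here means same gene, term, publication and evidence code.
    Returns (kept, dropped) so the dropped rows can be written to a log.
    """
    annotations = list(annotations)

    def key(a):
        return (a["gene"], a["term"], a["reference"], a["evidence"])

    local_keys = {key(a) for a in annotations if a["assigned_by"] == local}
    kept, dropped = [], []
    for a in annotations:
        if a["assigned_by"] == external and key(a) in local_keys:
            dropped.append(a)
        else:
            kept.append(a)
    return kept, dropped
```

Calling `count_experimental` on the parsed annotations before filtering and on `kept` afterwards would give the numbers asked for in i) and ii).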

pombase-admin commented 11 years ago

Original comment by: ValWood

pombase-admin commented 11 years ago

i) How many EXP annotations we get from GOA? (IMP/IDA/IGI)

I meant UniProt, not GOA. I just checked the GAF, and from UniProt manual curation we pull in 97 ISS and 151 experimental annotations. I spot-checked 10 and they appear largely to be exact duplicates.

So, step 1: filter any annotation with a UniProt source if we have the same annotation from another source (regardless of evidence code). This will get rid of the majority. Then I will check whether the remaining ones are UniProt errors or things we are missing.

Original comment by: ValWood
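A minimal sketch of step 1, for illustration only. It assumes annotations are already available as dicts with `gene`, `term`, `reference` and `assigned_by` keys, and that "the same annotation" means the same gene, GO term and reference with the evidence code ignored. Both the record layout and the `"UniProt"` source label are assumptions, not the actual pombase-chado data model.

```python
def drop_covered_uniprot(annotations, uniprot_source="UniProt"):
    """Remove UniProt annotations that another source already provides.

    Matching ignores the evidence code: two annotations count as the same
    if gene, term and reference agree (an assumption about what "same
    annotation" means here).

    Returns (kept, to_review): `kept` is the filtered annotation list and
    `to_review` holds the UniProt annotations with no counterpart, which
    still need a manual check.
    """
    annotations = list(annotations)

    def key(a):
        return (a["gene"], a["term"], a["reference"])

    # Everything annotated by a source other than UniProt.
    covered = {key(a) for a in annotations if a["assigned_by"] != uniprot_source}

    kept, to_review = [], []
    for a in annotations:
        if a["assigned_by"] == uniprot_source:
            if key(a) in covered:
                continue  # duplicate of an annotation from another source
            to_review.append(a)
        kept.append(a)
    return kept, to_review
```

The `to_review` list corresponds to the remaining set that would be checked by hand to see whether each entry is a UniProt error or something missing from the other sources.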

ValWood commented 6 years ago

This appears to be an exact duplicate of https://github.com/pombase/pombase-chado/issues/536