Open hattrill opened 1 year ago
I haven't looked at/updated that sheet for quite a while... will try to do so this afternoon.
I have been doing regular cross-checks between GO annotations and the enzyme gene groups with each release. Based on those checks, I believe all the enzymatic 'contributes to' annotations are now correct (or as correct as they can be), so we can now drop the 'contributes to' block on EC annotations. But it would be good to update/check that separate Google sheet you mention above.
That sounds good - if you are happy, I am happy.
You've got a much better handle on this, so go ahead with the plan for GenBank submission and dropping the ban on contributes_to EC. Hopefully, IU can easily modify their pipeline to mirror this.
I am going to add some simple counts to my GAF checking to monitor changes.
awk '{print $4}' gene_association.fb > qualifiers; grep "NOT" qualifiers | wc -l; grep "contributes_to" qualifiers | wc -l
Output: 603 (NOT), 503 (contributes_to)
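For monitoring across releases, the same counts could also be produced in a single awk pass. This is only a sketch: `count_qualifiers` is a hypothetical helper name, and it assumes the file follows the usual GAF layout (tab-separated columns, header lines starting with `!`, qualifiers in column 4, multiple qualifiers joined with `|`).

```shell
# Hypothetical helper, not part of any existing FlyBase script.
# Assumes a GAF-style file: tab-separated, "!" header lines, qualifier in column 4.
count_qualifiers() {
    awk -F'\t' '
        /^!/ { next }                      # skip header/comment lines
        {
            n = split($4, quals, "|")      # a row may carry several pipe-separated qualifiers
            for (i = 1; i <= n; i++) {
                if (quals[i] == "NOT") not_count++
                if (quals[i] == "contributes_to") contrib++
            }
        }
        END { printf "NOT\t%d\ncontributes_to\t%d\n", not_count, contrib }
    ' "$1"
}
```

Usage would be `count_qualifiers gene_association.fb`. Splitting on tabs rather than whitespace keeps column 4 aligned when the qualifier field is empty, and matching whole qualifier values avoids counting partial string matches.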
I regenerated the list of contributes_to annotations based on FB2022_03, filtered out all non-enzymatic annotations and did a quick analysis of what was left: https://docs.google.com/spreadsheets/d/1UG8HJe_OP7tksCYGa97aOHGizyN1vTD2Pl_TQAn2Jbg/edit?pli=1#gid=2102854116
Computing EC annotations based on 'contributes to' annotations would:
Overall, I can live with the EC:7.1.2.2 propagation (it's not completely wrong, and it's a consequence of how GO manages those terms). So I will ask that 'contributes to' annotations are now considered in the EC computation scripts.
Check that the contributes_to annotations have been fixed, so that we can discontinue the block on EC numbers. Follows from the DB ticket for EC and the Web ticket. Sheet for checking made by SM.
SM gave http://flybase.org/reports/FBgg0001650.html as an example of where we could assign ECs to gene products with contributes_to and it would be correct.
Also