geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

flag usage of obsolete EC xrefs #17844

Closed rwst closed 1 year ago

rwst commented 5 years ago

Obsoletion in enzyme.dat has the form

//
ID   1.1.1.139
DE   Transferred entry: 1.1.1.21.

or

//
ID   3.4.13.16
DE   Deleted entry.

Travis should grep the OBO file for usage of "EC:x.y.z.w" with the numbers coming from a list.

rwst commented 5 years ago

This is the script to produce the list: grep -C1 'DE\s\+\(Deleted\|Transferred\)' enzyme.dat |grep '^ID' |sed 's/^ID\s\+//g'
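For anyone who prefers not to depend on GNU grep/sed, here is a minimal Python sketch of the same extraction. It assumes only the two DE forms quoted above ("Transferred entry: ..." and "Deleted entry.") and that every entry starts with an `ID` line and ends with `//`. The function name `obsolete_ec_ids` is made up for illustration and is not part of any existing GO tooling.

```python
import re

# DE line of an obsolete entry, in either of the two forms quoted in this ticket.
OBSOLETE_DE = re.compile(r"^DE\s+(Deleted|Transferred) entry")


def obsolete_ec_ids(lines):
    """Yield (ec_id, kind) for each enzyme.dat entry marked Deleted or Transferred."""
    current_id = None
    for line in lines:
        if line.startswith("//"):
            # Entry terminator: forget the previous ID.
            current_id = None
        elif line.startswith("ID "):
            # "ID   1.1.1.139" -> "1.1.1.139"
            current_id = line.split()[1]
        elif current_id is not None:
            match = OBSOLETE_DE.match(line)
            if match:
                yield current_id, match.group(1)
```

Passing an open file handle for enzyme.dat as `lines` should give the same ID list as the shell pipeline, with the extra information of whether each entry was deleted or transferred.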

rwst commented 5 years ago

Using this I find 1297 lines referring to obsolete EC numbers, all in def or synonym statements. The 269 unique numbers referred to are: 1.10.2.2, 1.10.9.1, 1.10.99.1, 1.10.99.2, 1.10.99.3, 1.1.1.128, 1.1.1.158, 1.1.1.161, 1.1.1.246, 1.1.1.63, 1.12.99.2, 1.13.11.13, 1.13.11.32, 1.13.11.44, 1.13.12.11, 1.13.12.12, 1.13.12.14, 1.1.3.3, 1.1.4.1, 1.14.11.14, 1.14.11.19, 1.14.11.22, 1.14.11.23, 1.14.12.4, 1.14.12.5, 1.14.13.100, 1.14.13.104, 1.14.13.108, 1.14.13.109, 1.14.13.11, 1.14.13.117, 1.14.13.118, 1.14.13.12, 1.14.13.13, 1.14.13.132, 1.14.13.137, 1.14.13.138, 1.14.13.139, 1.14.13.140, 1.14.13.141, 1.14.13.142, 1.14.13.143, 1.14.13.144, 1.14.13.145, 1.14.13.15, 1.14.13.17, 1.14.13.21, 1.14.13.26, 1.14.13.28, 1.14.13.3, 1.14.13.30, 1.14.13.36, 1.14.13.37, 1.14.13.41, 1.14.13.42, 1.14.13.47, 1.14.13.48, 1.14.13.52, 1.14.13.53, 1.14.13.55, 1.14.13.56, 1.14.13.57, 1.14.13.60, 1.14.13.67, 1.14.13.68, 1.14.13.70, 1.14.13.71, 1.14.13.72, 1.14.13.73, 1.14.13.74, 1.14.13.75, 1.14.13.76, 1.14.13.77, 1.14.13.79, 1.14.13.80, 1.14.13.85, 1.14.13.86, 1.14.13.87, 1.14.13.88, 1.14.13.89, 1.14.13.90, 1.14.13.91, 1.14.13.93, 1.14.13.94, 1.14.13.95, 1.14.13.96, 1.14.13.97, 1.14.13.98, 1.14.13.99, 1.14.15.2, 1.1.4.2, 1.14.21.1, 1.14.21.2, 1.14.21.3, 1.14.21.4, 1.14.21.5, 1.14.21.6, 1.14.99.10, 1.14.99.27, 1.14.99.28, 1.14.99.3, 1.14.99.30, 1.14.99.31, 1.14.99.32, 1.14.99.33, 1.14.99.36, 1.14.99.9, 1.1.5.6, 1.17.1.2, 1.17.99.1, 1.17.99.5, 1.1.99.10, 1.1.99.23, 1.1.99.8, 1.2.1.1, 1.2.1.2, 1.2.1.40, 1.2.1.43, 1.2.1.66, 1.2.2.2, 1.2.2.3, 1.2.3.11, 1.2.7.2, 1.2.99.2, 1.2.99.3, 1.2.99.4, 1.2.99.5, 1.3.1.23, 1.3.1.26, 1.3.1.30, 1.3.1.35, 1.3.1.4, 1.3.1.52, 1.3.1.59, 1.3.1.63, 1.3.3.1, 1.3.3.9, 1.3.99.1, 1.3.99.10, 1.3.99.13, 1.3.99.21, 1.3.99.22, 1.3.99.3, 1.3.99.7, 1.4.99.1, 1.4.99.3, 1.5.1.12, 1.5.1.29, 1.5.1.35, 1.5.3.11, 1.5.99.11, 1.5.99.8, 1.5.99.9, 1.6.6.9, 1.6.99.5, 1.7.3.4, 1.7.99.8, 1.8.99.1, 1.8.99.3, 1.97.1.10, 1.97.1.11, 1.97.1.3, 1.97.1.8, 1.9.99.1, 2.1.1.124, 2.1.1.125, 
2.1.1.126, 2.1.1.138, 2.1.1.149, 2.1.1.29, 2.1.1.31, 2.1.1.32, 2.1.1.36, 2.1.1.48, 2.1.1.51, 2.1.1.52, 2.1.1.66, 2.3.1.104, 2.3.1.119, 2.3.1.128, 2.3.1.154, 2.3.1.70, 2.3.1.88, 2.3.1.96, 2.4.1.119, 2.4.1.130, 2.4.1.157, 2.4.1.163, 2.4.1.45, 2.4.1.57, 2.4.1.95, 2.4.2.23, 2.4.99.11, 2.5.1.11, 2.6.1.68, 2.7.1.69, 2.7.7.21, 2.7.7.25, 2.7.7.54, 2.7.7.55, 2.7.8.25, 2.8.3.7, 3.1.1.21, 3.1.2.15, 3.1.2.26, 3.1.27.1, 3.1.27.2, 3.1.27.4, 3.1.27.5, 3.1.27.6, 3.1.27.9, 3.1.3.13, 3.2.1.110, 3.3.2.5, 3.4.13.3, 3.4.21.87, 3.5.1.27, 3.5.3.19, 3.5.4.14, 3.5.99.3, 3.5.99.4, 3.6.1.19, 3.6.1.30, 3.6.1.47, 3.6.1.48, 3.6.3.10, 3.6.3.14, 3.6.3.15, 3.6.3.16, 3.6.3.17, 3.6.3.25, 3.6.3.31, 3.6.3.47, 3.6.3.48, 3.6.3.49, 3.6.3.50, 3.6.3.51, 3.6.3.52, 3.6.4.1, 3.6.4.11, 3.6.4.2, 3.6.4.3, 3.6.4.4, 3.6.4.5, 3.6.4.8, 3.6.4.9, 4.1.1.41, 4.1.1.70, 4.1.2.30, 4.1.2.37, 4.2.1.4, 4.2.1.52, 4.2.1.58, 4.2.1.60, 4.2.1.61, 4.2.1.89, 4.2.3.14, 4.3.1.11, 5.2.1.3, 5.3.3.15, 5.4.1.2, 5.4.2.1, 5.99.1.2, 6.1.1.25, 6.2.1.29, 6.3.2.15, 6.3.2.21, 6.3.2.27, 6.3.2.28

rwst commented 5 years ago

I also find 236 unique obsolete EC numbers in 251 xref statements: 1.10.2.2, 1.10.9.1, 1.10.99.2, 1.10.99.3, 1.1.1.128, 1.1.1.158, 1.1.1.161, 1.1.1.246, 1.13.11.13, 1.13.12.12, 1.1.3.3, 1.1.4.1, 1.14.11.14, 1.14.11.19, 1.14.11.22, 1.14.11.23, 1.14.12.4, 1.14.12.5, 1.14.13.100, 1.14.13.104, 1.14.13.108, 1.14.13.109, 1.14.13.11, 1.14.13.112, 1.14.13.117, 1.14.13.118, 1.14.13.12, 1.14.13.13, 1.14.13.132, 1.14.13.134, 1.14.13.136, 1.14.13.137, 1.14.13.138, 1.14.13.139, 1.14.13.140, 1.14.13.141, 1.14.13.142, 1.14.13.143, 1.14.13.144, 1.14.13.145, 1.14.13.15, 1.14.13.157, 1.14.13.17, 1.14.13.173, 1.14.13.192, 1.14.13.193, 1.14.13.21, 1.14.13.26, 1.14.13.28, 1.14.13.36, 1.14.13.37, 1.14.13.41, 1.14.13.47, 1.14.13.48, 1.14.13.49, 1.14.13.52, 1.14.13.53, 1.14.13.55, 1.14.13.56, 1.14.13.57, 1.14.13.60, 1.14.13.67, 1.14.13.68, 1.14.13.70, 1.14.13.71, 1.14.13.72, 1.14.13.73, 1.14.13.74, 1.14.13.75, 1.14.13.76, 1.14.13.77, 1.14.13.78, 1.14.13.80, 1.14.13.85, 1.14.13.86, 1.14.13.87, 1.14.13.88, 1.14.13.89, 1.14.13.90, 1.14.13.91, 1.14.13.93, 1.14.13.94, 1.14.13.95, 1.14.13.96, 1.14.13.97, 1.14.13.98, 1.14.13.99, 1.14.15.2, 1.1.4.2, 1.14.20.2, 1.14.21.1, 1.14.21.2, 1.14.21.3, 1.14.21.4, 1.14.21.5, 1.14.21.6, 1.14.99.10, 1.14.99.27, 1.14.99.28, 1.14.99.3, 1.14.99.30, 1.14.99.31, 1.14.99.32, 1.14.99.33, 1.14.99.36, 1.14.99.9, 1.1.5.6, 1.17.1.2, 1.17.99.1, 1.17.99.5, 1.1.99.10, 1.1.99.8, 1.2.1.2, 1.2.1.40, 1.2.1.43, 1.2.1.66, 1.2.2.3, 1.2.3.11, 1.2.7.2, 1.2.99.2, 1.2.99.4, 1.2.99.5, 1.3.1.30, 1.3.1.35, 1.3.1.4, 1.3.1.52, 1.3.1.59, 1.3.1.63, 1.3.3.1, 1.3.3.9, 1.3.99.1, 1.3.99.10, 1.3.99.13, 1.3.99.21, 1.3.99.22, 1.3.99.3, 1.3.99.7, 1.4.99.1, 1.4.99.3, 1.5.1.12, 1.5.3.11, 1.5.99.11, 1.5.99.8, 1.5.99.9, 1.6.6.9, 1.6.99.5, 1.7.3.4, 1.7.99.8, 1.8.99.1, 1.8.99.3, 1.97.1.10, 1.97.1.11, 1.97.1.3, 1.97.1.8, 1.9.99.1, 2.1.1.124, 2.1.1.125, 2.1.1.149, 2.1.1.51, 2.1.1.52, 2.1.1.66, 2.3.1.104, 2.3.1.119, 2.3.1.128, 2.3.1.154, 2.3.1.70, 2.3.1.88, 2.3.1.96, 2.4.1.119, 2.4.1.130, 2.4.1.157, 
2.4.1.163, 2.4.1.45, 2.4.1.57, 2.4.1.95, 2.4.2.23, 2.4.99.11, 2.5.1.77, 2.6.1.68, 2.7.1.69, 2.7.7.21, 2.7.7.25, 2.7.7.54, 2.7.7.55, 2.7.8.25, 2.8.3.7, 3.1.2.15, 3.1.2.26, 3.1.27.1, 3.1.27.2, 3.1.27.4, 3.1.27.5, 3.1.27.6, 3.1.27.9, 3.1.3.13, 3.3.2.5, 3.4.13.3, 3.4.21.87, 3.5.1.27, 3.5.3.19, 3.5.4.14, 3.5.99.3, 3.5.99.4, 3.6.1.19, 3.6.1.30, 3.6.3.10, 3.6.3.16, 3.6.3.17, 3.6.3.25, 3.6.3.47, 3.6.3.49, 3.6.3.50, 3.6.3.51, 3.6.3.52, 3.6.4.3, 3.6.4.4, 3.6.4.5, 4.1.1.41, 4.1.1.70, 4.1.2.30, 4.1.2.37, 4.2.1.4, 4.2.1.58, 4.2.1.60, 4.2.1.61, 4.2.1.89, 4.2.3.14, 4.3.1.11, 5.3.3.15, 5.4.1.2, 5.4.2.1, 5.99.1.2, 5.99.1.3, 6.1.1.25, 6.3.2.27, 6.3.2.28
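The counts above (lines in def/synonym statements versus xref statements) could be reproduced with something like the following sketch. It assumes the OBO file is read line by line, that the tag is everything before the first colon, and that EC references always look like `EC:x.y.z.w` with four numeric fields. The function name `obsolete_ec_usage` is hypothetical. Note that it does not check whether the enclosing GO term is itself obsolete.

```python
import re
from collections import Counter

# Four numeric fields after "EC:"; the word boundary avoids matching inside other prefixes.
EC_REF = re.compile(r"\bEC:(\d+\.\d+\.\d+\.\d+)")


def obsolete_ec_usage(obo_lines, obsolete_ids):
    """Return (line_counts, ids_by_tag) for OBO lines that cite an obsolete EC number.

    line_counts maps an OBO tag (def, synonym, xref, ...) to the number of
    matching lines; ids_by_tag maps the tag to the set of obsolete EC ids seen.
    """
    line_counts = Counter()
    ids_by_tag = {}
    for line in obo_lines:
        tag, sep, _ = line.partition(":")
        if not sep:
            continue  # blank lines and stanza headers such as [Term]
        hits = {ec for ec in EC_REF.findall(line) if ec in obsolete_ids}
        if hits:
            line_counts[tag] += 1
            ids_by_tag.setdefault(tag, set()).update(hits)
    return line_counts, ids_by_tag
```

Because the regular expression captures the whole number, `EC:1.14.13.5` is not mistaken for `EC:1.14.13.55`, which a plain substring grep could do.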

rwst commented 5 years ago

One way to fix some of the xrefs would be to take the current EC number from Rhea, which is apparently not done, e.g.:

id: GO:0047087
name: protopine 6-monooxygenase activity
def: "Catalysis of the reaction: H(+) + NADPH + O(2) + protopine = 6-hydroxyprotopine + H(2)O + NADP(+)." [EC:1.14.13.55, RHEA:22644]
synonym: "protopine 6-hydroxylase activity" EXACT []
synonym: "protopine,NADPH:oxygen oxidoreductase (6-hydroxylating)" EXACT [EC:1.14.13.55]
xref: EC:1.14.13.55
xref: KEGG_REACTION:R04699
xref: MetaCyc:1.14.13.55-RXN
xref: RHEA:22644

...but https://www.rhea-db.org/reaction?id=22644 has EC 1.14.14.98.
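A rough sketch of how such mismatches could be detected per term stanza is below. The `rhea_to_ec` mapping is a hypothetical input (RHEA id to set of EC numbers); how it would be obtained from Rhea is not addressed here, and the function name `ec_mismatch` is invented for illustration.

```python
def ec_mismatch(stanza_lines, rhea_to_ec):
    """Compare a stanza's EC xrefs with the ECs known for its RHEA xrefs.

    Returns (go_ecs, rhea_ecs) when both are known and differ, otherwise None.
    rhea_to_ec is an assumed mapping such as {"22644": {"1.14.14.98"}}.
    """
    ec_prefix, rhea_prefix = "xref: EC:", "xref: RHEA:"
    go_ecs, rhea_ecs = set(), set()
    for line in stanza_lines:
        if line.startswith(ec_prefix):
            go_ecs.add(line[len(ec_prefix):].split()[0])
        elif line.startswith(rhea_prefix):
            rhea_id = line[len(rhea_prefix):].split()[0]
            # RHEA ids without a known EC contribute nothing.
            rhea_ecs |= rhea_to_ec.get(rhea_id, set())
    if rhea_ecs and go_ecs != rhea_ecs:
        return go_ecs, rhea_ecs
    return None
```

For the GO:0047087 stanza above and a mapping containing `"22644": {"1.14.14.98"}`, this would report `1.14.13.55` on the GO side against `1.14.14.98` on the Rhea side.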

hdrabkin commented 5 years ago

Not every EC is referred to by Rhea. But this has probably accumulated over time. It would be nice to have a monthly report of EC changes like this so that we aren't hit with thousands to fix at once! I can start on the xref usage (and hopefully hit the def ref while I'm there).

rwst commented 5 years ago

Note that these include both Transferred and Deleted entries, so some of them have changed rather than disappeared. Also, if you just remove or change the xref lines, I can automate that and do a pull request. No need to do this manually.
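As an illustration of what such automation might look like (a sketch under stated assumptions, not the script that was actually used): `transferred` is assumed to map an old EC number to a list of successor EC numbers, `deleted` is a set of deleted EC numbers, and input lines carry no trailing newline. The function name `rewrite_ec_xrefs` is hypothetical.

```python
def rewrite_ec_xrefs(obo_lines, transferred, deleted):
    """Rewrite 'xref: EC:...' lines; return (new_lines, needs_review).

    - xrefs to deleted ECs are dropped,
    - xrefs to ECs transferred to exactly one successor are replaced,
    - xrefs to ECs with several successors are kept and listed in needs_review.
    """
    prefix = "xref: EC:"
    new_lines, needs_review = [], []
    for line in obo_lines:
        if not line.startswith(prefix):
            new_lines.append(line)
            continue
        # Keep anything after the EC number (description, trailing modifiers).
        ec_id, _, rest = line[len(prefix):].partition(" ")
        if ec_id in deleted:
            continue
        targets = transferred.get(ec_id)
        if targets is None:
            new_lines.append(line)
        elif len(targets) == 1:
            new_lines.append(prefix + targets[0] + (" " + rest if rest else ""))
        else:
            new_lines.append(line)
            needs_review.append(ec_id)
    return new_lines, needs_review
```

This only touches xref lines. EC references inside def and synonym brackets, and duplicate xrefs that a replacement might create, would still need separate handling.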

hdrabkin commented 5 years ago

If any are split, I need to look at the split. And eventually we DO plan to cull RHEA for ECs, but this is not implemented yet. Part of the reason I'm doing this manually is that I also check whether there is a RHEA and, if so, add it.

hdrabkin commented 5 years ago

Started; checked out 50 for rhea; replaced about 25 today.

rwst commented 5 years ago

I just noticed that many of the deleted ECs I picked up come from obsoleted GO entries, sorry. The actual number of deleted ECs that are not in obsoleted entries is only a few dozen.

hdrabkin commented 5 years ago

no problem. working it through.

hdrabkin commented 5 years ago

up to 90 replaced thus far.

hdrabkin commented 5 years ago

I have done all of the low-hanging (simple replacement) ones, including adding RHEA ids for ones that didn't have one to begin with. Now working on more complicated mappings.

rwst commented 5 years ago

Thanks for your work!

hdrabkin commented 4 years ago

Added a few more fixes

hdrabkin commented 4 years ago

I was informed today that apparently we have fixed all with obsoleted ECs (based on a file from Expasy). I'll close this ticket. If we find others, I would prefer a new ticket to be made.

pgaudet commented 3 years ago

I still find many of those ECs in the ontology.

hdrabkin commented 3 years ago

I don't remember who told me these were all fixed; SO, what I want is the CURRENT list, not the old one. I don't do scripts. @balhoff, can you do something like this?

hdrabkin commented 3 years ago

Note that if the corrections were to be done computationally, attention would need to focus on the parentage of the GO term. If one of the parents were one of our grouping terms mapping to a 3-digit EC, the parent would also need to be changed.
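A small sketch of how a script could flag those cases: compare the first three fields of the old and new EC numbers and report any replacement where they differ, so the term's parent can be reviewed by hand. The function name `ec_class_changed` is made up, and the check says nothing about which GO grouping term is the right new parent.

```python
def ec_class_changed(old_ec, new_ec, depth=3):
    """Return True when the first `depth` fields of two EC numbers differ.

    With depth=3, "1.1.1.139" and "1.1.1.21" share the class 1.1.1, so a
    parent mapped to that 3-digit EC would stay valid.
    """
    return old_ec.split(".")[:depth] != new_ec.split(".")[:depth]
```

For the transferred entry quoted at the top of this ticket (1.1.1.139 to 1.1.1.21) the check returns False. For a move such as 1.14.13.55 to 1.14.14.98, if that is indeed the successor, it returns True and the parent would need a look.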

rwst commented 3 years ago

I had a first look at this. If I'm not mistaken, a further 34 EC entries have been obsoleted since the start of this ticket. Of these, the following 19 appear in a recent go.obo:

hdrabkin commented 3 years ago

Thanks @rwst. What we need is a monthly check on the status of our xrefs, especially EC and RHEA, so that we aren't faced with a lot at once. Of course, it's dependent upon how often the public targets for the xrefs get changed.

hdrabkin commented 3 years ago

I'll play with the new set later today in my mechanic's waiting room.

pgaudet commented 1 year ago

Out of date - created new tickets from a new query:

There were 2 more deleted ECs in the dbxrefs: https://github.com/geneontology/go-ontology/issues/24923 and https://github.com/geneontology/go-ontology/issues/24922. There are about 50 transferred ECs: https://github.com/geneontology/go-ontology/issues/24927