ModelSEED / ModelSEEDDatabase

This repository contains the definitive copy of the biochemistry and metadata used to construct models using the ModelSEED/ProbAnno approach.

Classifying aliases #109

Closed samseaver closed 5 years ago

samseaver commented 5 years ago

I've nearly finished sorting out aliases. I merged the key external aliases into a single file each for compounds and reactions, then used the 'source' field to capture whether the compound or reaction in question came from a Primary database, a Secondary database, or a Published model, as defined in the file "Source_Classifiers.txt". Any entity that never had, or no longer has, an appropriate entry in the alias file was labelled as an Orphan, and the few that were manually added by our team (me) were labelled as User.

This is how it breaks down for compounds:

24283 Primary Database
1978 Secondary Database
866 Published Model
566 Orphan

and for reactions:

20239 Primary Database
7176 Secondary Database
5757 Published Model
1539 Orphan
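For anyone wanting to reproduce a tally like the ones above, here is a minimal sketch of the labelling logic as described. It is not the script used for this change. The layout of "Source_Classifiers.txt" is not shown here, so the `SOURCE_CLASSES` mapping, the source names, the entity ids, and the precedence order (Primary over Secondary over Published Model when an entity has aliases from several sources) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical stand-in for Source_Classifiers.txt: alias source -> class.
# These source names are placeholders, not the file's real contents.
SOURCE_CLASSES = {
    "ExamplePrimaryDB": "Primary Database",
    "ExampleSecondaryDB": "Secondary Database",
    "ExampleModel2010": "Published Model",
}

# Assumed precedence when an entity has aliases from several source classes.
PRECEDENCE = ["Primary Database", "Secondary Database", "Published Model"]


def classify(alias_sources, manually_added=False):
    """Return the class label for one compound or reaction."""
    if manually_added:
        # Assumption: manually added entities are always labelled User.
        return "User"
    classes = {SOURCE_CLASSES[s] for s in alias_sources if s in SOURCE_CLASSES}
    for label in PRECEDENCE:
        if label in classes:
            return label
    # No recognized alias source remains, so the entity is an orphan.
    return "Orphan"


def tally(entities):
    """Count class labels over a mapping of entity id -> list of alias sources."""
    return Counter(classify(sources) for sources in entities.values())


# Hypothetical entity ids and alias sources, for illustration only.
example = {
    "cpd_a": ["ExamplePrimaryDB", "ExampleModel2010"],
    "cpd_b": ["ExampleSecondaryDB"],
    "cpd_c": ["ExampleModel2010"],
    "cpd_d": [],
}
counts = tally(example)
for label in PRECEDENCE + ["Orphan"]:
    print(counts[label], label)
```

Run against the real merged alias files, a loop like the last one would print one count per class in the same order as the breakdowns above.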

samseaver commented 5 years ago

It passed checks, so I'm merging.