fhcrc / taxtastic

Create and maintain phylogenetic "reference packages" of biological sequences.
GNU General Public License v3.0

Fast taxtable #87

Closed crosenth closed 7 years ago

crosenth commented 7 years ago

About a month's worth of work toward two goals: speeding up taxit taxtable and adding functionality to taxit add_nodes. One change is that ncbi_taxonomy.db is now a required positional argument rather than an optional argument for many subcommands, such as new_database, update_taxids and add_nodes. There is also a new RANKS table in ncbi_taxonomy.db, which holds a hierarchy of taxonomic ranks.

crosenth commented 7 years ago

Need to generate a report of all the intermediate ranks above species (below_ ranks) to decide whether we should just drop them entirely and reset the affected nodes' parent_ids.
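
A rough sketch of what such a report could look like, counting nodes per below_ rank. The table layout here (a `nodes` table with `tax_id`, `parent_id` and `rank` columns) and the function name `below_rank_report` are assumptions for illustration, not the actual schema or code in this branch:

```python
import sqlite3


def below_rank_report(conn):
    """Return (rank, node_count) rows for every rank starting with 'below_'.

    Assumes a hypothetical nodes(tax_id, parent_id, rank) table.
    """
    query = """
        SELECT rank, COUNT(*) AS n
        FROM nodes
        WHERE substr(rank, 1, 6) = 'below_'
        GROUP BY rank
        ORDER BY n DESC, rank
    """
    return conn.execute(query).fetchall()


# Tiny in-memory example with made-up tax_ids.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (tax_id TEXT PRIMARY KEY, parent_id TEXT, rank TEXT)"
)
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?)",
    [
        ("1", "1", "root"),
        ("10", "1", "genus"),
        ("11", "10", "below_genus"),
        ("12", "11", "species"),
        ("13", "10", "below_genus"),
        ("14", "1", "below_family"),
    ],
)

for rank, count in below_rank_report(conn):
    print(rank, count)
```

Pointing the same query at a real ncbi_taxonomy.db would only work if its nodes table matches the assumed columns.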

crosenth commented 7 years ago

Generate some taxtables from both this branch and master and compare their md5sums to make sure the output is exactly the same.
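
Something along these lines could do the comparison once the two taxtables have been written out. This is only a sketch: the helper names `md5sum` and `same_output` are made up here, and the file paths are left to whoever runs it:

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_a, path_b):
    """True if the two files are byte-for-byte identical according to MD5."""
    return md5sum(path_a) == md5sum(path_b)
```

A byte-level check like this is strict: it would also flag differences in row order or line endings, which seems to be the intent of "exactly the same".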