Closed tyleraland closed 8 years ago
@crosenth - is this still an issue, or has the behavior been updated since Tyler opened this?
Yes:
positional arguments:
  infile    Input CSV file to process, minimally containing the
            fields tax_id. Rows with missing tax_ids are left
            unchanged.
...
What I want to do is parse some BLAST output, which gives me one or more staxids per hit, and map each sequence to its species-level taxid.
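For illustration, collecting the taxids from tabular BLAST output might look like the sketch below. It assumes tab-separated output in which one column holds the staxids value, with multiple taxids joined by `;`. The function name `collect_taxids` and the column index are hypothetical and depend on the `-outfmt` string actually used.

```python
import csv
import io


def collect_taxids(blast_tsv, staxids_col):
    """Return the unique numeric taxids found in one column of tabular BLAST output.

    blast_tsv   -- an open text file (or any iterable of lines)
    staxids_col -- zero-based index of the staxids column (an assumption;
                   it depends on the -outfmt string used for the search)
    """
    taxids = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        # One hit may carry several taxids separated by ';'.
        for taxid in row[staxids_col].split(";"):
            taxid = taxid.strip()
            # Keep only numeric ids; this also drops placeholders such as "N/A".
            if taxid.isdigit():
                taxids.add(taxid)
    return sorted(taxids, key=int)


# Hypothetical three-column output: query, subject, staxids.
example = "read1\tsubjA\t9606;9598\nread2\tsubjB\t562\nread3\tsubjC\tN/A\n"
print(collect_taxids(io.StringIO(example), staxids_col=2))
# → ['562', '9598', '9606']
```

This only gathers the ids; it does not check whether each one exists in the taxonomy database, which is the filtering problem described next.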
Here is my ideal workflow:
Unfortunately, on rare occasions BLAST will give me a taxid that is not in their taxonomy database (and thus not in the taxdb), so I need to filter it out.
As a consequence, I need to use update_taxids to clean the taxid list first. However, update_taxids requires that my taxid list be a seq_info CSV mapping a seqname column to a tax_id column. This information doesn't really make sense in my case (a single seqname may map to multiple taxids, but taxtable doesn't need that information). So three extra steps are required:
As far as I can tell, the seq_name portion of the seq_info file is not used, but it is required. If update_taxids allowed a list of taxids it would simplify the interface.
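To make the workaround concrete, here is a minimal sketch of wrapping a bare taxid list in a seq_info-style CSV with a throwaway sequence-name column. The header names `seqname` and `tax_id` are taken from the description above; the function name and the `placeholder_N` values are hypothetical, and whether update_taxids accepts exactly this header is an assumption.

```python
import csv
import io


def write_placeholder_seq_info(taxids, out, seqname_field="seqname", taxid_field="tax_id"):
    """Write taxids as a two-column CSV with dummy sequence names.

    The sequence names carry no meaning; they exist only because the
    input format requires that column.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([seqname_field, taxid_field])
    for i, taxid in enumerate(taxids, start=1):
        writer.writerow([f"placeholder_{i}", taxid])


buf = io.StringIO()
write_placeholder_seq_info(["562", "9606"], buf)
print(buf.getvalue(), end="")
# → seqname,tax_id
# → placeholder_1,562
# → placeholder_2,9606
```

If update_taxids accepted a plain list of taxids, this wrapping step and the later step of stripping the dummy column back out would both be unnecessary.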