fhcrc / deenurp

16S rRNA gene sequence curation and phylogenetic reference set creation
GNU General Public License v3.0

misclassification of unclassified sequences in rdp_extract_genbank #27

Closed: crosenth closed this issue 9 years ago

crosenth commented 9 years ago

see https://github.com/fhcrc/deenurp/blob/master/deenurp/subcommands/rdp_extract_genbank.py#L31

The query needs a clause limiting its results to scientific names (scientific_name = 1).

For example, run this query against ncbi_taxonomy.db: 'select * from names where tax_id = 1650571'

crosenth commented 9 years ago

That is, the clause should be: where is_primary=1
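
A minimal sketch of what such a clause changes, in case it helps anyone reading this later. It does not touch the real ncbi_taxonomy.db: the in-memory table, its three-column schema (tax_id, tax_name, is_primary), the tax_id "100", and the placeholder names are all assumptions made up for illustration, and names_for is a hypothetical helper, not a deenurp function. The point is only that a tax_id can carry several rows in names, so an unfiltered lookup may return more than one name.

```python
import sqlite3

# Hypothetical, simplified stand-in for the names table in ncbi_taxonomy.db.
# The real schema may have more columns; only these three are assumed here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names (tax_id TEXT, tax_name TEXT, is_primary INTEGER)"
)
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [
        # Placeholder rows: one primary name plus two alternate names.
        ("100", "placeholder primary name", 1),
        ("100", "placeholder synonym", 0),
        ("100", "unclassified placeholder", 0),
    ],
)


def names_for(conn, tax_id, primary_only):
    """Return the names recorded for tax_id, optionally only the primary one."""
    sql = "SELECT tax_name FROM names WHERE tax_id = ?"
    if primary_only:
        # The clause proposed in this issue.
        sql += " AND is_primary = 1"
    sql += " ORDER BY tax_name"
    return [row[0] for row in conn.execute(sql, (tax_id,))]


all_names = names_for(conn, "100", primary_only=False)
primary_names = names_for(conn, "100", primary_only=True)

print(all_names)      # every name row for the tax_id
print(primary_names)  # only the row flagged is_primary = 1
```

Without the filter the lookup returns all three rows for the tax_id; with it, only the row flagged as primary comes back.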

crosenth commented 9 years ago

Issue solved by updating tax_ids from GenBank records.