broadinstitute / catch

A package for designing compact and comprehensive capture probe sets.
MIT License
76 stars, 16 forks

Add ability to automatically download sequences for a taxonomy #27

Closed haydenm closed 5 years ago

haydenm commented 5 years ago

This PR adds an option to specify an input dataset as an NCBI taxonomy ID. Given a taxonomy ID, CATCH fetches all accessions (genome neighbors) for that taxonomy, downloads the sequences of those accessions, and uses them as input.

The PR adds the ncbi_neighbors module, which makes calls to NCBI's Entrez system. To use the feature, pass an input dataset to design.py in the form download:TAXID, where TAXID is an NCBI taxonomy ID. The PR also updates the README to describe the feature and changes the main example to use it.
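For readers who want a concrete picture of the download:TAXID format, here is a minimal sketch. It is not the code in ncbi_neighbors; the function names parse_dataset_spec and efetch_fasta_url are hypothetical, and the sketch assumes sequences would be requested from NCBI's public E-utilities efetch endpoint as FASTA. It only parses the dataset string and builds a request URL; it does not look up genome neighbors or make any network call.

```python
from urllib.parse import urlencode

# Prefix from the PR description: datasets are given as download:TAXID.
DOWNLOAD_PREFIX = "download:"

# NCBI E-utilities efetch endpoint (assumed here; the PR only says "Entrez").
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def parse_dataset_spec(spec):
    """Return the taxonomy ID as an int if spec is download:TAXID, else None.

    Hypothetical helper: any other string is assumed to name a regular
    dataset or file and is left for the caller to handle.
    """
    if not spec.startswith(DOWNLOAD_PREFIX):
        return None
    taxid = spec[len(DOWNLOAD_PREFIX):]
    if not (taxid.isascii() and taxid.isdigit()):
        raise ValueError(f"expected a numeric taxonomy ID in {spec!r}")
    return int(taxid)


def efetch_fasta_url(accessions):
    """Build an efetch URL requesting FASTA records for the given accessions.

    Hypothetical helper: it only constructs the URL and sends nothing.
    """
    params = {
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(parse_dataset_spec("download:64320"))
    print(efetch_fasta_url(["NC_012532.1"]))
```

Keeping the parsing separate from the network step means the input format can be validated before any request is sent to NCBI.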

coveralls commented 5 years ago

Coverage Status

Coverage decreased (-0.5%) to 94.604% when pulling 30b9a53a9283fa8c8a14be66d7484a2955ef3a24 on download-seqs into bf97305e06afea15b8d3898d00771f89b75567a4 on master.
