Open ctb opened 2 years ago
Hey, I think it might be better to add -l "complete genome" to
genome_updater.sh -T 22 -d "genbank" -f "genomic.fna.gz" -o shew -t 6
This would reduce the download size (a lot), and complete genomes represent each species better.
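For reference, here is a rough sketch of what the modified call might look like, assuming -l takes the assembly level as suggested above. It only prints the command by default; the RUN variable is a hypothetical switch made up for this example, and the download starts only when RUN=1 is set and genome_updater.sh is on the PATH.

```sh
# Sketch: the original genome_updater.sh call with the suggested -l filter added.
# The -l value comes from the suggestion above; RUN is a hypothetical switch
# that exists only in this example.
set -- genome_updater.sh -T 22 -d "genbank" -f "genomic.fna.gz" \
    -l "complete genome" -o shew -t 6

# Show each argument in brackets so the quoting of "complete genome" is visible.
printf '[%s] ' "$@"
echo

# Start the download only when explicitly requested and the tool is installed.
if [ "${RUN:-0}" = "1" ] && command -v genome_updater.sh >/dev/null 2>&1; then
    "$@"
fi
```

Comparing the file counts in the two output directories (with and without -l) would show how much the filter actually saves for this taxid.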
I wanted to build a database with custom k-size values containing all genomes under taxid Shewanella.
First, I installed https://github.com/pirovc/genome_updater.
Then I ran:
which produced
shew/2022-05-06_13-58-43/
Next, I ran:
(using a checkout of https://github.com/sourmash-bio/database-examples at this release.)
This resulted in the following output:
Using this, I could then run
sourmash sketch fromfile shewanella.csv -p <my parameters here> ...
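As a rough illustration of the last step, the sketch below fills in the parameter placeholder with example values. The k-sizes, the scaled value and the output filename are assumptions chosen for this example, not the values used in the original run; the command is printed first and only executed if sourmash and shewanella.csv are actually present.

```sh
# Sketch: sketch every genome listed in shewanella.csv with custom k-sizes.
# PARAMS and OUT are illustrative assumptions, not values from the original post.
PARAMS="dna,k=21,k=31,scaled=1000"
OUT="shewanella.zip"

# Print the command so it can be checked before anything runs.
echo "sourmash sketch fromfile shewanella.csv -p $PARAMS -o $OUT"

# Run it only if sourmash is installed and the CSV from the previous step exists.
if command -v sourmash >/dev/null 2>&1 && [ -f shewanella.csv ]; then
    sourmash sketch fromfile shewanella.csv -p "$PARAMS" -o "$OUT"
fi
```

Several k-sizes can go in one -p string as shown, so the resulting file holds one sketch per genome per k-size.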