DaehwanKimLab / centrifuge

Classifier for metagenomic sequences
GNU General Public License v3.0

How to download taxonomy for bacterial genome only #205

Open bioinfonext opened 3 years ago

bioinfonext commented 3 years ago

Hi,

I understand that I can download all bacterial genomes and concatenate their FASTA sequences into a single file using the commands below:

# download all bacterial genomes from RefSeq to folder "library"
centrifuge-download -o library -m -d "bacteria" refseq > seqid2taxid.map

# create concatenated fasta file
cat library/*/*.fna > input-sequences.fna

Could you please suggest how I can download the corresponding taxonomy for these genome sequences, in a format like the one below?

GCF_000010525.1   Bacteria; Firmicutes; Bacilli; Lactobacillales; Enterococcaceae; Enterococcus   Enterococcus faecium DO
GCF_000007365.1   Bacteria; Proteobacteria; Gammaproteobacteria; Xanthomonadales; Xanthomonadaceae; Xylella   Xylella

Many thanks, bioinfonext
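
One possible way to build a table like the one requested above is to join seqid2taxid.map with the NCBI taxonomy dump files (nodes.dmp and names.dmp). The sketch below is not a Centrifuge feature: `lineage_table` is a hypothetical helper name, and the list of ranks kept in the lineage is an assumption that can be adjusted. It also assumes that the first column of seqid2taxid.map holds sequence IDs, so the output is keyed on those IDs rather than on GCF_ assembly accessions.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Centrifuge): print
#   seqid <TAB> lineage <TAB> scientific name
# for every row of a two-column, tab-separated seqid-to-taxid map.
# Usage: lineage_table nodes.dmp names.dmp seqid2taxid.map
lineage_table() {
  awk -F'\t' '
    # nodes.dmp: columns are separated by "\t|\t", so with a tab separator
    # the taxid is $1, the parent taxid is $3 and the rank is $5.
    FILENAME == ARGV[1] { parent[$1] = $3; rank[$1] = $5; next }

    # names.dmp: keep only scientific names ($7 is the name class).
    FILENAME == ARGV[2] { if ($7 == "scientific name") name[$1] = $3; next }

    # Map file: walk from the taxid up to the root, keeping selected ranks.
    # Assumption: the top rank is labelled "superkingdom" or "domain",
    # depending on the age of the taxonomy dump.
    {
      lineage = ""
      t = $2
      while ((t in parent) && parent[t] != t) {
        if (rank[t] ~ /^(superkingdom|domain|phylum|class|order|family|genus)$/)
          lineage = (lineage == "" ? name[t] : name[t] "; " lineage)
        t = parent[t]
      }
      print $1 "\t" lineage "\t" name[$2]
    }
  ' "$1" "$2" "$3"
}
```

As far as I know, the Centrifuge manual obtains the dump files with `centrifuge-download -o taxonomy taxonomy`, after which the helper could be called as `lineage_table taxonomy/nodes.dmp taxonomy/names.dmp seqid2taxid.map > lineages.tsv`. The output file name here is only an example.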

dafin678 commented 12 months ago

Hello, I always get this error:

"Download failed! have a look at valid domains at ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq ."

Could you help me, please?