DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

There is an error running "kraken2-build --download-taxonomy --db $DBNAME --threads 24". #753

Open KeyLllll opened 10 months ago

KeyLllll commented 10 months ago

I used this command to download taxonomy: kraken2-build --download-taxonomy --db $DBNAME --threads 24

and got this error message:

Downloading nucleotide gb accession to taxon map...rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41f:250::229): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::7): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (165.112.9.229): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (165.112.9.228): Network is unreachable (101)
rsync error: error in socket IO (code 10) at clientserver.c(125) [Receiver=3.1.2]

Then I added the "--use-ftp" option and ran this command to download the taxonomy: kraken2-build --download-taxonomy --db $DBNAME --threads 24 --use-ftp

and the error message was: Downloading nucleotide gb accession to taxon map...

Is there any solution for this? Thank you so much!

SimonHegele commented 10 months ago

Hey you, I also had issues building a custom database. Apparently there have been structural changes at NCBI that Kraken2 has not been adapted to. If that is the reason in your case, you might find a solution in this thread: https://github.com/DerrickWood/kraken2/issues/465. However, the easiest thing would probably be to go for one of the prebuilt databases, which you can download here: https://benlangmead.github.io/aws-indexes/k2.

KeyLllll commented 10 months ago

@SimonHegele Hello, I downloaded the database through https://benlangmead.github.io/aws-indexes/k2. But when I ran "kraken2-build --build --threads 24 --db $DBNAME", I encountered an error: Can't find taxonomy/ subdirectory in database directory, exiting. Do you know how to solve it? Thanks!

SimonHegele commented 10 months ago

Hey, you don't need this command since the database you downloaded is already built. You can start classifying right away :)
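A minimal sketch of how to confirm that a downloaded database is already built, assuming the archive was extracted into the directory named by $DBNAME. A built Kraken 2 database consists of the files hash.k2d, opts.k2d and taxo.k2d, so checking for them shows whether "kraken2-build --build" is needed at all. The function name is_built_db and the file names reads.fq, report.txt and output.txt are hypothetical placeholders, not part of Kraken 2.

```shell
# Hypothetical helper: succeeds if the given directory holds the three
# files that make up a built Kraken 2 database.
is_built_db() {
    db_dir="$1"
    for f in hash.k2d opts.k2d taxo.k2d; do
        [ -f "$db_dir/$f" ] || return 1
    done
    return 0
}

# Example usage (reads.fq, report.txt and output.txt are placeholders):
# if is_built_db "$DBNAME"; then
#     kraken2 --db "$DBNAME" --threads 24 --report report.txt --output output.txt reads.fq
# else
#     echo "No built database found in $DBNAME" >&2
# fi
```

If the check fails on a freshly extracted prebuilt index, the archive was probably unpacked into a different directory than the one passed to --db.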

KeyLllll commented 10 months ago

@SimonHegele Thank you very much! :)

vera-rykalina commented 8 months ago

Hi there! I have the same error (kraken2 v2.1.2) for kraken2-build --db hiv_krakendb --download-taxonomy:

Downloading nucleotide gb accession to taxon map...rsync: [Receiver] failed to connect to ftp.ncbi.nlm.nih.gov (130.14.250.11): Connection refused (111)
rsync: [Receiver] failed to connect to ftp.ncbi.nlm.nih.gov (130.14.250.12): Connection refused (111)
rsync: [Receiver] failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::12): Network is unreachable (101)
rsync: [Receiver] failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::11): Network is unreachable (101)
rsync error: error in socket IO (code 10) at clientserver.c(139) [Receiver=3.2.7]

Hope it will be fixed soon!

SimonHegele commented 8 months ago

Guess you are interested in HIV. Download the prebuilt Viral database or the Standard database from https://benlangmead.github.io/aws-indexes/k2

jenniferlu717 commented 6 months ago

Unfortunately, rsync errors like these usually come from the network connectivity between your own server and NCBI. Sometimes firewalls prevent users from downloading directly from NCBI, so I would check out the prebuilt indexes or try again later.
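As a rough illustration of how the two logs in this thread differ, the sketch below reads rsync output on stdin and prints a coarse diagnosis. It relies only on the errno text shown in the messages above: "Connection refused (111)" means something answered and rejected the connection (the server itself or a firewall in between), while "Network is unreachable (101)" means the machine has no route to that address (for example no outbound access, or no IPv6 connectivity). The function name diagnose_rsync_log and the file name download.log are hypothetical, and the output is a hint rather than a definitive diagnosis.

```shell
# Hypothetical helper: classify an rsync error log read from stdin.
# "refused" is checked first because a log often mixes refused IPv4
# attempts with unreachable IPv6 attempts, and the refusal is the more
# informative of the two.
diagnose_rsync_log() {
    log=$(cat)
    case "$log" in
        *"Connection refused (111)"*)
            echo "refused: the connection was actively rejected (server or firewall)" ;;
        *"Network is unreachable (101)"*)
            echo "unreachable: no route to the host from this machine" ;;
        *)
            echo "unknown: no recognised connection error in the log" ;;
    esac
}

# Example usage (download.log is a placeholder for saved output):
# kraken2-build --download-taxonomy --db "$DBNAME" 2>&1 | tee download.log
# diagnose_rsync_log < download.log
```

Applied to the logs above, the first report (every address unreachable, including IPv4) suggests the machine has no outbound route at all, while the second (IPv4 refused) suggests the rsync connection is being rejected somewhere along the path.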