DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

Cannot download taxonomy or build database #417

Open theo-allnutt-bioinformatics opened 3 years ago

theo-allnutt-bioinformatics commented 3 years ago

The taxonomy download will not complete - no files downloaded and error message:
kraken2-build --download-taxonomy --db bact --use-ftp
also tried:
kraken2-build --download-taxonomy --db bact
The bact/taxonomy directory is empty.

If I try to download the genomes:
kraken2-build --download-library bacteria --db bact
I get:
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (130.14.250.11): Connection timed out (110)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::7): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::11): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::12): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::10): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::13): Network is unreachable (101)
rsync error: error in socket IO (code 10) at clientserver.c(125) [Receiver=3.1.2]
Error downloading assembly summary file for bacteria, exiting.

If I try the ftp option:
kraken2-build --download-library bacteria --db bact --use-ftp
I get an error for all the genomes: No such file or directory
Processed 23253/23262 projects (0 sequence, 0 bp)...gzip: all/GCF_004804395.1_ASM480439v1_genomic.fna.gz: No such file or directory

I have checked, and I can connect to ftp.ncbi.nlm.nih.gov by ftp or wget with no problem; it just seems to be the kraken2 script that has an error.

Thanks,

Theo
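
The symptoms described above (ftp and wget work, rsync times out) would be consistent with only the rsync port being blocked on the local network, although nothing in this thread confirms that. A minimal sketch for checking it, assuming bash and the coreutils `timeout` command are available; `check_port` is a hypothetical helper name, and the port numbers are the standard ones for FTP (21), HTTPS (443) and the rsync daemon (873):

```shell
# Report whether a TCP connection to host:port can be opened within 5 seconds.
# check_port is a hypothetical helper, not part of kraken2.
check_port() {
  local host=$1 port=$2
  # bash's /dev/tcp pseudo-device opens a TCP connection; timeout bounds the wait.
  if timeout 5 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port unreachable"
    return 1
  fi
}

# With a host argument, probe FTP (21), HTTPS (443) and the rsync daemon (873).
if [ "$#" -ge 1 ]; then
  for port in 21 443 873; do
    check_port "$1" "$port"
  done
fi
```

Saved as a script and run with `ftp.ncbi.nlm.nih.gov` as its argument, a result where 21 and 443 are reachable but 873 is not would point at a firewall rule rather than at kraken2 itself.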

alirizaaribas-ibg commented 3 years ago

I also tried the following commands, but the database is only partially built.

/cm/shared/apps/kraken2/kraken2-build --standard --threads 16 --db /archive/db/kraken2db/maindb
/cm/shared/apps/kraken2/kraken2-build --standard --threads 16 --db /archive/db/kraken2db/maindb --use-ftp
/cm/shared/apps/kraken2/kraken2-build --standard --threads 16 --db /archive/db/kraken2db/maindb --use-ftp --no-masking

erinyoung commented 3 years ago

I am having the same issue

$ kraken2 --version
Kraken version 2.1.1
Copyright 2013-2020, Derrick Wood (dwood@cs.jhu.edu)

I want to create a custom db, so I first downloaded the taxonomy files, which only worked with --use-ftp:

kraken2-build --download-taxonomy --db human_sarscov2 --use-ftp

When it came to downloading the human database, however, I kept getting this error:

$ kraken2-build --download-library human --db human_sarscov2 --use-ftp
Error downloading assembly summary file for human, exiting.

I tried using rsync again, but that didn't work.

$ kraken2-build --download-library human --threads 10 --db human_sarscov2
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (130.14.250.11): Connection timed out (110)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (130.14.250.7): Connection timed out (110)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::13): Network is unreachable (101)
rsync: failed to connect to ftp.ncbi.nlm.nih.gov (2607:f220:41e:250::12): Network is unreachable (101)
rsync error: error in socket IO (code 10) at clientserver.c(125) [Receiver=3.1.2]
Error downloading assembly summary file for human, exiting.

I saw in other posts that a --use-wget flag used to be available, but I am unable to find that.
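
As a way to separate server problems from script problems, one could ask the NCBI server over HTTPS whether the assembly summary file exists at all. The sketch below is only an illustration: `summary_url` is a hypothetical helper, and the path layout (including the `vertebrate_mammalian/Homo_sapiens` directory for human) is an assumption about the NCBI server that is not stated anywhere in this thread.

```shell
# Build the HTTPS URL of a RefSeq assembly summary file.
# summary_url is a hypothetical helper; the path layout is an assumption.
summary_url() {
  local dir=$1
  # Assumption: human assemblies sit under vertebrate_mammalian/Homo_sapiens.
  if [ "$dir" = "human" ]; then
    dir="vertebrate_mammalian/Homo_sapiens"
  fi
  echo "https://ftp.ncbi.nlm.nih.gov/genomes/refseq/$dir/assembly_summary.txt"
}

# With a library name as argument, check that the file exists
# without downloading it.
if [ "$#" -ge 1 ]; then
  wget --spider --timeout=30 "$(summary_url "$1")"
fi
```

If `wget --spider` succeeds for `human` while `kraken2-build --download-library human` still fails, the problem is more likely in how the build script fetches the file than in basic connectivity.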

pengtb commented 3 years ago

You could try the --use-ftp flag.

erinyoung commented 3 years ago

With the --use-wget flag? That doesn't seem to be an option.

SergeyBaikal commented 2 years ago

The --standard databases are not downloaded. After starting the command, I am returned straight to an empty prompt.

sergey@debian:~/KRAKEN2/KRAKEN2_DIR$ kraken2-build --standard --threads 4 --use-ftp --db /media/sf_Shared_Folder/Database/
Downloading nucleotide gb accession to taxon map...sergey@debian:~/KRAKEN2/KRAKEN2_DIR$