DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

error when downloading the kraken2 nt database #795

Open B1991ing opened 5 months ago

B1991ing commented 5 months ago

Dear kraken2 development team,

Do you have any suggestions for how we can download the kraken2 nt database?

When we try to download the kraken2 nt database, it fails with different errors, as shown in the screenshots.

Connection reset by peer (104) [screenshot]

Network is unreachable [screenshot]

Best,

Bing

nicolo-tellini commented 5 months ago

Hello @B1991ing,

You can try the --use-ftp option to download it. That said, in the past few days I have had problems downloading the bacteria libraries this way, so I had to download them separately.
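For reference, here is a rough sketch of what per-library downloads over FTP might look like. The database name (kraken2_db), the library list, and the DRY_RUN switch are assumptions for illustration; the sketch only prints the commands unless DRY_RUN=0 is set, and nt can be substituted for the library names.

```shell
#!/bin/sh
# Sketch: fetch the taxonomy and reference libraries one at a time over FTP.
# DBNAME and the library list are placeholders; adjust them to your setup.
# Commands are only printed unless DRY_RUN=0 is set.
DBNAME="${DBNAME:-kraken2_db}"
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

download_all() {
    run kraken2-build --download-taxonomy --db "$DBNAME" --use-ftp || return 1
    # One library per call, so a failure stops at that library
    # and the earlier downloads do not have to be repeated.
    for lib in archaea bacteria; do
        run kraken2-build --download-library "$lib" --db "$DBNAME" --use-ftp || return 1
    done
}
```

Calling download_all prints the three commands; running it with DRY_RUN=0 executes them in order.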

best

nic

B1991ing commented 5 months ago

Hi Nic,

Thank you very much for your suggestion. Our HPC administrator tried the --use-ftp option. However, after running for two hours, it shows only the screenshot below, with no result and no error. Do you have any idea what is happening?

[screenshot]

Best,

Bing

nicolo-tellini commented 5 months ago

Hello @B1991ing,

There is no error; it is probably still running. You can check whether kraken is running with htop or top. Downloading the nt database is a time-consuming task, and its completion time depends on several factors, including your internet connection, NCBI server responsiveness, your cluster or node's performance, and the number of threads used. According to @ckeeling's report in #615 a couple of years ago, the download took approximately 8 hours using 8 GB of memory and a single CPU core. Today, it might take even longer.
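Besides htop or top, another rough way to tell whether a silent download is still making progress is to watch whether the database directory keeps growing. This is only a sketch: the helper names dir_kb and watch_growth are made up for illustration, and the directory to watch is assumed to be the one passed to --db.

```shell
#!/bin/sh
# Sketch: report how much a directory grows over a given interval.

# Size of a directory in kilobytes (0 if it does not exist yet).
dir_kb() {
    if [ -d "$1" ]; then
        du -sk "$1" | awk '{print $1}'
    else
        echo 0
    fi
}

# Print the size of directory $1 before and after waiting $2 seconds.
watch_growth() {
    before=$(dir_kb "$1")
    sleep "$2"
    after=$(dir_kb "$1")
    echo "$1: $before KB -> $after KB (change: $((after - before)) KB in ${2}s)"
}
```

For example, watch_growth kraken2_db 60 compares the directory size one minute apart; a non-zero change suggests the download is still writing data, although a change of zero over a short interval does not prove that it has stalled.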

best

nic

B1991ing commented 5 months ago

Thank you very much, Nic. Downloading the kraken2 nt db is really difficult for us at the moment, so I have asked our HPC admin to first help us download the kraken2 bacteria and archaea dbs, to see what happens.

Best,

Bing