Open slambrechts opened 5 years ago
I have the same problem here...
(metawrap-env) bababaal@MEPHISTO:/DATA/metaWRAP_DATABASE/KRAKEN$ kraken-build --standard --threads 7 --db KRAKEN_DATABASE_2019-04-15 --work-on-disk
Found jellyfish v1.1.12
--2019-04-15 16:12:05-- ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/nucl_est.accession2taxid.gz
=> 'nucl_est.accession2taxid.gz'
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.11, 2607:f220:41e:250::7
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.11|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/taxonomy/accession2taxid ... done.
==> SIZE nucl_est.accession2taxid.gz ... done.
==> PASV ... done. ==> RETR nucl_est.accession2taxid.gz ...
No such file 'nucl_est.accession2taxid.gz'.
I know where this comes from: "The Nucleotide database will include EST and GSS sequences in early 2019." https://ncbiinsights.ncbi.nlm.nih.gov/2018/07/30/upcoming-changes-est-gss-databases/
NCBI retired the files needed by KRAKEN...
I also found this information: https://github.com/DerrickWood/kraken2/issues/101
Looks like the files were changed and the automatic Kraken DB pull no longer works. Sounds like a miscommunication between NCBI and Kraken. See issue https://github.com/DerrickWood/kraken/issues/132. Looks like they are working on fixing this. This cannot be fixed in metaWRAP, so the only thing I can do is wait for a fix from the Kraken team. My advice would be to try out Kraken2 in the meantime...
Actually, looks like kraken2 is suffering from the same issue...
Kraken and Kraken2's download scripts should work now. Let me know if you run into any more trouble!
Thanks Jen!
Hi, I've asked the NCBI helpdesk about this issue. Here is their reply:
The EST and GSS sequences have been subsumed into our Nucleotide (GenBank in this case) database; please see the note here: https://www.ncbi.nlm.nih.gov/nuccore/ You should get the nucl_gb* files to have the EST and GSS records.
Best, Axel
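For anyone patching their download step by hand, here is a minimal sketch of what the NCBI reply implies: skip the retired EST and GSS maps and fetch only the remaining ones. The base URL comes from the wget logs in this thread. The subsection list (`est`, `gb`, `gss`, `wgs`) and the helper name `accession_map_urls` are assumptions for illustration, not the actual contents of Kraken's download script.

```python
# Base URL as it appears in the wget logs in this thread.
BASE_URL = "ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid"

# Assumption: the subsections an older download script asks for.
LEGACY_SUBSECTIONS = ("est", "gb", "gss", "wgs")

# Per the NCBI reply, EST and GSS records are now in the nucl_gb* files.
RETIRED_SUBSECTIONS = frozenset({"est", "gss"})


def accession_map_urls(subsections=LEGACY_SUBSECTIONS, retired=RETIRED_SUBSECTIONS):
    """Return download URLs for the subsections that have not been retired."""
    return [
        f"{BASE_URL}/nucl_{name}.accession2taxid.gz"
        for name in subsections
        if name not in retired
    ]


if __name__ == "__main__":
    # Print one URL per line, e.g. to feed into wget.
    for url in accession_map_urls():
        print(url)
```

This only builds the list of URLs; it does not download anything or touch an existing Kraken installation.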
Hey @jenniferlu717, I have some users asking what they should do to get this to work. Updating Kraken doesn't seem to work.
It needs to be updated and reinstalled (rerun sh install_kraken)
I see. Any chance you could update the bioconda recipe? That is the version all these people use.
I am actually not the one in charge of the bioconda recipe, but I will contact the person who can make that change.
Thank you!
Hi, this is still an issue. I have tried both the conda install and installing from the git repo.
kraken-build --standard --threads 24 --db std_kraken _db
Found jellyfish v1.1.12
--2021-08-03 12:53:38-- ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/nucl_est.accession2taxid.gz
=> 'nucl_est.accession2taxid.gz'
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 165.112.9.230, 130.14.250.7, 2607:f220:41f:250::230, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|165.112.9.230|:21... failed: Connection timed out.
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.7|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/taxonomy/accession2taxid ... done.
==> SIZE nucl_est.accession2taxid.gz ... done.
==> PASV ... done. ==> RETR nucl_est.accession2taxid.gz ...
No such file 'nucl_est.accession2taxid.gz'.
Hi, I am also trying to update a Kraken database and ran
kraken-build --download-taxonomy --db $DBNAME
and got a similar error.
--2022-08-03 15:28:00-- ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/nucl_est.accession2taxid.gz
=> 'nucl_est.accession2taxid.gz'
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.12, 130.14.250.11, 2607:f220:41e:250::12, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.12|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /pub/taxonomy/accession2taxid ... done.
==> SIZE nucl_est.accession2taxid.gz ... done.
==> PASV ... done. ==> RETR nucl_est.accession2taxid.gz ...
No such file 'nucl_est.accession2taxid.gz'.
Hi Ursky,
I would like to point out that it currently seems impossible to build the Kraken database. It seems the accession2taxid files (for example nucl_est.accession2taxid.gz) are no longer on the NCBI FTP server, or have been moved to a different location, as of a few days ago. See: Index of /pub/taxonomy/accession2taxid
So when I run:
kraken-build --download-taxonomy --threads 4 --db /mnt/e/Sam/KRAKEN/DATABASE/
I get:
Putting this out here in case somebody else experiences the same problem
Feel free to remove this issue if you think this doesn't belong here.
Or if you think there might be a solution, feel free to let us know :)
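A quick way to confirm which files have disappeared is to compare the names a download script expects against a saved copy of the "Index of /pub/taxonomy/accession2taxid" page. The sketch below does that on plain text. The function names and the sample listing are hypothetical, and it assumes the index page mentions each file by its full name.

```python
import re

# Matches names such as nucl_gb.accession2taxid.gz, but not their
# checksum companions (e.g. a name followed by .md5).
_FILE_PATTERN = re.compile(r"[\w.]+\.accession2taxid\.gz(?!\.\w)")


def listed_files(index_text):
    """Return the set of accession2taxid archives named in a directory listing."""
    return set(_FILE_PATTERN.findall(index_text))


def missing_files(index_text, expected):
    """Return the expected file names that the listing does not mention."""
    present = listed_files(index_text)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Hypothetical excerpt of a saved index page, for illustration only.
    sample_listing = (
        '<a href="nucl_gb.accession2taxid.gz">nucl_gb.accession2taxid.gz</a>\n'
        '<a href="nucl_wgs.accession2taxid.gz">nucl_wgs.accession2taxid.gz</a>\n'
    )
    wanted = ["nucl_est.accession2taxid.gz", "nucl_gb.accession2taxid.gz"]
    print(missing_files(sample_listing, wanted))
```

Any name this reports as missing would trigger the same "No such file" error from wget that is shown in this thread.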