DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

kraken2-build --build (self-defined) how much time??? #706

Open GaoXY99 opened 1 year ago

GaoXY99 commented 1 year ago

Hello everyone.

I am trying to build a custom Kraken 2 database (archaea, bacteria, and fungi), but the "kraken2-build --build" step has been stalled for days.

My commands are as follows:

kraken2-build --download-taxonomy --db ./kraken2DB
kraken2-build --download-library archaea --threads 30 --db ./kraken2DB
kraken2-build --download-library bacteria --threads 30 --db ./kraken2DB
kraken2-build --download-library fungi --threads 30 --db ./kraken2DB

These steps completed correctly and the expected files were generated.

kraken2-build --build --threads 30 --db ./kraken2DB

However, this step has already been running for over five days. In the kraken2DB folder, it generated only the "seqid2taxid.map" and "taxo.k2d.tmp" files five days ago, and nothing has been updated since then.

I wonder if something is wrong. How can I speed up "kraken2-build --build"? Thank you very much!
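
One way to check whether the build is still writing anything is to measure how long ago the newest file in the database directory was modified. This is only a rough sketch, assuming GNU find (for -printf); the function name db_idle_seconds is hypothetical and not part of Kraken 2:

```shell
# Print the number of seconds since the most recently modified regular file
# directly inside the given directory was last written.
# Assumption: GNU find is available (-printf is a GNU extension).
# db_idle_seconds is a hypothetical helper, not a Kraken 2 command.
db_idle_seconds() {
    local db_dir="$1"
    local newest now
    # %T@ prints each file's modification time as seconds since the epoch.
    newest=$(find "$db_dir" -maxdepth 1 -type f -printf '%T@\n' | sort -n | tail -n 1)
    if [ -z "$newest" ]; then
        echo "no files found in $db_dir" >&2
        return 1
    fi
    now=$(date +%s)
    # Drop the fractional part of the timestamp before subtracting.
    echo $(( now - ${newest%.*} ))
}
```

For example, `db_idle_seconds ./kraken2DB` prints the idle time in seconds. A large value on its own may not prove the build is stuck, since a process can be busy in memory without touching files, so it is worth also checking CPU usage of the process with a tool such as top.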

Iris9310 commented 1 year ago

Has it finished? How long did it take?

hkaspersen commented 10 months ago

Hello! I am having the same issue when trying to build the full kraken2 database after downloading all the libraries. I gave the job 200 hours, 24 CPUs and 800G of memory (which was more than what it required), but it still ran out of time. Is it getting stuck somewhere? This is the output before it stops:

Creating sequence ID to taxonomy ID map (step 1)...
Found 101641323/101953351 targets, searched through 976064068 accession IDs, search complete.
lookup_accession_numbers: 312028/101953351 accession numbers remain unmapped, see unmapped.txt in DB directory
Sequence ID to taxonomy ID map complete. [16m10.528s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 744150203244 bytes
Capacity estimation complete. [17h52m19.763s]
Building database files (step 3)...
Taxonomy parsed and converted.
CHT created with 22 bits reserved for taxid.

I am using Kraken version 2.1.2.
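
To compare the estimated hash table requirement from step 2 with the memory given to the job, the byte count in the log can be converted to GiB. A small sketch using awk; bytes_to_gib is a hypothetical helper name:

```shell
# Convert a byte count to GiB (1 GiB = 1024^3 bytes), rounded to one decimal.
# bytes_to_gib is a hypothetical helper, not part of Kraken 2.
bytes_to_gib() {
    awk -v bytes="$1" 'BEGIN { printf "%.1f\n", bytes / (1024 * 1024 * 1024) }'
}

# Value taken from the "Estimated hash table requirement" line in the log above.
bytes_to_gib 744150203244   # prints 693.0
```

That is roughly 693 GiB (about 744 GB in decimal units), which is consistent with 800G being more than the estimate.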