DerrickWood / kraken2

The second version of the Kraken taxonomic sequence classification system
MIT License

NR Database compiling stalling #534

Open Yvain-Desplat opened 2 years ago

Yvain-Desplat commented 2 years ago

Hi,

I am trying to create an NR database, but I am running into an issue at the build step.

I have run the following:

kraken2-build --download-taxonomy --protein --db nr (this one worked properly)

kraken2-build --protein --download-library nr --db nr (this one worked properly)

kraken2-build --build --threads 50 --protein --db nr (but here this gets hung up and stalls, see below)

Creating sequence ID to taxonomy ID map (step 1)...
Found 515595989/766638514 targets, searched through 999540223 accession IDs, search complete.
lookup_accession_numbers: 251042525/766638514 accession numbers remain unmapped, see unmapped.txt in DB directory
Sequence ID to taxonomy ID map complete. [3h20m59.066s]
Estimating required capacity (step 2)...
Estimated hash table requirement: 94779988844 bytes
Capacity estimation complete. [13m16.479s]
Building database files (step 3)...
Taxonomy parsed and converted.
CHT created with 21 bits reserved for taxid.
Processed 8530885 sequences (3369837587 aa)...
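
To put the figures in that output in perspective, here is a rough Python sketch of the arithmetic. All values are copied from the log above; the variable names are mine, and the progress figure assumes that the "targets" total in step 1 equals the number of sequences step 3 has to process, which I have not confirmed.

```python
# Figures copied verbatim from the kraken2-build output above.
found_targets = 515_595_989       # accession IDs that were mapped to a taxid
total_targets = 766_638_514       # all sequence IDs in the library
unmapped_targets = 251_042_525    # reported as "remain unmapped"
hash_table_bytes = 94_779_988_844 # "Estimated hash table requirement"
processed_sequences = 8_530_885   # last "Processed ... sequences" value seen

# Sanity check: mapped + unmapped should add up to the total.
assert found_targets + unmapped_targets == total_targets

# Share of the library that received a taxonomy ID in step 1.
mapped_fraction = found_targets / total_targets

# Hash table size in GiB (2**30 bytes), to compare against available RAM.
hash_table_gib = hash_table_bytes / 2**30

# Rough progress through step 3 when the output stopped updating.
# Assumption: step 3 iterates over all `total_targets` sequences.
progress_fraction = processed_sequences / total_targets

print(f"Mapped in step 1:      {mapped_fraction:.1%}")
print(f"Hash table estimate:   {hash_table_gib:.1f} GiB")
print(f"Step 3 progress (est): {progress_fraction:.2%}")
```

So roughly a third of the accessions were left unmapped, the estimated hash table is close to 90 GiB, and the build had only worked through a small fraction of the library when the output stopped.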

Any idea why it is stalling like this?

blaizereal commented 2 years ago

Hi, I have the same issue :( Still waiting...


Bests: Blaize