fbreitwieser / krakenuniq

🐙 KrakenUniq: Metagenomics classifier with unique k-mer counting for more specific results
GNU General Public License v3.0

classify: unable to mmap database.kdb: Cannot allocate memory #43

Open quetjaune opened 5 years ago

quetjaune commented 5 years ago

Hi! Thanks for krakenuniq! I am having memory issues when trying to classify. This is the error I get:

classify: unable to mmap database.kdb: Cannot allocate memory

I am running on a cluster with 4 nodes, each with 128 GB of RAM, using SLURM as the workload manager. The database is NT (size 234 GB). Strangely, I could successfully process one of my samples (a fastq with 8 million sequences), but when I try another one (with a similar number of sequences) I always get the message shown above. Any help is very welcome! Thanks, Marcos
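For anyone hitting the same message, a quick sanity check is to compare the size of database.kdb with the memory the job can actually see. The sketch below is only a diagnostic aid, not part of krakenuniq: the helper names (`read_mem_available_bytes`, `diagnose`) are made up for illustration, and it assumes a Linux node with /proc/meminfo. Whether either condition is the real cause of the mmap failure here is an assumption, since a file-backed mapping can succeed even when the file is larger than RAM.

```python
import os
import resource

GIB = 1024 ** 3


def read_mem_available_bytes(meminfo_path="/proc/meminfo"):
    """Return MemAvailable from a Linux meminfo file in bytes, or None if unavailable."""
    try:
        with open(meminfo_path) as fh:
            for line in fh:
                if line.startswith("MemAvailable:"):
                    # The kernel reports this value in kB.
                    return int(line.split()[1]) * 1024
    except OSError:
        pass
    return None


def diagnose(db_bytes, mem_available_bytes, address_space_limit):
    """Return a list of warnings (hypothetical helper); empty if nothing obvious stands out."""
    warnings = []
    # An address-space limit (ulimit -v) smaller than the file makes mmap fail outright.
    if address_space_limit != resource.RLIM_INFINITY and db_bytes > address_space_limit:
        warnings.append(
            "database (%.1f GiB) exceeds the address-space limit (%.1f GiB)"
            % (db_bytes / GIB, address_space_limit / GIB)
        )
    # A database larger than available RAM cannot be held fully in memory.
    if mem_available_bytes is not None and db_bytes > mem_available_bytes:
        warnings.append(
            "database (%.1f GiB) exceeds available memory (%.1f GiB)"
            % (db_bytes / GIB, mem_available_bytes / GIB)
        )
    return warnings


if __name__ == "__main__":
    db_path = "database.kdb"  # adjust to the real database location
    if os.path.exists(db_path):
        soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_AS)
        found = diagnose(os.path.getsize(db_path), read_mem_available_bytes(), soft_limit)
        for message in found:
            print("WARNING:", message)
        if not found:
            print("No obvious size or limit mismatch found.")
    else:
        print("Database file not found:", db_path)
```

Run it inside the same SLURM job (or an interactive allocation on the same node) as the classify step, because limits set by the scheduler may differ from those of a login shell. With the numbers from this report, a 234 GB database on a 128 GB node would trigger the second warning.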

fbreitwieser commented 5 years ago

Hi @quetjaune, can you try with the latest version, v0.5.8? If the error persists, can you send me the exact command line?

Thanks, Florian

jing0703 commented 5 years ago

Hi, I got the same problem even after reducing the hash size:

krakenuniq-build --db taxonomy --standard --jellyfish-hash-size 200M

I am using krakenuniq (https://github.com/fbreitwieser/krakenuniq) because 'nucl_est.accession2taxid.gz' is not available for kraken-build.

This is the error:

Downloading assembly summary file for bacteria genomes, and filtering to assembly level Complete_Genome.
WARNING: taxonomy/library/bacteria already exists - potentially overwriting files.
Downloading bacteria genomes: 14061/14061 ... Found 14061 files. Skipped download of 14061 files that already existed.
Downloading assembly summary file for viral genomes, and filtering to assembly level Any.
WARNING: taxonomy/library/viral already exists - potentially overwriting files.
Downloading viral genomes: 9330/9330 ... Found 9330 files. Skipped download of 9330 files that already existed.
Downloading viral neighbors.
taxonomy/taxonomy/nucl_gb.accession2taxid.gz check [1.79 GB]
taxonomy/taxonomy/nucl_gb.accession2taxid.sorted check [3.69 GB]
Reading names file ...
Downloading 145834 sequences into taxonomy/library/viral/Neighbors.
query_key=1&webenv=NCID_1_85049056_130.14.22.33_9001_1565739192_1146505336_0MetA0_S_MegaStore
Downloading sequences 1 to 10000 of 145834 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 244 0 244 0 0 55 0 --:--:-- 0:00:04 --:--:-- 76
done
Downloading sequences 10001 to 20000 of 145834 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 244 0 244 0 0 57 0 --:--:-- 0:00:04 --:--:-- 57
done
Downloading sequences 30001 to 40000 of 145834 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 244 0 244 0 0 55 0 --:--:-- 0:00:04 --:--:-- 55
done
Downloading sequences 70001 to 80000 of 145834 ...
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 244 0 244 0 0 52 0 --:--:-- 0:00:04 --:--:-- 55
done
Found jellyfish v1.1.12
Kraken build set to minimize disk writes.
Found 63690 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Skipping step 1, k-mer set already exists.
Skipping step 2, no database reduction requested.
Sorting k-mer set (step 3 of 6)...
db_sort: Getting database into memory ...
db_sort: unable to mmap database.jdb: Cannot allocate memory

Any idea how to fix this? Thanks very much for your help! Jing