Open EricDeveaud opened 5 years ago
Hi Eric, thanks for trying KrakenUniq. Unfortunately KrakenUniq requires a lot of memory, ideally 64-256 GB depending on the database. I will check the --work-on-disk parameter, which in principle should also work with less memory.
Hi @fbreitwieser
I'm having a similar problem while trying to build a database using 32 threads on a machine with 128 GB of RAM. It looks like Jellyfish is trying to allocate 350 GB of memory. Is this expected, even though I'm using the --work-on-disk option?
Does the memory requirement also depend on the number of threads used, or only on the number of sequence files from the database?
Here are the commands that I'm running and the krakenuniq-build error:
$ krakenuniq-download --db non-fungal-contaminants --threads 32 --dust refseq/bacteria refseq/archaea refseq/protozoa refseq/vertebrate_mammalian/Chromosome/species_taxid=9606 refseq/viral/Any viral-neighbors
[...]
$ krakenuniq-build --db non-fungal-contaminants --kmer-len 31 --threads 32 --taxids-for-genomes --taxids-for-sequences --work-on-disk
Found jellyfish v1.1.12
Kraken build set to minimize RAM usage.
Found 62084 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Creating k-mer set (step 1 of 6)...
Using jellyfish
Hash size not specified, using '66895342245'
terminate called after throwing an instance of 'jellyfish::invertible_hash::ErrorAllocation'
what(): Failed to allocate 353414451816 bytes of memory
/export/home/ncit/external/a.mizeranschi/utils_conda/libexec/build_db.sh: line 46: 25057 Aborted (core dumped) jellyfish count -m 31 -s 66895342245 -C -t 32 -o database /dev/fd/63
xargs: cat: terminated by signal 13
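For reference, the figures in the log above can be checked with a few lines of arithmetic. This is only a sketch: it uses the hash size and allocation size printed by jellyfish and the 128 GB of RAM mentioned in the post, the helper name `bytes_to_gb` is made up for illustration, and it does not model how jellyfish computes its allocation internally.

```python
# Values copied from the krakenuniq-build log above.
HASH_SIZE = 66_895_342_245            # "Hash size not specified, using ..."
FAILED_ALLOC_BYTES = 353_414_451_816  # "Failed to allocate ... bytes of memory"
MACHINE_RAM_GB = 128                  # RAM of the machine, as stated in the post


def bytes_to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return n_bytes / 1e9


requested_gb = bytes_to_gb(FAILED_ALLOC_BYTES)
bytes_per_entry = FAILED_ALLOC_BYTES / HASH_SIZE

print(f"requested by jellyfish: {requested_gb:.1f} GB")
print(f"available on machine:   {MACHINE_RAM_GB} GB")
print(f"requested / available:  {requested_gb / MACHINE_RAM_GB:.1f}x")
print(f"bytes per requested hash entry: {bytes_per_entry:.2f}")
```

By this arithmetic the single failed request is larger than the machine's RAM, and the log shows the hash size was chosen automatically ("Hash size not specified") rather than passed on the command line.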
Hello @fbreitwieser
I have been trying to figure out this error for over a month now. My cluster has no memory issues: I am using 10 nodes, 280 cores, and 1680 GB of memory in total.
I am trying to build a refseq/bacterial and custom viral database with krakenuniq-build:
$ krakenuniq-build --jellyfish-hash-siz 15M --threads 20 --db DATABASES/virbac5
I also tried the --work-on-disk option as well as hash sizes ranging from 200M to 2M, but it always ends up exiting, citing memory issues.
Found jellyfish v1.1.11
Kraken build set to minimize disk writes.
Found 2 sequence files (*.{fna,fa,ffn,fasta,fsa}) in the library directory.
Skipping step 1, k-mer set already exists.
Skipping step 2, no database reduction requested.
Sorting k-mer set (step 3 of 6)...
db_sort: Getting database into memory ...
db_sort: unable to mmap database.jdb: Cannot allocate memory
Execution terminated
Exit_status=271
resources_used.cpupercent=566
resources_used.cput=21:25:11
resources_used.mem=153956244kb
resources_used.ncpus=280
resources_used.vmem=210643704kb
resources_used.walltime=04:36:49
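The resource summary above is easier to read once the memory figures are converted. The following is a minimal sketch that parses the key=value pairs from the job report and converts the `kb` values to GiB. The function names `parse_report` and `kb_to_gib` are hypothetical, and treating `kb` as 1024 bytes is an assumption about the scheduler's units.

```python
import re

# Resource summary copied from the job report above.
REPORT = (
    "Exit_status=271 resources_used.cpupercent=566 "
    "resources_used.cput=21:25:11 resources_used.mem=153956244kb "
    "resources_used.ncpus=280 resources_used.vmem=210643704kb "
    "resources_used.walltime=04:36:49"
)


def parse_report(text: str) -> dict:
    """Split whitespace-separated key=value pairs into a dict of strings."""
    return dict(re.findall(r"(\S+)=(\S+)", text))


def kb_to_gib(value: str) -> float:
    """Convert a value such as '153956244kb' to GiB (assumes kb = 1024 bytes)."""
    if not value.endswith("kb"):
        raise ValueError(f"unexpected unit in {value!r}")
    return int(value[:-2]) / 1024**2


fields = parse_report(REPORT)
mem_gib = kb_to_gib(fields["resources_used.mem"])
vmem_gib = kb_to_gib(fields["resources_used.vmem"])

print(f"exit status:  {fields['Exit_status']}")
print(f"mem used:     {mem_gib:.1f} GiB")
print(f"vmem used:    {vmem_gib:.1f} GiB")
# Assumption: the 1680 GB quoted in the post is split evenly over the 10 nodes.
print(f"per-node RAM: {1680 / 10:.0f} GB (if evenly split)")
```

Whether the sorting step can use memory from more than one node is not stated in the post, so the per-node figure is shown only for comparison.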
Help would be highly appreciated.
Thanks Zaidi
Any progress?
Hello,
I just installed krakenuniq v0.5.3 and tried to build an archaea database, but I ran into the following problem.
After downloading the taxonomy and refseq/archaea data, krakenuniq-build exits with an error.
dbsort is killed even with the --work-on-disk option passed to the krakenuniq-build command, see
and in /var/log/messages we have
NB: running tests on a computer with 16 GB of RAM.
I have been requested to install krakenuniq on our cluster, and I'm wondering how I can handle DB creation for our users. Can you provide some insights on how to deal with this problem?
best regards
Eric