As mentioned in #86, `kraken2krona` must sometimes be run on a full node because it might run out of memory. Being a single-core job, this wastes resources and causes longer queue times.

The issue seems to be that the filtering step, `krakenuniq2krona.R`, loads the full `sequences.krakenuniq` file into memory, and from what I can see this file can be several GB in size. So I have rewritten the script to read and process the file line by line instead (in Python, because I am more confident with that language).

I don't know whether the following command (`ktImportTaxonomy`) also uses a lot of RAM, and we can't do much about that, but the filtering step should be the most memory-intensive part.
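For illustration, the streaming approach could look roughly like the sketch below. It keeps memory usage constant by never holding more than one line at a time. Note that the function name, the column layout (first field as the classified/unclassified flag), and the filter criterion are assumptions for the sketch, not the actual logic of the rewritten script.

```python
def filter_krakenuniq(in_path, out_path):
    """Filter a krakenuniq read-level output file line by line.

    Instead of loading the whole file into memory (as the R script did),
    iterate over the input and write matching lines straight to the output,
    so memory use stays constant regardless of file size.
    """
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            # Hypothetical criterion: keep only classified reads,
            # assuming the first column holds a 'C'/'U' flag.
            if fields and fields[0] == "C":
                fout.write(line)
```

Since Python file objects are iterators over lines, this needs no explicit buffering; the operating system's read-ahead handles throughput, and peak memory stays at roughly one line's worth of text.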