Closed Kxpark closed 1 year ago
Hi @Kxpark,
This seems to be a memory issue. You are most likely giving this step more memory than you actually have. You might want to reconsider the number of threads and the amount of memory you are using. kmc shouldn't need too much memory. Are you running this in a cluster environment or on a local computer?
Best, Kivanc
I am running on a local computer. I thought I was being conservative with my allocation of cores, but I can try it even lower.
Even with 1 core, it still seems to have an issue. I have plenty of RAM, so that shouldn't be the limit for this step.
You can also specify how much memory you want kmc to use with the -m parameter. You can do that by adding this parameter to the kmc extra params section of the config file:
@Kxpark Can you also share the content of one of the kmc_canon.log files to see if there is more information there?
Let me know if I am doing it wrong.
I appreciate the help.
Here is one; however, most of them are empty.
K-Mer Counter (KMC) ver. 3.1.1 (2019-05-19)
Usage:
kmc [options]
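Since most of the log files are reported as empty, a small script can pick out the ones that actually contain output. This is only a sketch: it assumes the logs live under logs/count_kmers/kmc/<sample>/kmc_canon.log, as in the path quoted elsewhere in this thread, and the helper name nonempty_logs is made up for illustration.

```python
from pathlib import Path


def nonempty_logs(root, name="kmc_canon.log"):
    """Return (path, size_in_bytes) for every non-empty file called `name` under `root`."""
    found = []
    for path in sorted(Path(root).rglob(name)):
        size = path.stat().st_size
        if size > 0:
            found.append((path, size))
    return found


if __name__ == "__main__":
    # Assumed log directory; adjust to wherever the pipeline writes its logs.
    for path, size in nonempty_logs("logs/count_kmers/kmc"):
        print(f"{path}\t{size} bytes")
```

Only the samples whose log has content are listed, which narrows down which files are worth sharing.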
kmc expects the -m parameter to be in GB, so you need to give the value in GB with the -m parameter:
-m - max amount of RAM in GB (from 1 to 1024); default: 12
I see, after changing it to "-m 2" I got this output
Can you share the content of logs/count_kmers/kmc/Syn276/kmc_canon.log?
K-Mer Counter (KMC) ver. 3.1.1 (2019-05-19)
Usage:
kmc [options]
Can you try extra: "-m2" instead of "-m 2"?
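The two constraints mentioned so far (a value in GB between 1 and 1024, written as one token without a space) can be sketched as a small helper. The function name kmc_memory_flag is hypothetical and not part of kmc or the pipeline; it only formats the string that would go into the extra params entry.

```python
def kmc_memory_flag(gigabytes):
    """Format KMC's -m option as a single token such as "-m2".

    The 1 to 1024 range comes from the usage text quoted in this thread.
    """
    if isinstance(gigabytes, bool) or not isinstance(gigabytes, int):
        raise TypeError("memory must be a whole number of GB")
    if not 1 <= gigabytes <= 1024:
        raise ValueError("KMC accepts -m values from 1 to 1024 GB")
    return f"-m{gigabytes}"


if __name__ == "__main__":
    print(kmc_memory_flag(2))  # -m2
```

A value such as 2000, which would be plausible if the number were meant as MB, is rejected here instead of being passed on to kmc.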
Here I tried "-m1" and I am running it single core
I am continuing to get the same "Segmentation fault" even if I increase my cores to 8 and leave the "-m2" as is.
I am open to creative ideas if you have any.
I should have plenty of resources for this step.
kmc will most likely need more than "-m1", depending on your data. How big is your data? Below is an example from the kmc developers to give you an idea of how much memory you might need:
In our experiments, KMC was able to count 28-mers in human genome sequencing data of gzipped size 614GB (>736 Gbases) in only 33GB of RAM.
https://github.com/refresh-bio/KMC/issues/174#issuecomment-962220455
Does it look at each individual alone or at the whole population? Either way, each individual is a few MB and the whole population of raw reads is maybe 120 GB.
I even increased it to "-m20" and am still getting similar results. I have 21 GB of free RAM, so it shouldn't be a problem.
I FIXED IT!
In /workflow/envs/kmc.yaml I changed the dependency from kmc 3.1.1 to 3.2.1, the most recent release, and it is running now.
Thank you for the help troubleshooting
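Because the fix here was a version change, it can help to confirm which kmc version actually produced a given log. The sketch below parses the banner line that appears in the logs pasted above. The helper name kmc_version is made up for illustration, and the comparison against 3.2.1 reflects only what worked in this thread, not a documented minimum version.

```python
import re

# Matches the banner printed at the top of the logs, e.g.
# "K-Mer Counter (KMC) ver. 3.1.1 (2019-05-19)"
BANNER = re.compile(r"K-Mer Counter \(KMC\) ver\. (\d+)\.(\d+)\.(\d+)")


def kmc_version(log_text):
    """Return the KMC version found in `log_text` as a tuple of ints, or None."""
    match = BANNER.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    banner = "K-Mer Counter (KMC) ver. 3.1.1 (2019-05-19)"
    print(kmc_version(banner))              # (3, 1, 1)
    print(kmc_version(banner) < (3, 2, 1))  # True
```

An empty log yields None, so a caller can tell "no banner" apart from "old version".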
I'm glad it is running now. It is weird that I have never had a problem with kmc 3.1.1 before. I will look into this and maybe change the default version to 3.2.1. Thanks again for reporting.
Best, Kivanc
Howdy, I was able to have the pipeline run all of the quality control, but when it gets to k-mer counting it fails. I am using the new version that you released yesterday, V1.1.
Thank you for the help!