Closed roddypr closed 7 months ago
Hi Roddy,
It looks like you've managed to fix the issue on your own.
Indeed, there is one global memory resource setting defined as `__default__`, and then rule-specific memory declarations.
The rule-specific declarations override the default setting for the listed rules; all other rules consume the default memory amount.
If a rule fails due to hitting a memory limit, you can add a declaration for that rule to your cluster_config.yaml and assign a value.
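To make the two levels concrete, here is a hypothetical excerpt of such a cluster config (the rule name and the 6G value come from this thread; the `__default__` value shown is only an assumption, check the file in your output directory for the real one):

```yaml
# Hypothetical cluster_config.yaml excerpt: a global default
# plus a per-rule override.
__default__:
    memory: 1G          # used by every rule without its own entry

plotCorr_bed_spearman:
    memory: 6G          # overrides __default__ for this rule only
```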
I'd recommend copying the cluster_config.yaml from your output directory, modifying the copy, and then passing it to the snakePipes workflow with `--clusterConfigFile`.
Increasing the default memory as you did will also work, but you will end up reserving more resources for every rule that uses the default memory value in your cluster environment.
To be honest, this job state report is a bit confusing: there is an out_of_memory state listed, but the exit code is 0. Also, the reported memory usage looks well under the 1 GB assigned to the job. But if assigning 6 GB as the default fixed it, then memory might indeed have been the issue.
Hope this helps,
Best wishes,
Katarzyna
Dear Katarzyna,
Thank you for your very clear explanation. It helps a lot because I would like to configure snakePipes to be used across different people in my department.
Just to make sure I understand: in my case I should add the following to a local cluster config file (running with `--clusterConfigFile`)?

```yaml
plotCorr_bed_spearman:
    memory: 6G
```
The job state report is indeed a bit confusing. My colleagues mentioned that there are sometimes brief peaks of memory use that seff does not measure accurately, which could explain the discrepancy (in fact I will try running the pipeline with 2G for this step, which is probably enough).
Best wishes,
Roddy
Dear Roddy,
I'd have thought that you might be facing a one-off issue with a particularly large dataset, e.g. a very high number of samples. In that case, rerunning the workflow with a custom cluster config file would be good enough.
If you think this is instead going to be a recurrent issue, you might consider configuring your snakePipes installation with `snakePipes config --clusterConfig` and passing a custom "shared" cluster config file. You can see the default "shared" cluster config here.
For the rule `plotCorr_bed_spearman`, it should be alright to put it in this "shared" cluster config. A couple of other deepTools-based rules are also defined there.
The cluster config that you see in your workflow output folder is a merge between the "shared" and the workflow-specific cluster configs. The merge is done at runtime.
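As an illustration of that merge (hypothetical values; the real files live in your snakePipes installation and your workflow output folder):

```yaml
# Merged cluster config as it could appear in the output folder.
# Rule-specific entries from the workflow config sit alongside
# the shared defaults:
__default__:
    memory: 1G              # from the "shared" cluster config
plotCorr_bed_spearman:
    memory: 6G              # from the workflow-specific config
```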
Let me know if you have any other questions,
Best wishes,
Katarzyna
Hi
I am using snakePipes for the first time and I am a bit lost. The mRNAseq pipeline kept on failing at the plotCorr_bed_spearman rule. I've pasted some of the error messages below, but to summarise, this step was running out of memory. I managed to fix it by changing the default memory in cluster.yaml from:
to
Is this the correct thing to do?
I could not find a per-rule memory setting for plotCorr_bed_spearman anywhere. The following lines in cluster.yaml seem not to affect the memory of this specific step:
Sorry if this is not really an issue but just me misunderstanding how to configure snakePipes. What should I be doing to configure this step correctly?
Thank you so much for your help.
Best wishes,
Roddy
Error in log file:
The rule was running out of memory (probably a very fast spike in memory use, since the reported Memory Efficiency is quite low):