moudixtc closed this issue 7 years ago
The canu job is running out of memory. Usually h_vmem is not the correct parameter because it is not consumable and also isn't scaled by threads. Since canu divides the memory by the threads requested, it would be under-requesting the memory it needs. You can check your configuration using the `qconf -sc` command:
#name      shortcut   type     relop  requestable  consumable  default  urgency
#------------------------------------------------------------------------------
h_vmem     h_vmem     MEMORY   <=     YES          NO          0        0
mem_free   mf         MEMORY   <=     YES          YES         0        0
If you have mem_free (or another memory option that is both requestable and consumable), I would use that for MEMORY instead of h_vmem.
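To make the under-requesting concrete, here is a small sketch of the arithmetic. The 16 GB and 4-thread figures are illustrative assumptions, not values taken from this run, and `per_slot_gb` is a hypothetical helper, not part of canu.

```shell
# Per-slot memory request when a job's total memory is divided by its threads.
# $1 = total memory in GB, $2 = thread count (both illustrative integers).
per_slot_gb() {
  echo $(( $1 / $2 ))
}

# A job needing 16 GB in total with 4 threads asks for 4 GB per slot.
per_slot_gb 16 4
```

If the resource is not scaled by threads, as described above for h_vmem, the job would be held to the per-slot figure rather than the total it actually needs.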
Thank you for the quick reply. It seems like it doesn't have any consumable memory though...
ubuntu@ip-10-30-3-219:/shared/F_vert2_auto$ qconf -sc | grep MEMORY
h_core h_core MEMORY <= YES NO 0 0
h_data h_data MEMORY <= YES NO 0 0
h_fsize h_fsize MEMORY <= YES NO 0 0
h_rss h_rss MEMORY <= YES NO 0 0
h_stack h_stack MEMORY <= YES NO 0 0
h_vmem h_vmem MEMORY <= YES NO 0 0
mem_free mf MEMORY <= YES NO 0 0
mem_total mt MEMORY <= YES NO 0 0
mem_used mu MEMORY >= YES NO 0 0
s_core s_core MEMORY <= YES NO 0 0
s_data s_data MEMORY <= YES NO 0 0
s_fsize s_fsize MEMORY <= YES NO 0 0
s_rss s_rss MEMORY <= YES NO 0 0
s_stack s_stack MEMORY <= YES NO 0 0
s_vmem s_vmem MEMORY <= YES NO 0 0
swap_free sf MEMORY <= YES NO 0 0
swap_rate sr MEMORY >= YES NO 0 0
swap_rsvd srsv MEMORY >= YES NO 0 0
swap_total st MEMORY <= YES NO 0 0
swap_used su MEMORY >= YES NO 0 0
virtual_free vf MEMORY <= YES NO 0 0
virtual_total vt MEMORY <= YES NO 0 0
virtual_used vu MEMORY >= YES NO 0 0
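The same check can be scripted. This is a minimal sketch that assumes the eight-column layout shown in this thread (name, shortcut, type, relop, requestable, consumable, default, urgency); `consumable_memory` is a hypothetical helper name.

```shell
# List MEMORY complexes that are both requestable and consumable.
# Reads `qconf -sc` style text on stdin and prints matching complex names.
consumable_memory() {
  awk '$1 !~ /^#/ && $3 == "MEMORY" && $5 == "YES" && $6 == "YES" { print $1 }'
}

# Typical use on a submit host (requires SGE's qconf to be installed):
#   qconf -sc | consumable_memory
```

For a listing like the one above, where every consumable column reads NO, the function prints nothing.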
Looking through the cfncluster docs, they support slurm. Canu works quite well with slurm. Can you use that?
I'm not at all familiar with cfncluster. A little searching hints that some people are using 'post_install' to further tune the SGE configuration. It's fairly easy to add memory tracking, but I'd need to dig out my notes to remember how.
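A rough sketch of what adding memory tracking might look like, assuming the stock SGE `qconf` options (`-sc`, `-Mc`, `-rattr`) behave on a cfncluster master as they do on other SGE installs. The helper name `make_mem_free_consumable`, the host name `node1`, and the `16G` capacity are placeholders, not values from this thread, and none of this has been tested on cfncluster.

```shell
# Rewrite a `qconf -sc` listing so that mem_free is marked consumable.
# Only the consumable column (field 6) of the mem_free line is changed.
make_mem_free_consumable() {
  awk '$1 == "mem_free" { $6 = "YES" } { print }'
}

# Possible use on the SGE master (assumption, adjust for your cluster):
#   qconf -sc | make_mem_free_consumable > complexes.txt
#   qconf -Mc complexes.txt
#   # then declare how much memory each execution host offers, e.g.:
#   qconf -rattr exechost complex_values mem_free=16G node1
```

Once a consumable memory resource exists, it can be passed to canu through gridEngineMemoryOption in place of h_vmem.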
The final option is to configure canu to change the minimum memory needed for specific components. The problem seems to be jobs getting scheduled on the smaller node, so merylMemory=16g would prevent this. I'd also suggest ovlThreads=4 to keep overlapper off that node too.
Or, I suspect just getting rid of that smaller node would solve the problem too.
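Put together, the canu-side workaround might look like the command below. It is the command from the original report with the two suggested settings added; whether 16g and 4 are the right values depends on the node sizes in your cluster. The `canu_command` wrapper is hypothetical and only prints the command for review.

```shell
# Print the original canu command with the suggested overrides added.
# Nothing is executed; copy the printed command to actually launch canu.
canu_command() {
  cat <<'EOF'
canu -p F_vert2 -d F_vert2_auto genomeSize=50m \
  merylMemory=16g ovlThreads=4 \
  -pacbio-raw /shared/filtered_subreads.fasta \
  gridEngineMemoryOption="-l h_vmem=MEMORY" \
  gridEngineThreadsOption="-pe make THREADS"
EOF
}

canu_command
```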
So I tried using `gridEngineMemoryOption="-l mem_free=MEMORY"` instead, and it got through the initial steps but again failed somewhere due to an out-of-memory issue. Then I switched to slurm, and it worked out of the box. Thank you for the help!
Hi, I just started using canu, and I'm sorry if this is something obvious, but please help me understand what went wrong.
I'm running canu v1.5 on a grid setup using SGE, which is bootstrapped by cfncluster on AWS.
I got the following error when running the command:
canu -p F_vert2 -d F_vert2_auto genomeSize=50m -pacbio-raw /shared/filtered_subreads.fasta gridEngineMemoryOption="-l h_vmem=MEMORY" gridEngineThreadsOption="-pe make THREADS"
Here is the output of correction/0-mercounts/F_vert2.ms16.histogram.info:
Found some errors in correction/0-mercounts/meryl.1.out
Also some errors in correction/0-mercounts/F_vert2.ms16.estMerThresh.err