Hi.
When run on a grid, both canu and the grid need to know memory and CPU limits. There is no explicit link between the two - for example, you can submit a job to the grid requesting 6 CPUs (via `-pe $node.name 6`) but then run the command with fewer or more compute threads - the grid has no way of enforcing that the command use 6 CPUs.
In your log, canu has decided to use 4 CPUs and between 4 and 16 GB memory for each job (based on genome size and available hosts in the grid). However, by using the rather low-level option `gridEngineResourceOption`, you've explicitly told the grid that your jobs will use 6 CPUs and need 60 GB free memory. With the default value of `gridEngineResourceOption`, canu would itself fill in the resources required for each job. And so, the way to increase the number of CPUs is, for example, `ovlThreads=8`, to request that the overlap jobs use 8 compute threads.
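For instance, a minimal sketch (output names, genome size, and the read file are illustrative placeholders, not from this thread) that leaves the grid requests to canu but raises the overlap thread count:

```shell
# Let canu compute its own grid resource requests;
# only the overlap thread count is overridden here.
canu -p asm -d asm-grid \
  genomeSize=3g \
  useGrid=true \
  ovlThreads=8 \
  -pacbio-hifi reads.fastq.gz
```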
Read through https://canu.readthedocs.io/en/latest/tutorial.html; the second and third sections discuss this.
It is also possible to adjust the job sizes to get more jobs (instead of just using more CPUs for each job). Overlaps are usually the slowest step, and the primary option for fiddling with overlap job sizes is `ovlRefBlockSize`.
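As a sketch (the value below is purely illustrative, not a recommendation; check the parameter reference for the actual default before changing it), a smaller reference block size yields more, smaller overlap jobs:

```shell
# Smaller blocks -> more overlap array jobs, each doing less work.
# ovlRefBlockSize=10000 is a placeholder value for illustration.
canu -p asm -d asm-grid \
  genomeSize=3g \
  ovlRefBlockSize=10000 \
  -pacbio-hifi reads.fastq.gz
```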
```
Unable to run job: denied: host "actual.local.host.name.and.number" is no submit host.
```
This is an SGE config issue. Canu requires that execution hosts be able to submit jobs to the grid. Here's a link that should help: https://docs.oracle.com/cd/E19957-01/820-0698/eqqis/index.html
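Assuming you (or your admin) have SGE manager rights, the fix is typically one `qconf` call per execution host; a sketch:

```shell
# Add an execution host to the submit host list (run as an SGE manager).
qconf -as actual.local.host.name.and.number

# Verify the current list of submit hosts.
qconf -ss
```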
For microbial on AWS, I'd just grab a medium-size node -- 8-16 CPUs, 16-24 GB memory, not that much disk -- and run Canu as a single job. For eukaryotic, though, you'll most likely need to set up an SGE or Slurm cluster on AWS. Sadly, I don't know how to do that; searching for 'slurm aws' gave this link: https://docs.aws.amazon.com/parallelcluster/latest/ug/slurm-workload-manager-v3.html
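For the microbial case, a single-node run with the grid disabled could look like this sketch (instance sizing, genome size, and file names are assumptions for illustration):

```shell
# Single-machine run on one medium AWS instance; no grid involved.
# maxThreads/maxMemory cap what canu will use on the node.
canu -p asm -d asm-aws \
  genomeSize=5m \
  useGrid=false \
  maxThreads=16 maxMemory=24g \
  -pacbio-hifi reads.fastq.gz
```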
Hi Brian,
Thank you for the insight. I appreciate the help. Would something like this override all the settings to use a specific number of cores and threads?
`gridOptions <string=unset>`

```
useGrid=true GridEngineResourceOption="-pe $node.name 6 -l mem_free=60G" gridEngineArrayOption="-t ARRAY_JOBS -tc 6" gridOptions="-pe $node.name 6 -l mem_free=60G -tc 6"
```
This is based on:
https://canu.readthedocs.io/en/latest/parameter-reference.html
Thank you for the SGE information. I will take a deeper look into it and hopefully it will work.
I am more familiar with SGE than SLURM, but I will take a look into it as well. Hopefully I can get something set up.
No, the resource options aren't connected to the memory/threads actually used, as @brianwalenz said above. I don't see a difference between the initial command and what you posted above. Generally, you don't want to override any of the grid request options; let Canu handle them for you.
You can set
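For instance (an illustrative sketch, not from the thread; names and sizes are placeholders), if you want to bound what Canu requests per job, `maxThreads` and `maxMemory` cap its own choices without overriding the grid options directly:

```shell
# Cap Canu's per-job resource requests instead of overriding
# the grid submission options themselves.
canu -p asm -d asm-grid \
  genomeSize=3g \
  useGrid=true \
  maxThreads=6 maxMemory=60g \
  -pacbio-hifi reads.fastq.gz
```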
Idle, answered.
Hi All,
I would like to thank the team for the previous help I received on my other question concerning duplications. That really helped a lot.
I have an unrelated question concerning Canu. I am using Canu on SGE, where I have two distinct nodes.
There is one node that has one host with plenty of cores, but it has lower clock speeds and more RAM per core. I am able to assemble genomes without a problem on this host, but it takes a long time when I have to do multiple de novo assemblies, i.e. over 2 months for some PacBio Sequel II HiFi data sets [a 3 Gb genome (60X theoretical coverage, with slightly more resources) and a 1.3 Gb genome (25X theoretical coverage, with fewer resources)].
I have another node that has 15 hosts with higher CPU speed but slightly less RAM. This is where things crash all the time.
My first question:
When I give Canu 6 threads and plenty of RAM, I still get something like this for grid resources. How can I make use of more CPUs and RAM for each step? Is there a command that I should add? How can I optimize Canu to use more resources and complete the de novo assemblies faster?
Command:
```
useGrid=true GridEngineResourceOption="-pe $node.name 6 -l mem_free=60G" gridEngineArrayOption="-t ARRAY_JOBS -tc 6"
```
```
-- Grid: meryl     12.000 GB    4 CPUs  (k-mer counting)
-- Grid: hap        8.000 GB    4 CPUs  (read-to-haplotype assignment)
-- Grid: cormhap    6.000 GB    4 CPUs  (overlap detection with mhap)
-- Grid: obtovl     4.000 GB    4 CPUs  (overlap detection)
-- Grid: utgovl     4.000 GB    4 CPUs  (overlap detection)
-- Grid: cor        -.--- GB    4 CPUs  (read correction)
-- Grid: ovb        4.000 GB    1 CPU   (overlap store bucketizer)
-- Grid: ovs        8.000 GB    1 CPU   (overlap store sorting)
-- Grid: red       16.000 GB    4 CPUs  (read error detection)
-- Grid: oea        8.000 GB    1 CPU   (overlap error adjustment)
-- Grid: bat       16.000 GB    4 CPUs  (contig construction with bogart)
-- Grid: cns        -.--- GB    4 CPUs  (consensus)
```
Question 2:
When I try the node with multiple hosts, it always crashes at the meryl step. Any thoughts on what I can do to make it work?
Error:

```
Unable to run job: denied: host "actual.local.host.name.and.number" is no submit host.
```
Also, if I would like to use Canu on an AWS server, what is the best strategy to go about that for eukaryotic and microbial assemblies? How many resources should we provide, in your experience? Additionally, what settings should we use?
Thank you in advance for the help.