marbl / verkko

Telomere-to-telomere assembly of accurate long reads (PacBio HiFi, Oxford Nanopore Duplex, HERRO corrected Oxford Nanopore Simplex) and Oxford Nanopore ultra-long reads.

Computing resource utilization is very low #285

Closed sh0rt2l0ng closed 1 month ago

sh0rt2l0ng commented 2 months ago

Hello Verkko team, we have recently been testing the nanopore-only T2T workflow with Verkko on an r5.16xlarge instance. From what we observe, Verkko consistently makes little use of the available computing resources for each Snakemake task. Is there a recommended way to boost performance through parameters or configuration? Any guidance would be appreciated. Thanks!

sh0rt2l0ng commented 2 months ago

Hi Verkko team, is there any recommendation on how to optimize the number of cores used by the time-consuming steps so that the assembly can be accelerated? I tried increasing the 'cpus' values in the verkko script to the maximum number of local CPUs and setting 'mem' to 0 for all steps, but got a segmentation fault (core dump).

skoren commented 2 months ago

Verkko defaults to 64 GB on a local node and does not use all the cores, so you want to set those with --local-memory and --local-cpus to whatever is available. Snakemake will then schedule the right number of jobs for each parallel task to utilize those. There will still never be perfect parallelization because there are single-threaded steps in the computation (for utgcns, the alignment of reads is parallel but the final consensus has to be called by a single thread). You don't want to set memory to 0 for all steps, as you'd overwrite the detected appropriate limits, potentially overloading the system and causing crashes.
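As a rough illustration of the advice above, the following sketch derives values for --local-cpus and --local-memory from the machine it runs on and prints the resulting command instead of running it. Only the two option names come from the reply. The use of `nproc` and `/proc/meminfo` (Linux only), the 10% memory headroom, the `-d asm` output directory, and the placeholder for read options are all assumptions made for illustration.

```shell
set -euo pipefail

# Logical CPUs available to this process (nproc is part of GNU coreutils).
cpus=$(nproc)

# Total RAM in GB, read from MemTotal (reported in kB) in /proc/meminfo.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gb=$(( mem_kb / 1024 / 1024 ))

# Leave some headroom for the OS; the 10% margin is an arbitrary assumption.
mem_gb=$(( mem_gb * 9 / 10 ))

# Build the command as an array and print it rather than executing it.
# "-d asm" is only an example output directory.
cmd=(verkko -d asm --local-cpus "$cpus" --local-memory "$mem_gb")
echo "${cmd[@]} <your read options>"
```

On an r5.16xlarge (64 vCPUs, 512 GiB) this would request all 64 CPUs and somewhat less than the full memory. Per-step limits are left at their detected values, in line with the warning against setting memory to 0.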

sh0rt2l0ng commented 2 months ago

Thanks a lot! We did set --local-memory and --local-cpus according to the available resources. But yes, as you said, the utilization is still not optimal, and it seems we are now entering the final consensus step. We really hope the performance can be optimized further in the future.
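The ceiling that single-threaded steps put on utilization can be illustrated with Amdahl's law. This is a generic sketch, not a measurement of Verkko: the parallel fraction $p$ below is a purely hypothetical value chosen for the arithmetic, and $N$ is the number of cores.

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}}

% Hypothetical example: p = 0.9 (90% of the runtime parallelizes), N = 64 cores
S(64) = \frac{1}{0.1 + \dfrac{0.9}{64}} = \frac{1}{0.1140625} \approx 8.8

% Upper bound as N grows without limit
\lim_{N \to \infty} S(N) = \frac{1}{1 - p} = 10
```

Under these assumed numbers, 64 cores give a speed-up of under 9x, and cores sit idle whenever a serial step such as the final consensus call is running.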

skoren commented 1 month ago

Idle, improved detection of local resources.

tfogler commented 3 weeks ago

Hi MarBL, I have been experiencing similar issues with my run. My command is:

nohup verkko -d asm --hifi *.fastq.gz --nano *.fq.gz --threads 32 --sto-run 32 128 24 --ovb-run 32 128 24 ...... --fhc-run 32 128 24 &

Basically, all steps are set to the maximum local CPUs and maximum memory, but Snakemake would only assign 1 CPU to each step. What's more, it failed in the computeErrors step. I tried to increase the resources used, but it would either show the same behavior or fail.

Screenshots: Compute Errors logfile; StdOut [n_cpus is 32]