epi2me-labs / wf-basecalling

Specify Dorado memory requirements #44

Closed mgperry closed 4 months ago

mgperry commented 4 months ago

Is your feature related to a problem?

Hi, I'm running the workflow on a workstation with 32GB RAM, but this is detected as 31.2GB. As a consequence, the workflow won't run as written, since 32GB is specified as required in the dorado process (in lib/signal/ingress.nf). The workflow runs fine (on the test data) if I edit the source code to reduce the requirement to 16GB.

Describe the solution you'd like

If it were possible to supply a params.dorado_mem_limit or similar, this would work around the issue. I guess this would be an 'at own risk' parameter: I don't know how much RAM Dorado uses on a bad day (this isn't mentioned in the Dorado docs on GitHub), or whether this could cause problems with schedulers etc.

Describe alternatives you've considered

As mentioned above, the workflow runs fine, but the memory requirement has to be manually changed in the source.

Additional context

This is also the case for the align_and_qsFilter process in the same file. Presumably the high requirement is driven by minimap2; from reading through the GitHub, high RAM usage looks pretty common, but this also runs fine on my machine.

SamStudio8 commented 4 months ago

@mgperry You can override the resources required for the Dorado process without a parameter by applying additional custom configuration with a process selector. You'll need to create a file, say dorado_mem.config, with the contents:

process {
    withName:dorado {
        memory = "31 GB"
    }
}

You can instruct Nextflow to load this custom config with -c, for example: nextflow run -c dorado_mem.config ...

For more information see https://www.nextflow.io/docs/latest/config.html#process-selectors
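
The same selector approach should also cover the align_and_qsFilter process mentioned above. This is only a minimal sketch: it assumes the process names match those in lib/signal/ingress.nf exactly, and that 31 GB is a sensible ceiling for your machine. The file name mem_limits.config is just an illustration.

```groovy
// mem_limits.config (hypothetical file name)
// Assumption: process names are taken from this thread; check them
// against lib/signal/ingress.nf before relying on this.
process {
    // Lower the memory request so it fits within the detected 31.2 GB
    withName:dorado {
        memory = "31 GB"
    }
    // Same override for the alignment step
    withName:align_and_qsFilter {
        memory = "31 GB"
    }
}
```

Load it with -c in the same way as before. Note that this only changes the amount of memory requested from the executor; a process that genuinely needs more than it is given may still fail at run time.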

mgperry commented 4 months ago

@SamStudio8 That's exactly what I needed; the workflow runs without problems. Thanks for replying so quickly. Apologies, I didn't realise this was possible using configuration; I'm still getting the hang of Nextflow.