nf-core / tools

Python package with helper tools for the nf-core community.
https://nf-co.re
MIT License
242 stars 191 forks source link

Template: Add `gpu` profile #3272

Open mashehu opened 2 weeks ago

codecov[bot] commented 2 weeks ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 75.72%. Comparing base (9412504) to head (4c88397). Report is 13 commits behind head on dev.

Additional details and impacted files

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

nictru commented 2 weeks ago

So the flow of information in the scdownstream pipeline looks as following:

  1. Users provide the gpu profile
  2. The gpu profile prepares the containerization tools for GPU mounting and sets the hidden pipeline parameter use_gpu to true -> can be used for if-clauses in workflows
  3. Processes with GPU support get the process_gpu label -> can be used to handle all GPU-enabled processes in a certain way. Different executors need different tweaking to handle tasks from these processes correctly. I added a bit of documentation here.
  4. This section makes sure that processes have an ext variable which reflects if they should use GPU or not. This can be useful if the same module supports usage both with and without GPU. Example: cellbender

This implementation is the best I could come up with so far, but not many really "senior" people have had a look at it AFAIK. Maybe we can discuss which elements of the suggested approach should be part of the template and where we can perform some improvements.

nictru commented 2 weeks ago

Another thought: Should this not be an optional part of the template?

mashehu commented 2 weeks ago

it's teeny-tiny enough, that we might keep it in, imo, similar to arm... but also not 100% sure

nictru commented 2 weeks ago

But is the goal to only add the profile to the template?

I think it would at least be nice to have a common structure for all GPU-enabled modules, even if it only means adding the process_gpu label

GallVp commented 2 weeks ago

Thank you @nictru

  1. Users provide the gpu profile

The gpu profile being added to the template does not have the use_gpu configuration variable. I can see that you have added it to the profile in the pipeline: https://github.com/nf-core/scdownstream/blob/3231971f309d1ac025e7180b69852c3637c975dd/nextflow.config#L182

I don't think we need a separate configuration variable.

  1. The gpu profile prepares the containerization tools for GPU mounting and sets the hidden pipeline parameter use_gpu to true -> can be used for if-clauses in workflows

In the scdownstream pipeline, use_gpu is defined as a pipeline parameter (https://github.com/nf-core/scdownstream/blob/3231971f309d1ac025e7180b69852c3637c975dd/nextflow.config#L40) and a configuration variable (https://github.com/nf-core/scdownstream/blob/3231971f309d1ac025e7180b69852c3637c975dd/nextflow.config#L182). These two are different so setting gpu in the -profile, will not set the use_gpu pipeline parameter. It will only set the configuration variable.

I think we only need the use_gpu pipeline parameter and should get rid of the use_gpu configuration variable. The use_gpu label in the base.config file can be changed to,

process {
    withLabel: 'use_gpu' {
        ext.use_gpu = { params.use_gpu }
    }
}

At pipeline execution, the user must set the gpu profile and the use_gpu pipeline parameter to utilise the GPU-based tools.

  1. Processes with GPU support get the process_gpu label -> can be used to handle all GPU-enabled processes in a certain way. Different executors need different tweaking to handle tasks from these processes correctly. I added a bit of documentation here.

Thank you. This is very helpful.

  1. This section makes sure that processes have an ext variable which reflects if they should use GPU or not. This can be useful if the same module supports usage both with and without GPU. Example: cellbender

This is quite clever. Nice!

nictru commented 2 weeks ago

Hey @GallVp, thanks for the input.

Some notes:

It might sound stupid, but I did not know there was a difference between pipeline parameters and configuration variables. I agree that we should only have one.

I am not fully aware about what you can and can't do with configuration variables. I also was not able to find any documentation about them at all. I only used the configuration variable in scdownstream. Not sure if one can use configuration variables for workflow if clauses?

At pipeline execution, the user must set the gpu profile and the use_gpu pipeline parameter to utilise the GPU-based tools.

I would prefer if setting the gpu profile would be sufficient. Keeping them separate will probably lead to a lot of users using one without the other, which almost certainly will lead to errors. I am not aware of any use cases where this would be an advantage, but I'm eager to hear.

GallVp commented 1 week ago

Not sure if one can use configuration variables for workflow if clauses?

No, configuration variables are not available in the workflows. Maybe, there is a backdoor that I am not aware of.

How about we get rid of both the variable and the parameter? Instead, we modify the profile as,

gpu {
    process {
        withLabel: process_gpu {
            ext.use_gpu = true
        }
    }
    docker.runOptions = '-u $(id -u):$(id -g) --gpus all'
    apptainer.runOptions = '--nv'
    singularity.runOptions = '--nv'
}

The workflows can use workflow.profile.contains('gpu') in place of the use_gpu parameter.

The above solution, however, does not take care of the arm profile. So, we need to have a gpu_arm profile as well.

nictru commented 1 week ago

Hey, I just tested the suggested approach and it seems to work as expected. I think we could do it like this, but I would prefer having the withLabel block in the base.config - just to make sure that all the withLabel configs are collected in one place.

I did this using the following:

withLabel: process_gpu {
    ext.use_gpu = {workflow.profile.contains('gpu')}
}

So that we don't need a config variable. But this is more a cosmetic topic and not a hill-to-die-on.

mashehu commented 1 week ago

feel free to add the changes, @nictru

GallVp commented 1 week ago

I did this using the following:

withLabel: process_gpu {
    ext.use_gpu = { workflow.profile.contains('gpu') }
}

I agree. This is more in line with the existing infrastructure. Thank you!

ewels commented 1 day ago

Needs opt-in functionality in the nf-core pipelines create. See also #3261