
pod definition is not taken into account in withLabel sections #2697

Closed fpichon closed 2 years ago

fpichon commented 2 years ago

Bug report

Expected behavior and actual behavior

I want to run Nextflow itself on a node with 2 CPUs and the different processes on different nodes depending on the number of CPUs they need. I therefore define a pod with a nodeSelector in the k8s section and other pods with other selectors in different withLabel sections.

I would expect one pod per process, each scheduled on a node from the desired node pool thanks to the nodeSelector. However, only the pod definition present in the k8s section is taken into account, not the ones in the withLabel sections, and the processes never start.

Steps to reproduce the problem

I therefore created labels in the config file so that they can easily be set per process:

executor {
    name   = 'k8s'
    queueSize = 5
}

k8s {
    storageClaimName = 'nextflow-pvc-raw'
    storageMountPath = '/data'
    pod = [ [volumeClaim: 'nextflow-pvc-ref', mountPath: '/ref' ], [volumeClaim: 'nextflow-pvc-analysis', mountPath: '/out' ], [ nodeSelector: 'nodepool=nextflow-2cpu-7go' ] ]
}

process {
    withLabel: cpu4 {
        pod  = [nodeSelector: 'nodepool=nextflow-4cpu-15go']
        cpus = 4
    }
    withLabel: cpu16 {
        pod  = [nodeSelector: 'nodepool=nextflow-16cpu-60go']
        cpus = 16
    }
}

For the pod definition, I followed the examples here: https://www.nextflow.io/docs/latest/process.html?highlight=pod#pod and here: https://gitter.im/nextflow-io/nextflow?at=5eea0bec539e566fc93e9977

Program output

Using kubectl describe pod mypod, I obtained the output:

...
    Limits:
      cpu:  4
    Requests:
      cpu:        4
...
Volumes:
  vol-7:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nextflow-pvc-raw
    ReadOnly:   false
  vol-8:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nextflow-pvc-ref
    ReadOnly:   false
  vol-9:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nextflow-pvc-analysis
    ReadOnly:   false
  default-token-5ndwm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5ndwm
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  nodepool=nextflow-2cpu-7go
...

The last line shows that the nodeSelector is the one from the k8s section, not the one from the withLabel section. The number of CPUs, however, is correctly set (4), but since the node pool selected in the k8s section only provides 2 CPUs, the process never starts. I marked this as a bug, but maybe I am doing something wrong (a syntax issue?) or forgot something somewhere?

Environment

Additional context

The pipeline is run from a pod with the following command:

~/nextflow kuberun -profile cloud -hub mygit username/nextflow/test_cloud.nf -r main --fastq_dir /data/path/to/Fastq/ --out_dir /out/output_folder/

And here is an example of a Nextflow process definition that uses a label:

process CUTADAPT {
    container "cutadapt:v2.10"
    label 'cpu4'
    publishDir "${params.out_dir}/${pair_ID}/cutadapt/", mode: "copy", overwrite: false

    input:
      tuple val(pair_ID), file(reads)

    output:
      tuple val(pair_ID), file("*_R{1,2}_trimmed.fastq.gz"), emit: trimmed
      tuple val(pair_ID), file("${pair_ID}_cutadapt_report.txt"), emit: report

    """
      cutadapt --cores=${task.cpus} --minimum-length=${params.minlen} --quality-cutoff=${params.quality_cutoff_5},${params.quality_cutoff_3} -a file:/ref/adapters/adaptersRead1.fa -A file:/ref/adapters/adaptersRead2.fa -o ${pair_ID}_R1_trimmed.fastq.gz -p ${pair_ID}_R2_trimmed.fastq.gz ${reads} > ${pair_ID}_cutadapt_report.txt
    """
}

I tried many different configurations (and syntaxes), but I never managed to get the pod definition in the withLabel section taken into account; one of the variations I tried is sketched below. Thanks for your help.
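
As an illustration, here is roughly what one of those attempts looked like: the withLabel pod setting written with the same list-of-maps form as in the k8s scope (a sketch only, reusing the node pool names from the config above; I am not sure this is the intended syntax):

process {
    withLabel: cpu4 {
        cpus = 4
        // same list-of-maps form as used for k8s.pod above
        pod  = [ [nodeSelector: 'nodepool=nextflow-4cpu-15go'] ]
    }
}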

bentsherman commented 2 years ago

The pod directive is a little wonky because it is both a k8s config setting and a process directive. I'll need to check how the options are resolved when they are provided in both places.

In the meantime, can you try putting all of your pod options under the process scope, like this:

executor {
    name   = 'k8s'
    queueSize = 5
}

k8s {
    storageClaimName = 'nextflow-pvc-raw'
    storageMountPath = '/data'
}

process {
    pod = [ [volumeClaim: 'nextflow-pvc-ref', mountPath: '/ref' ], [volumeClaim: 'nextflow-pvc-analysis', mountPath: '/out' ], [ nodeSelector: 'nodepool=nextflow-2cpu-7go' ] ]

    withLabel: cpu4 {
        pod  = [nodeSelector: 'nodepool=nextflow-4cpu-15go']
        cpus = 4
    }
    withLabel: cpu16 {
        pod  = [nodeSelector: 'nodepool=nextflow-16cpu-60go']
        cpus = 16
    }
}
fpichon commented 2 years ago

Hi @bentsherman, thanks for your answer.

When applying your config, three things happen:

  1. the nodes are correctly assigned to the processes according to the nodeSelector, but even after 2 days the processes never launched,
  2. the volume claims are not taken into account (I resolved this by copying the claims into the pod definition of each label, see the sketch after this list),
  3. Nextflow itself runs on a random node, since pod is no longer specified in the k8s section.
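
For point 2, the per-label workaround looked roughly like this (a sketch only, reusing the claim names and node pools from my config above):

process {
    withLabel: cpu4 {
        cpus = 4
        // volume claims copied into each label, plus the label-specific node pool
        pod  = [ [volumeClaim: 'nextflow-pvc-ref', mountPath: '/ref'],
                 [volumeClaim: 'nextflow-pvc-analysis', mountPath: '/out'],
                 [nodeSelector: 'nodepool=nextflow-4cpu-15go'] ]
    }
}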

The pod definition at the process level thus seems to be ignored or overwritten, which could make sense since every process in my test has a label. And it seems that the pods defined in the withLabel sections are never launched... I don't know why.

bentsherman commented 2 years ago

Hi @fpichon, I finally have some more time to investigate these k8s issues.

So the use of k8s.pod and process.pod looks good as far as I can tell. The process-level nodeSelector will simply overwrite the k8s-level nodeSelector.

I'm thinking there might be something wrong with how pod within a withLabel is applied. What happens if you remove the withLabel for a moment?

k8s {
    storageClaimName = 'nextflow-pvc-raw'
    storageMountPath = '/data'
    pod = [ [volumeClaim: 'nextflow-pvc-ref', mountPath: '/ref' ], [volumeClaim: 'nextflow-pvc-analysis', mountPath: '/out' ], [ nodeSelector: 'nodepool=nextflow-2cpu-7go' ] ]
}

process {
    pod = [nodeSelector : 'nodepool=nextflow-4cpu-15go']
    cpus = 4
}
bentsherman commented 2 years ago

I was unable to reproduce this error with a minimal example :/

I ran nextflow -c kuberun.config kuberun bentsherman/hello

with this config:

k8s {
    pod = [
        [nodeSelector: 'kubernetes.io/os=linux']
    ]
}

process {

    withLabel: hello {
        tag = { x }
        pod = [nodeSelector: 'kubernetes.io/os=foobar']
    }
}

In my example, the head pod runs but the worker pods get stuck because they have an invalid node selector.

fpichon commented 2 years ago

Hi @bentsherman,

Thanks for taking the time to investigate the problem. We are setting up a new Kubernetes cluster, so I will give it another try next week. I will get back to you with the results as soon as possible.

fpichon commented 2 years ago

Hi @bentsherman, sorry for the long delay. Now that we have the new Kubernetes cluster, it seems to work fine. I do not really know why it was not working before, but it was more likely due to the Kubernetes cluster than to Nextflow. Thank you very much for your support on this issue! :)