Provide a mechanism to place Nextflow pods on specific (k8s) nodes

At the moment, apart from the namespace, we're unable to control where Nextflow schedules its Pods. Nextflow does offer a pod directive in the process definition, which is documented as being available when running on Kubernetes. Importantly, the pod directive allows a nodeSelector to be specified (the most basic form of scheduling control), so that Pods only run on nodes with a given label. So we could label some nodes as nextflow nodes and ensure that Nextflow Pods run on those nodes.

Sadly the nodeSelector doesn't allow complex scheduling, nor does it support taints/tolerations to create exclusive nodes, but it might be enough to provide better Pod scheduling, i.e. where Nextflow processes only run on named nodes.

Basic idea

The basic idea is to expose a SQUONK_POD_NEXTFLOW_NODE_SELECTOR environment variable (defaulting to application, our default node label), with its value passed to the Kubernetes Pod via the addExtraNextflowConfig() method call in the OpenShiftRunner.
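
As a rough illustration of the idea above, the sketch below reads the environment variable, falls back to application, and renders a Nextflow config line that sets a nodeSelector through the pod directive. The class NodeSelectorConfig, its methods, and the label key purpose are hypothetical names invented for this example. The issue does not say which label key our nodes use, nor what addExtraNextflowConfig() accepts, so the sketch only builds the config text and leaves the wiring into OpenShiftRunner open.

```java
import java.util.Map;

// Hypothetical helper for illustration only; not part of Squonk or Nextflow.
final class NodeSelectorConfig {

    // Environment variable proposed in this issue.
    static final String ENV_NAME = "SQUONK_POD_NEXTFLOW_NODE_SELECTOR";
    // Default label value named in this issue.
    static final String DEFAULT_VALUE = "application";
    // ASSUMPTION: the label key is not stated in the issue; "purpose" is a placeholder.
    static final String LABEL_KEY = "purpose";

    // Returns the configured selector value, or the default when unset or blank.
    static String resolve(Map<String, String> env) {
        String value = env.get(ENV_NAME);
        return (value == null || value.trim().isEmpty()) ? DEFAULT_VALUE : value.trim();
    }

    // Renders a Nextflow config fragment that applies the nodeSelector
    // pod option to every process.
    static String build(Map<String, String> env) {
        String value = resolve(env);
        // Kubernetes label values are alphanumeric, optionally with '-', '_'
        // or '.' in between; rejecting anything else also keeps quotes out
        // of the generated config.
        if (value.length() > 63
                || !value.matches("[A-Za-z0-9]([A-Za-z0-9._-]*[A-Za-z0-9])?")) {
            throw new IllegalArgumentException("Invalid node label value: " + value);
        }
        return "process.pod = [nodeSelector: '" + LABEL_KEY + "=" + value + "']\n";
    }

    public static void main(String[] args) {
        // Prints the fragment that would be appended to the Nextflow config.
        System.out.print(build(System.getenv()));
    }
}
```

The resulting string is the kind of text that could be handed to addExtraNextflowConfig(), assuming that method takes raw Nextflow config; that assumption would need checking against the OpenShiftRunner code.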

See

https://www.nextflow.io/docs/latest/kubernetes.html#k8s-page
https://www.nextflow.io/docs/latest/process.html#process-pod

InformaticsMatters / squonk

Provide a mechanism to place Nextflow pods on specific (k8s) nodes #132