epam / cloud-pipeline

Cloud agnostic genomics analysis, scientific computation and storage platform
https://cloud-pipeline.com
Apache License 2.0

Use Kubernetes executor for Nextflow pipelines instead of SGE #879

Open mike-miroliubov opened 4 years ago

mike-miroliubov commented 4 years ago

Background
Just an idea: currently the SGE executor is used for Nextflow pipelines. To support this, Cloud Pipeline has to deploy an auto-scaling SGE cluster on top of the Kubernetes cluster. This works fine, but it introduces a redundant level of orchestration. In addition, SGE's scheduling, error codes, and so on are far from perfect. Nextflow, on the other hand, natively supports Kubernetes as an executor, and we can leverage this support to get rid of SGE in this case.

Approach
Use Nextflow's built-in Kubernetes support:
https://www.nextflow.io/docs/edge/executor.html#kubernetes
https://www.nextflow.io/docs/edge/kubernetes.html
https://www.nextflow.io/docs/edge/config.html#scope-k8s
and, probably, https://www.nextflow.io/docs/edge/kubernetes.html#running-in-a-pod

It should be possible to restrict the nodes available to a Nextflow cluster by namespace or by node labels, using the Nextflow k8s config.
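To illustrate the idea, here is a rough sketch of what such a nextflow.config could look like. It is not a tested Cloud Pipeline setup: the namespace, service account, claim name, and node label are all hypothetical placeholders, and it assumes a Nextflow version whose `pod` directive supports `nodeSelector`. The Kubernetes executor also expects storage shared between the pods, which is why a persistent volume claim appears here.

```groovy
// nextflow.config: illustrative sketch only, all names below are hypothetical

process {
    // Run each task as a Kubernetes pod instead of an SGE job
    executor = 'k8s'

    // Schedule task pods only on nodes carrying this (hypothetical) label
    pod = [nodeSelector: 'nf-pool=example']
}

k8s {
    // Keep all pods of the run inside one (hypothetical) namespace
    namespace = 'nextflow-example'

    // Service account the pods run under (hypothetical)
    serviceAccount = 'nextflow-example'

    // Shared storage for the work directory: a (hypothetical) PVC and its mount path
    storageClaimName = 'nf-work-pvc'
    storageMountPath = '/workspace'
}
```

With something like this, the namespace would bound where pods are created, while the node selector would bound which nodes they can land on.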

sidoruka commented 4 years ago

@k-i-t-e thanks for this proposal. We considered the Kubernetes executor as an option for the initial implementation (see #291). But in the end, we chose to follow this strategy:

To my mind, implementing direct Kubernetes executor support won't bring much value to the platform, but it would require a lot of engineering effort.

But there may well be other valuable opinions. @mzueva @tcibinan any thoughts?