reanahub / reana

REANA: Reusable research data analysis platform
https://docs.reana.io
MIT License

build(helm): Make Kueue configurable to REANA admins at the deployment time #801

Open xaviertintin opened 4 months ago

xaviertintin commented 4 months ago

Helm-Controlled Deployment Choice:

Make the behaviour configurable by REANA admins at deployment time, so that each admin can choose, via Helm values, between the classical approach and the new optional Kueue approach.

Recap:

  1. Add kueue.enabled Value: Introduce a new value named kueue.enabled within the values.yaml file of the Helm chart.
  2. Pass Value as Environment Variable (reana-workflow-controller): In the helm/reana/templates/reana-workflow-controller.yaml template, configure the kueue.enabled value as an environment variable and pass it to the reana-workflow-controller container.
  3. Read Environment Variable (reana-workflow-controller): As suggested by Giuseppe, modify config.py in reana-workflow-controller to read this environment variable and expose it as a configuration setting.
  4. Pass Environment Variable (reana-job-controller): Within reana_workflow_controller/workflow_run_manager.py, modify the Kubernetes specification for the dynamically created reana-job-controller to include the environment variable containing the kueue.enabled value.
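
Steps 3 and 4 could look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual REANA code: the environment variable name KUEUE_ENABLED and the helper names env_flag and job_controller_env are hypothetical, and the sketch assumes the Helm template renders the boolean as the string "true" or "false". The env entries use the name/value form that Kubernetes container specs expect.

```python
import os

# Hypothetical variable name; the issue does not specify one.
KUEUE_ENV_VAR = "KUEUE_ENABLED"


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a boolean (step 3).

    `environ` defaults to os.environ; it is injectable for testing.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def job_controller_env(kueue_enabled, base_env=None):
    """Return container env entries with the Kueue flag appended (step 4).

    The input list is copied, so the caller's list is not mutated.
    """
    env = list(base_env or [])
    env.append({"name": KUEUE_ENV_VAR, "value": str(kueue_enabled).lower()})
    return env


# In a config module, the flag would be read once at import time:
KUEUE_ENABLED = env_flag(KUEUE_ENV_VAR)
```

Reading the flag once in the configuration module and forwarding it as a plain string keeps both controllers on the same value without either of them having to query Helm or the cluster.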

With these steps, REANA admins control the job submission method at deployment time through Helm values. Passing the setting as an environment variable also ensures that it propagates from the reana-workflow-controller to the dynamically created reana-job-controller.

Kueue Integration Milestone 2 Discussion here

Automated Kueue Deployment with Customization:

Deploy Kueue automatically as part of the REANA deployment when the admin enables it in their Helm values file. Allow basic parametrisation of the Kueue deployment by overriding Helm values and/or chart snippets.

Production Instance Considerations:

For production deployments, a Kubernetes cluster must be created first (e.g., at CERN or on Google Cloud). Ideally, Kueue would be deployed automatically through the Helm chart, but this might be limited until a dedicated Kueue Helm chart is available. As a temporary solution, Kueue might need to be deployed manually until a more integrated approach can be implemented.

Kueue Integration Milestone 3 Discussion here