kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[feature] Possibility to set tolerations for driver pods #11309

Open jan-stanek opened 1 month ago

jan-stanek commented 1 month ago

Feature Area

/area backend

What feature would you like to see?

It would be great to have the possibility to set tolerations for the container-driver and dag-driver pods as well.

What is the use case or pain point?

We have a cluster where all nodes are tainted, so we are not able to execute any pipeline.

Is there a workaround currently?

No, there is no workaround; we have to use v1 pipelines.



gregsheremeta commented 1 month ago

The driver is typically controlled through environment variables. Would you want to set driver tolerations in pipeline code, or would you be OK with an env var on the apiserver deployment that applies to all runs across all pipelines? I ask because, from a user API perspective, I don't love leaking driver details into pipeline code.

jan-stanek commented 1 month ago

An env var on the apiserver deployment is enough.

jan-stanek commented 4 days ago

@gregsheremeta do you know whether this can be added in this project, or does Argo Workflows have to be changed too?

gregsheremeta commented 7 hours ago

There would be no need to make any Argo Workflows modifications. It's probably as simple as modifying this template, but I didn't look too closely.
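
For illustration only, the env var approach discussed above could look roughly like the sketch below. It is not the project's implementation: the real change would live in the backend code that builds the driver templates. The variable name `DRIVER_POD_TOLERATIONS`, both helper functions, and the example template contents are hypothetical. The sketch also assumes that tolerations are passed as a JSON list using the standard Kubernetes toleration fields and that they end up in a `tolerations` list on the driver's workflow template.

```python
import copy
import json
import os

# Hypothetical variable name; the thread does not specify one.
DRIVER_TOLERATIONS_ENV = "DRIVER_POD_TOLERATIONS"

# Field names of a Kubernetes toleration.
ALLOWED_FIELDS = {"key", "operator", "value", "effect", "tolerationSeconds"}


def load_driver_tolerations(environ=os.environ):
    """Parse a JSON list of tolerations from the environment (hypothetical helper)."""
    raw = environ.get(DRIVER_TOLERATIONS_ENV, "").strip()
    if not raw:
        return []  # Variable unset or empty: leave driver pods unchanged.
    tolerations = json.loads(raw)
    if not isinstance(tolerations, list):
        raise ValueError(f"{DRIVER_TOLERATIONS_ENV} must be a JSON list")
    for toleration in tolerations:
        if not isinstance(toleration, dict):
            raise ValueError("each toleration must be a JSON object")
        unknown = set(toleration) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"unknown toleration fields: {sorted(unknown)}")
    return tolerations


def apply_tolerations(template, tolerations):
    """Return a copy of a driver template with the tolerations appended."""
    patched = copy.deepcopy(template)  # Do not mutate the caller's template.
    if tolerations:
        patched.setdefault("tolerations", []).extend(tolerations)
    return patched


# Example with made-up values: tolerate a "dedicated" NoSchedule taint.
example_env = {
    DRIVER_TOLERATIONS_ENV: json.dumps(
        [{"key": "dedicated", "operator": "Exists", "effect": "NoSchedule"}]
    )
}
example_template = {"name": "example-driver", "container": {"image": "example-image"}}
patched_template = apply_tolerations(
    example_template, load_driver_tolerations(example_env)
)
```

Because the value is read once from the apiserver's environment, the same tolerations would apply to every driver pod of every run, which matches the cluster-wide behavior agreed on earlier in the thread.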