flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.71k stars 640 forks

When running in a k8s cluster with mutating webhooks, tasks don't finish since the sidecar is still working [Core Feature] #1564

Open NotMatthewGriffin opened 3 years ago

NotMatthewGriffin commented 3 years ago

Motivation: Why do you think this is important? Currently, when running a Flyte task in a Kubernetes cluster that uses mutating webhooks to inject sidecars into all pods, the task and its associated pod will run until manually terminated in the Flyte web console. Inside the pod, the container actually running the Flyte task has completed and exited with a good status, but the injected sidecar container is still running, so the pod never finishes. To work around this, a user must declare the task in their workflow as a pod task, like @task(task_config=Pod(pod_spec=pod_spec, primary_container_name="container_name")). This means users need to know the configuration of the cluster (e.g. whether any sidecars will be injected automatically) in order to successfully create and run a Flyte workflow.

Goal: What should the final outcome look like, ideally? Ideally, a Flyte user wouldn't need to declare their task as a pod task in order to run it in a Kubernetes cluster that uses mutating webhooks to inject sidecars into pods. The user could declare their task with a plain @task decorator, the pod would run until the Flyte task container finishes, and then the rest of the workflow would proceed.

Describe alternatives you've considered The only alternative I've considered is continuing to use the workaround of declaring all tasks as pod tasks.

welcome[bot] commented 3 years ago

Thank you for opening your first issue here! 🛠

kumare3 commented 3 years ago

@NotMatthewGriffin thank you for the issue. I have heard about this from a few users. The only problem is that some folks run @task on non-k8s backends, and the container abstraction keeps things simple for them. But it is completely possible to ensure that, on K8s, the main container's exit is the only signal needed to mark completion.
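The completion rule kumare3 describes could be sketched as a small status check. This is a hypothetical illustration in plain Python, not flytepropeller's actual code; the ContainerStatus shape loosely mirrors the Kubernetes pod.status.containerStatuses fields:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class ContainerStatus:
    """Simplified stand-in for a Kubernetes container status entry."""
    name: str
    exit_code: Optional[int]  # None while the container is still running

def primary_container_finished(
    statuses: List[ContainerStatus], primary_name: str
) -> Tuple[bool, bool]:
    """Return (done, succeeded) by looking only at the primary container,
    ignoring any webhook-injected sidecars that may still be running."""
    for status in statuses:
        if status.name == primary_name:
            if status.exit_code is None:
                return (False, False)  # primary still running
            return (True, status.exit_code == 0)
    return (False, False)  # primary not reported yet

# The task container has exited cleanly, but an injected sidecar has not:
statuses = [
    ContainerStatus(name="flyte-task", exit_code=0),
    ContainerStatus(name="istio-proxy", exit_code=None),  # injected sidecar
]
print(primary_container_finished(statuses, "flyte-task"))  # (True, True)
```

Under this rule the still-running sidecar no longer blocks completion, which is exactly the behavior the issue asks for when no pod task config is supplied.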

How urgent do you think this work is?

kumare3 commented 3 years ago

Can we slot this for November?

NotMatthewGriffin commented 3 years ago

@kumare3 I don't think this issue is show-stopping for anyone, since we know of the workaround. Mostly it's just a little inconvenient to require extra configuration for running in some k8s clusters and not others.

I would be happy if this was addressed in November :)

If this is something I could help address I would also be happy to help.

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 1 year ago

Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏

github-actions[bot] commented 2 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏