flyteorg / flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
https://flyte.org
Apache License 2.0
5.82k stars 660 forks

[Core feature] Namespace configuration for Ray plugin #3886

Open aybidi opened 1 year ago

aybidi commented 1 year ago

Motivation: Why do you think this is important?

At my organization, we run several GKE clusters. One is used primarily for orchestrating Flyte workflows; the others are used for task execution. Here I'm specifically talking about our Ray GKE cluster, in which users create Ray clusters and run Ray jobs.

Currently, Flytepropeller (deployed on the Flyte GKE cluster) can submit the RayJob and RayCluster CRs to KubeRay (deployed on the Ray GKE cluster).

However, the CRs expect to create their resources (pods, etc.) in the same namespace that is used on the Flyte GKE cluster -- <project>-<domain> (for example, recommender-development). We organize namespaces differently on the Ray GKE cluster, so a namespace that exists on the Flyte GKE cluster may not exist on the Ray GKE cluster.

Is there a way to make the namespace configurable on the user side (flytekit SDK) when a task is executed on a different GKE cluster than the one Flytepropeller is deployed on?

Goal: What should the final outcome look like, ideally?

When a task is executed on a different K8s cluster than the one Flytepropeller is deployed on, users should be able to configure the namespace that is to be used on that cluster.
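To illustrate the behavior being requested, here is a minimal sketch of the namespace resolution. It is not flytekit or Flytepropeller code: `ExecutionContext`, `RemoteClusterTaskConfig`, and `resolve_namespace` are hypothetical names made up for this example, and the only thing taken from the current behavior is the <project>-<domain> default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecutionContext:
    """Project and domain of the running workflow (hypothetical type)."""
    project: str
    domain: str


@dataclass(frozen=True)
class RemoteClusterTaskConfig:
    """Hypothetical user-side task config; `namespace` is the requested knob."""
    namespace: Optional[str] = None


def resolve_namespace(ctx: ExecutionContext, cfg: RemoteClusterTaskConfig) -> str:
    """Pick the namespace to use on the remote cluster.

    A user-supplied namespace wins; otherwise fall back to the
    <project>-<domain> convention used on the Flyte cluster.
    """
    if cfg.namespace:
        return cfg.namespace
    return f"{ctx.project}-{ctx.domain}"


ctx = ExecutionContext(project="recommender", domain="development")
default_ns = resolve_namespace(ctx, RemoteClusterTaskConfig())
custom_ns = resolve_namespace(ctx, RemoteClusterTaskConfig(namespace="ray-jobs"))
```

With no override, the task would keep using recommender-development; with the override, it would use whatever namespace the user names (ray-jobs here is just a placeholder).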

Describe alternatives you've considered

Currently, we are experimenting with using the OSS Ray plugin, so we manually create the namespace on the Ray GKE cluster if it doesn't exist already. This is not sustainable, though.

We are also considering building our own Ray plugin to get more control over these configurations. However, this would involve significant investment from us.

pingsutw commented 1 year ago

Do you want all RayJobs to run in the same namespace? If so, you can add a config option on the propeller side and then update the namespace in the objectMeta.
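A rough sketch of that idea, written in Python for brevity even though Flytepropeller itself is written in Go. The function name `apply_namespace_override` and the plain-dict manifest are assumptions made for illustration; this is not the plugin's actual code or config schema.

```python
import copy
from typing import Any, Dict, Optional


def apply_namespace_override(
    manifest: Dict[str, Any], namespace: Optional[str]
) -> Dict[str, Any]:
    """Return a copy of the manifest with metadata.namespace replaced.

    `namespace` stands in for a hypothetical propeller-side config value.
    If it is unset, the manifest is returned unchanged.
    """
    if not namespace:
        return manifest
    patched = copy.deepcopy(manifest)  # leave the caller's object untouched
    patched.setdefault("metadata", {})["namespace"] = namespace
    return patched


# A trimmed-down RayJob manifest; the apiVersion depends on the KubeRay release.
ray_job = {
    "apiVersion": "ray.io/v1",
    "kind": "RayJob",
    "metadata": {"name": "example-rayjob", "namespace": "recommender-development"},
}

patched_job = apply_namespace_override(ray_job, "ray-jobs")
```

Note that a single propeller-side value like this would put every RayJob in the same namespace on the target cluster.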

aybidi commented 1 year ago

Not exactly in the same namespace. When a team runs a Flyte workflow that includes a Ray task, the simple tasks run in pods in a <project>-<domain>-style namespace on the Flyte GKE cluster. The Ray task, however, can run on a different GKE cluster (this PR added this functionality). So the user should be able to provide a custom namespace when executing a task on a different GKE cluster.

The other GKE cluster may not have the <project>-<domain> namespace present.

github-actions[bot] commented 7 months ago

Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable. Thank you for your contribution and understanding! 🙏