kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.62k stars 1.63k forks

Refactor ResourceOp to be implemented on top of ContainerOp #4083

Open Ark-kun opened 4 years ago

Ark-kun commented 4 years ago

Part of the effort to make the DSL orchestrator-agnostic. We're trying to balance making KFP feature-rich against locking our customers into a particular orchestrator. Although there are gaps, we're trying to maintain an abstraction layer that allows changing the implementation.

ResourceOp is an Argo-specific feature, but it's fairly easy to implement it as a container. In fact, Argo itself just uses a container with a launcher that executes kubectl. Controlling this code will allow us to add more features for manipulating Kubernetes resources.
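For illustration, the launcher logic can be approximated in a few lines of Python. `kubectl_args` and `manifest_stdin` are hypothetical helpers sketching what such a launcher container could do; they are not part of the KFP SDK, and a real launcher would also wait on success/failure conditions:

```python
import json
from typing import List

# Hypothetical helper (not KFP API): turn a resource action into a
# kubectl invocation. The same container image then works for any
# resource kind, since the manifest is piped in via stdin ('-f -').
def kubectl_args(action: str) -> List[str]:
    if action not in ("create", "apply", "delete", "patch"):
        raise ValueError(f"unsupported resource action: {action}")
    return ["kubectl", action, "-f", "-"]

# Serialize the resource manifest for piping into kubectl's stdin.
def manifest_stdin(manifest: dict) -> str:
    return json.dumps(manifest)
```

A ResourceOp-as-ContainerOp factory would then only need to construct a ContainerOp whose command is this kubectl invocation and whose input is the serialized manifest.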

Switching ResourceOp to just be a ContainerOp factory will significantly simplify the compiler and the DSL (merging BaseOp back into ContainerOp).

It will also allow VolumeOp etc. to be implemented as proper components (ContainerOp and ResourceOp are not components). This will allow including them in portable graph components.
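As a sketch of what "proper component" means here, such a kubectl-based step could be described with an ordinary v1-style component spec. The image name, command, and paths below are illustrative placeholders, not a published artifact:

```yaml
# Hypothetical component spec for a generic "apply resource" step.
name: Apply Kubernetes resource
inputs:
- {name: manifest, type: String}
- {name: action, type: String, default: apply}
outputs:
- {name: resource, type: String}
implementation:
  container:
    # Illustrative image; a real one would bundle the launcher + kubectl.
    image: example.com/kfp/kubectl-launcher:latest
    command: [launcher]
    args:
    - --action
    - {inputValue: action}
    - --manifest
    - {inputValue: manifest}
    - --output-path
    - {outputPath: resource}
```

Because this is a plain container component, it can be loaded, shared, and composed into graph components like any other component, with no Argo-specific handling in the compiler.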

We'd like to make this move to reduce Argo exposure and improve portability.

animeshsingh commented 4 years ago

Thanks @Ark-kun. We think this is a step in the right direction, and these common functionalities should be moved into KFP itself rather than depending on the underlying backend engine.

We have done our own implementation for Tekton, cc @Tomcli.

Tomcli commented 4 years ago

Here is our container that replicates the same ResourceOp behavior as Argo: https://github.com/kubeflow/kfp-tekton/tree/master/tekton-catalog/kubectl-wrapper

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

akbari-ali commented 3 years ago

@Ark-kun I was wondering whether any work is ongoing on this issue. We are using ResourceOp to create a Spark job (via the Spark operator) in the pipeline, and it is missing many features (outputs, limited job-status handling, caching control). While we are working around some of these limitations, a more standard approach would be very useful.
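For context, this is the kind of manifest we hand to ResourceOp today: a `SparkApplication` resource from the Spark operator's `sparkoperator.k8s.io` API. All values below are illustrative placeholders, not our production config:

```yaml
# Illustrative SparkApplication manifest passed to ResourceOp.
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: example-spark-job
spec:
  type: Python
  mode: cluster
  image: example.com/spark:3.1.1        # placeholder image
  mainApplicationFile: local:///opt/app/main.py
  sparkVersion: "3.1.1"
  driver:
    cores: 1
    memory: 1g
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: 1g
```

ResourceOp only creates the resource and polls a success condition; it has no component-style outputs or caching control, which is exactly the gap described above.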

GezimSejdiu commented 3 years ago

Hi team,

I also see the benefit of streamlining ResourceOp via ContainerOp or similar approaches (e.g. bringing some of the features BaseOp has to ResourceOp as well), so that jobs that must be rendered via YAML (e.g. Spark jobs) benefit from these features. One important feature is caching: it is enabled by default and apparently can't be disabled via set_caching_options, even when set to False -- at least I don't see it taking effect. Can someone confirm this? When doing:

spark_job_op.set_caching_options(False)

it still uses the cached result from the previous run.

Best regards,

GezimSejdiu commented 1 month ago

Hey there,

Any updates on this? I know the Kubeflow community has worked a lot on the Spark-on-Kubernetes integration (https://www.kubeflow.org/docs/components/spark-operator/overview/), to the point where it can be used closely with Kubeflow. Are there any plans to make it a first-class citizen of Kubeflow Pipelines instead of relying on ResourceOp or similar approaches? Any functionality that wraps these would be beneficial.

Thanks a lot for the great work.