kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.59k stars 1.62k forks

[Proposal] KFP Components Repository #3680

Closed dalxds closed 6 months ago

dalxds commented 4 years ago

Since each Component is encapsulated with predefined inputs and outputs (as described in the docs), a central repository for KFP components could be created (npm-like), thus enabling components sharing and reusability amongst KFP users. Components could be then added to the pipeline using CLI and be configured (i.e. parameters) using YAML.
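
To make the npm-like idea concrete, here is a minimal sketch of what a registry lookup might look like. Everything in it is hypothetical: the index layout, the `name@version` syntax, the `0.1.0` version label, and the `resolve` helper are illustrative assumptions, not an existing KFP feature. The only real value is the component URL, which is the one cited in this thread.

```python
# Hypothetical sketch of an npm-like component index; not an existing KFP API.
from typing import Dict

# Assumed index layout: component name -> version -> pinned component.yaml URL.
# The version label "0.1.0" is made up for illustration.
INDEX: Dict[str, Dict[str, str]] = {
    "automl-create-model-for-tables": {
        "0.1.0": (
            "https://raw.githubusercontent.com/kubeflow/pipelines/"
            "b3179d86b239a08bf4884b50dbf3a9151da96d66/components/gcp/automl/"
            "create_model_for_tables/component.yaml"
        ),
    },
}


def resolve(spec: str, index: Dict[str, Dict[str, str]] = INDEX) -> str:
    """Resolve a 'name@version' spec to the URL of its component.yaml."""
    name, sep, version = spec.partition("@")
    if not sep or not version:
        raise ValueError(f"expected 'name@version', got {spec!r}")
    try:
        return index[name][version]
    except KeyError:
        raise KeyError(f"unknown component {spec!r}") from None


url = resolve("automl-create-model-for-tables@0.1.0")
print(url)
```

A CLI "add" command could then be a thin wrapper that resolves the spec and records the resulting URL in the pipeline definition.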

Ark-kun commented 4 years ago

> a central repository for KFP components could be created

Having a central component discovery solution is a good idea.

The users can already upload the components to the Google AI-Hub, but the experience can be improved.

> components sharing and reusability amongst KFP users

It seems to me that the components can already be shared using different repositories and then loaded to use in the pipeline. Is there something missing?

> Components could be then added to the pipeline using CLI

What is the proposed experience? What's the main difference compared to just adding a load_component_from_url line?

automl_create_model_for_tables_op = load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/b3179d86b239a08bf4884b50dbf3a9151da96d66/components/gcp/automl/create_model_for_tables/component.yaml')
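
Part of what makes this one-line load shareable is that the URL pins an exact commit, so everyone who loads it gets the same component definition. A minimal sketch of how such a URL is put together follows; `pinned_component_url` is a hypothetical helper written for illustration, not part of the KFP SDK, and the commented-out import assumes the SDK is installed.

```python
# Sketch: build a commit-pinned raw.githubusercontent.com URL for a component.
# pinned_component_url is a hypothetical helper, not part of the KFP SDK.


def pinned_component_url(repo: str, commit: str, path: str) -> str:
    """Return the raw-content URL of `path` in `repo` at an exact commit."""
    return f"https://raw.githubusercontent.com/{repo}/{commit}/{path.lstrip('/')}"


url = pinned_component_url(
    repo="kubeflow/pipelines",
    commit="b3179d86b239a08bf4884b50dbf3a9151da96d66",
    path="components/gcp/automl/create_model_for_tables/component.yaml",
)
print(url)

# With the KFP SDK installed, the URL can then be loaded as in the thread:
#   from kfp.components import load_component_from_url
#   automl_create_model_for_tables_op = load_component_from_url(url)
```

Pinning to a commit SHA rather than a branch name keeps the pipeline reproducible even if the component file changes later.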

> configured (i.e. parameters) using YAML

What is the proposed experience? What's the main difference compared to passing arguments to the component in a Python pipeline, or creating and using a graph component?

https://github.com/kubeflow/pipelines/blob/03da0a2cce3f468c9fd458121f32e6071fcb10ec/sdk/python/tests/components/test_data/retail_product_stockout_prediction_pipeline.component.yaml#L72

dalxds commented 4 years ago

> The users can already upload the components to the Google AI-Hub, but the experience can be improved.
>
> It seems to me that the components can already be shared using different repositories and then loaded to use in the pipeline. Is there something missing?

From what I understand from the docs, you can't upload to AI-Hub publicly; you can only share with colleagues in your organisation. What I'm proposing is a public repository for ML functionality written as KFP Components (including models, data preprocessing, serving, etc.), so that someone can find and combine many Components to create a pipeline in an easy, user-friendly way.

> Components could be then added to the pipeline using CLI
>
> What is the proposed experience? What's the main difference compared to just adding a load_component_from_url line?
>
> automl_create_model_for_tables_op = load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/b3179d86b239a08bf4884b50dbf3a9151da96d66/components/gcp/automl/create_model_for_tables/component.yaml')

No difference. This would work great.

> configured (i.e. parameters) using YAML
>
> What is the proposed experience? What's the main difference compared to pass arguments to the component in a python pipeline or creating and using a graph component?
>
> https://github.com/kubeflow/pipelines/blob/03da0a2cce3f468c9fd458121f32e6071fcb10ec/sdk/python/tests/components/test_data/retail_product_stockout_prediction_pipeline.component.yaml#L72

What I had in mind is for someone to be able to create a pipeline without having to write ML code, only by adding Components from the repository and providing the parameters. Maybe a graph component could be auto-generated when adding Components?
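
One way to picture the auto-generation idea is a small function that turns a list of chosen components and their parameters into a graph component definition. The sketch below is only an illustration under stated assumptions: `build_graph_component` and the step layout (`id`, `url`, `params`) are hypothetical, the `display_name` parameter is illustrative, and the field names (`componentRef`, `arguments`, `graphInput`) follow my reading of the graph component format, so they should be checked against the example file linked above. It emits JSON, which is also valid YAML, to stay within the standard library.

```python
# Sketch: auto-generate a graph component spec from an ordered list of steps.
# build_graph_component is hypothetical; the field names are assumed from the
# graph component format and should be verified against a real example.
import json
from typing import Any, Dict, List


def build_graph_component(name: str, steps: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Turn [{'id', 'url', 'params'}, ...] into a graph component dict.

    Every user-supplied parameter becomes a graph-level input named
    '<task id>-<parameter>' whose default is the supplied value, so the
    pipeline can be configured without writing any code.
    """
    inputs: List[Dict[str, str]] = []
    tasks: Dict[str, Any] = {}
    for step in steps:
        arguments: Dict[str, Any] = {}
        for param, value in step.get("params", {}).items():
            input_name = f"{step['id']}-{param}"
            inputs.append({"name": input_name, "default": str(value)})
            arguments[param] = {"graphInput": {"inputName": input_name}}
        tasks[step["id"]] = {
            "componentRef": {"url": step["url"]},
            "arguments": arguments,
        }
    return {
        "name": name,
        "inputs": inputs,
        "implementation": {"graph": {"tasks": tasks}},
    }


spec = build_graph_component(
    "demo-pipeline",
    [
        {
            "id": "create-model",
            "url": (
                "https://raw.githubusercontent.com/kubeflow/pipelines/"
                "b3179d86b239a08bf4884b50dbf3a9151da96d66/components/gcp/automl/"
                "create_model_for_tables/component.yaml"
            ),
            # Parameter name is illustrative, not checked against the component.
            "params": {"display_name": "demo"},
        }
    ],
)
print(json.dumps(spec, indent=2))
```

This sketch only wires user parameters to graph inputs; connecting one task's output to another task's input would need an additional convention that the proposal does not yet spell out.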

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 4 years ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

Bobgy commented 4 years ago

/reopen
/lifecycle frozen

I think we want to keep this open

k8s-ci-robot commented 4 years ago

@Bobgy: Reopened this issue.

Ark-kun commented 4 years ago

> What I had in mind is for someone to be able to create a pipeline without having to write ML code, only by adding Components from the repository and providing the parameters. Maybe a graph component could be auto-generated when adding Components?

This sounds intriguing. Can you describe the proposed experience step by step?

  1. User starts with empty pipeline
  2. User does X and Y happens ...

rimolive commented 6 months ago

Closing this issue. No activity for more than a year.

/close

google-oss-prow[bot] commented 6 months ago

@rimolive: Closing this issue.
