kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0
3.62k stars 1.63k forks

[feature] Add ability to parameterize container images #11391

Open HumairAK opened 2 days ago

HumairAK commented 2 days ago

Feature Area

/area sdk

What feature would you like to see?

As a user of the KFP SDK, I want to be able to provide container images as pipeline parameters.

What is the use case or pain point?

There are a number of use cases for this. One example is requiring a specific image for a specific accelerator type. Having to create multiple pipelines that differ only in their container image is redundant.

Is there a workaround currently?

No

Also note this is a regression from v1.

This was discussed previously in https://github.com/kubeflow/pipelines/issues/5834, where it was decided not to add this for v2 due to technical hurdles and concerns about breaking the component interface. Let's re-evaluate.



HumairAK commented 1 day ago

I'm thinking we do something like this:

from kfp import dsl

@dsl.component(base_image="docker.io/python:3.9.17")
def empty_component():
    pass

@dsl.pipeline(name='pipeline-accel')
def pipeline_accel(img: str):
    task = empty_component()
    # Proposed API: override base_image="docker.io/python:3.9.17" with the pipeline parameter
    task.set_container_image(img)

This would require us to treat the container spec image as a pipeline channel.

It looks like we can follow the same pattern we already use for setting accelerators: https://github.com/kubeflow/pipelines/blob/64e390069d6c60c97ea03e833529a0930398620f/sdk/python/kfp/compiler/pipeline_spec_builder.py#L618

This seems pretty straightforward.
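
To make the "image as a pipeline channel" idea concrete, here is a rough, self-contained sketch of what the compile-time and run-time halves might look like. It does not use the real KFP classes: `PipelineParam`, `image_field`, and `resolve_image` are hypothetical names made up for illustration, the placeholder syntax is an assumption about what the compiler would emit, and `example.com/cuda-python:latest` is a made-up image name. The actual change would live in `pipeline_spec_builder.py` and the backend driver.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class PipelineParam:
    """Hypothetical stand-in for a pipeline input channel (not the KFP class)."""
    name: str


def _placeholder(name: str) -> str:
    # Assumed placeholder syntax for a pipeline input parameter.
    return "{{$.inputs.parameters['" + name + "']}}"


def image_field(image: Union[str, PipelineParam]) -> str:
    """Compile time: return the string to store in the container spec's image field."""
    if isinstance(image, PipelineParam):
        # A channel cannot be resolved yet, so store a placeholder instead.
        return _placeholder(image.name)
    # A literal image string is stored unchanged.
    return image


def resolve_image(field: str, runtime_params: Dict[str, str]) -> str:
    """Run time: substitute parameter values into the stored image field."""
    for name, value in runtime_params.items():
        field = field.replace(_placeholder(name), value)
    return field


static_field = image_field("docker.io/python:3.9.17")
dynamic_field = image_field(PipelineParam("img"))
# Hypothetical image name supplied as a run parameter.
resolved = resolve_image(dynamic_field, {"img": "example.com/cuda-python:latest"})
```

The point of the sketch is that a literal `base_image` passes through untouched, so existing components keep working, while a parameterized image is only known once the run's inputs are available.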