kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[feature] Allow apache-beam version greater than 2.50.0 on dataflow component (Vertex AI) #11017

Closed caetano-colin closed 1 month ago

caetano-colin commented 4 months ago

Feature Area

/area sdk /area components

What feature would you like to see?

Support for more recent apache-beam versions in the Google Cloud Dataflow component (https://cloud.google.com/vertex-ai/docs/pipelines/dataflow-component).

What is the use case or pain point?

Currently, the google cloud pipeline component uses apache-beam 2.50.0, which Google Cloud Dataflow will deprecate on August 30, 2024, and which has known issues (https://cloud.google.com/dataflow/docs/support/sdk-version-support-status).

The Dockerfile for the image gcr.io/ml-pipeline/google-cloud-pipeline-components:2.15.0 appears to be: https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/Dockerfile#L38

Is there a workaround currently?

DataflowPythonJobOp does not seem to have a field for specifying a custom image.

There is a field for passing a requirements.txt file, which would probably work if the container running it has network access. However, in secure or isolated environments, where Docker images must be built in advance, the container cannot reach the PyPI repository and therefore cannot download the packages listed in that file. In that case, the user has no choice but to use version 2.50.0.
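
For reference, here is a minimal sketch of what the requirements.txt route could look like in an environment that does have network access. The helper names (`parse_version`, `beam_requirements`), the constant `BUNDLED_BEAM_VERSION`, and the example version 2.57.0 are my own assumptions for illustration and are not part of the component. The version parser only handles plain numeric versions such as "2.57.0".

```python
from __future__ import annotations

# Version that, per this issue, ships in the component image.
BUNDLED_BEAM_VERSION = "2.50.0"


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version like '2.57.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def beam_requirements(version: str) -> str:
    """Return requirements.txt content pinning a newer apache-beam release.

    Hypothetical helper: raises ValueError unless the requested version is
    strictly newer than the bundled one.
    """
    if parse_version(version) <= parse_version(BUNDLED_BEAM_VERSION):
        raise ValueError(
            f"{version} is not newer than bundled {BUNDLED_BEAM_VERSION}"
        )
    return f"apache-beam[gcp]=={version}\n"


if __name__ == "__main__":
    # Example only: 2.57.0 is an arbitrary newer release chosen for the sketch.
    print(beam_requirements("2.57.0"), end="")
```

The generated file would then have to be uploaded somewhere the job can read it and passed through the requirements field mentioned above. This does nothing for the isolated case, since the pinned package still has to be downloaded at run time.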



rimolive commented 4 months ago

/cc @zijianjoy @chensun @connor-mccarthy @james-jwu

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] commented 1 month ago

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.