kubeflow / pipelines

Machine Learning Pipelines for Kubeflow
https://www.kubeflow.org/docs/components/pipelines/
Apache License 2.0

[feature] Reintroduce add_resource_limit to kfp v2 SDK or another resource requests/limits mechanism #10996

Open dariuszg-gc opened 4 months ago

dariuszg-gc commented 4 months ago

Feature Area

/area sdk

What feature would you like to see?

Reintroduce add_resource_limit and add_resource_request from the v1 SDK into the v2 SDK.

https://github.com/kubeflow/pipelines/blob/38ef986eaa00e8e8e634a17a7837111b6380685a/sdk/python/kfp/deprecated/dsl/_container_op.py#L268

https://github.com/kubeflow/pipelines/blob/38ef986eaa00e8e8e634a17a7837111b6380685a/sdk/python/kfp/deprecated/dsl/_container_op.py#L281

Alternatively, provide another mechanism that allows flexible specification of multiple resource requests/limits.

What is the use case or pain point?

Currently, the only way to set a resource limit in v2 is through the set_accelerator_type and set_accelerator_limit methods (intended for GPU/TPU devices), and they allow only a single resource to be specified.

Several use cases require multiple DevicePlugin/DRA-managed devices (e.g., GPU + RDMA RNIC), so a mechanism is needed that allows adding multiple resource limits/requests to the pod spec.
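
To make the request concrete, here is a minimal, standalone sketch of the kind of interface being asked for. It does not use kfp at all: `ContainerResources` and `to_pod_spec_fragment` are hypothetical names invented for illustration, and `example.com/rnic` is a placeholder resource name standing in for whatever an RDMA device plugin would advertise. Only the method names `add_resource_limit` and `add_resource_request` mirror the v1 methods linked above.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ContainerResources:
    """Hypothetical holder for per-container resources (not a kfp class)."""

    limits: Dict[str, str] = field(default_factory=dict)
    requests: Dict[str, str] = field(default_factory=dict)

    def add_resource_limit(self, resource_name: str, value: str) -> "ContainerResources":
        # Each call adds (or overwrites) one entry, so any number of
        # distinct resources can be accumulated on the same container.
        self.limits[resource_name] = str(value)
        return self

    def add_resource_request(self, resource_name: str, value: str) -> "ContainerResources":
        self.requests[resource_name] = str(value)
        return self

    def to_pod_spec_fragment(self) -> Dict[str, Dict[str, str]]:
        # Shape follows the Kubernetes container `resources` field;
        # empty sections are omitted.
        fragment: Dict[str, Dict[str, str]] = {}
        if self.limits:
            fragment["limits"] = dict(self.limits)
        if self.requests:
            fragment["requests"] = dict(self.requests)
        return fragment


# Usage: a GPU plus a second device-plugin resource on the same container.
resources = (
    ContainerResources()
    .add_resource_limit("nvidia.com/gpu", "1")
    .add_resource_limit("example.com/rnic", "1")  # placeholder resource name
    .add_resource_request("cpu", "4")
)
fragment = resources.to_pod_spec_fragment()
print(fragment)
```

The point of the sketch is only that limits and requests are keyed by resource name, so a name/value method pair can express several devices at once, which a single accelerator type plus a single accelerator limit cannot.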

Is there a workaround currently?

None in v2, other than falling back to the deprecated v1 SDK.



github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

dariuszg-gc commented 1 month ago

/remove-lifecycle stale