Closed: SaschaHeyer closed this issue 2 days ago
cc @chensun
Morning, any updates?
Hi @SaschaHeyer, this is indeed a known limitation, and we plan to discuss the best solution for this in Q1/Q2 2022.
Can you help us understand your use case for setting a dynamic value for the CPU limit, and how critical this feature is to you? Thanks!
Hi @chensun Thanks a lot for your feedback.
I work for one of the biggest Google Cloud partners, and we get this request from our customers regularly, at least once every two weeks. Parameterizing the machine type (CPU and memory) is really useful when you use the same pipeline for different datasets and/or hyperparameters (this way there is no need to re-compile).
Changing those hyperparameters can also require bigger machines, for example if you increase the batch size.
Currently, a re-compile of the pipeline is required. It would be useful if we could set this via a parameter as well.
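To illustrate the current situation in plain Python (this is a stand-in, not the real KFP SDK; `build_pipeline_spec` is a hypothetical placeholder that only mimics the shape of a compiled spec): because the CPU limit is fixed at compile time, every resource configuration needs its own compile step.

```python
# Hypothetical stand-in for the re-compile workaround: the CPU limit is
# baked into the compiled pipeline spec, so each limit needs a separate
# compile. build_pipeline_spec is NOT a KFP API; it only imitates the
# output of kfp.v2.compiler.Compiler().compile for illustration.
def build_pipeline_spec(cpu_limit: str, memory_limit: str) -> dict:
    return {
        "tasks": {
            "train": {
                "resources": {
                    "cpuLimit": cpu_limit,        # fixed at compile time
                    "memoryLimit": memory_limit,  # fixed at compile time
                }
            }
        }
    }

# Two dataset sizes -> two separate compiles of the "same" pipeline.
small = build_pipeline_spec("2", "8G")
large = build_pipeline_spec("8", "32G")
```

With parameterized limits, both runs could share a single compiled spec and only the runtime parameter values would differ.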
Along the same line, it would be nice if, when a task throws a KFP error for being out of memory: a) you can adjust the memory limit as a parameter, as @SaschaHeyer requested, re-running just that task and not the whole pipeline (though if caching is enabled this may already be solved); b) it upscales automatically and re-runs the task.
@SaschaHeyer Thanks for the context!
> it would be nice if, when a task throws a KFP error for being out of memory, a) you can adjust the memory limit as a parameter, re-running just that task and not the whole pipeline (though if caching is enabled this may already be solved)
Yes, caching would help here if the upstream tasks don't have any changes to their inputs.
> b) it upscales automatically and re-runs the task
This might create some surprise billing issue :)
> This might create some surprise billing issue :)
Yep, if implemented, there should be an autoscale: bool = False argument in kfp.v2.compiler.Compiler().compile.
But I agree that option b), auto-scaling, could have some dramatic consequences for the user in terms of money that option a) doesn't have.
Huge +1 on this!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Are there plans to support this? Or is it explored in another ticket?
Hi are there any updates? This would be a huge benefit for re-using pipelines without the need to re-compile them.
+++
I agree with @SaschaHeyer. We are building reusable pipeline templates where only the data changes; depending on the data size, we want to be able to configure the CPU and memory for each of the components through pipeline params or some other way.
Hi guys, do we have any updates on this? I am also looking for exactly the same dynamic parameterisation of my pipeline.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.
@entsarangi: You can't reopen an issue/PR unless you authored it or you are a collaborator.
It would be useful to have cpu_limit available via pipeline_params. Any update, or a workaround that doesn't involve hard-coded values?
Hello Kubeflow Team, Hello Google Team,
The ContainerOp .set_cpu_limit only works when the value is set explicitly, not via parameter_values or at runtime: https://github.com/kubeflow/pipelines/blob/4906ab2f1142043517249a62b9f22bc122971fdf/sdk/python/kfp/dsl/_container_op.py#L378
Reproduce
Environment
Steps to reproduce
Not working
Working
Expected result
The CPU limits can be set via parameter_values
Looking forward to your feedback.
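For anyone hitting this, the behavior can be mimicked with a small stand-in (assumed behavior only; validate_cpu_limit is hypothetical and only approximates the check around _container_op.py#L378, which accepts literal Kubernetes CPU quantities but not runtime placeholders):

```python
import re

# Hypothetical stand-in for the SDK-side validation: a concrete Kubernetes
# CPU quantity such as "4", "0.5", or "500m" passes, while a runtime
# placeholder string would be rejected before the pipeline ever runs.
# This regex is an approximation for illustration, not the actual KFP code.
def validate_cpu_limit(value: str) -> str:
    if not re.fullmatch(r"\d+(\.\d+)?m?", value):
        raise ValueError(f"Invalid cpu limit: {value!r}")
    return value

validate_cpu_limit("4")     # literal value: accepted
validate_cpu_limit("500m")  # millicpu literal: accepted
```

Because the check runs at compile time against the literal string, a pipeline parameter never reaches it as a concrete value, which is why only hard-coded limits work today.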