kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0

Katib Python SDK Specify Volume Mounts #2247

Open bwhartlove opened 6 months ago

bwhartlove commented 6 months ago

/kind feature

Describe the solution you'd like:

I'd like to be able to specify default volume mounts for secrets in each trial pod that runs during an experiment. As far as I can tell from the documentation and my experiments so far, there is no way to achieve this. However, if I am missing something, I would greatly appreciate some clarification. Thanks!

tenzen-y commented 6 months ago

You can specify the volume mounts via create_experiment API.

https://github.com/kubeflow/katib/blob/46173463027e4fd2e604e25d7075b2b31a702049/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py#L78

bwhartlove commented 6 months ago

Thanks for the info! In order to do this, do I need to write a YAML spec? I have not found any examples online about mounting an existing secret into the experiment trial pods. From what I can discern given the documentation, I would need to write a TrialSpec YAML to achieve this. I was hoping to have the ability to simply specify a volume mount directly in the experiment creation function.

tenzen-y commented 6 months ago

> Thanks for the info! In order to do this, do I need to write a YAML spec?

Yes, we need to write the Experiment YAML spec.

> I was hoping to have the ability to simply specify a volume mount directly in the experiment creation function.

I see. It might be useful to be able to specify the volume configuration via the tune API:

https://github.com/kubeflow/katib/blob/46173463027e4fd2e604e25d7075b2b31a702049/sdk/python/v1beta1/kubeflow/katib/api/katib_client.py#L135

@johnugeorge @andreyvelich WDYT?

andreyvelich commented 6 months ago

@bwhartlove You can check this example of how to use the create_experiment() API, which gives you access to the full Experiment and Trial specs: https://github.com/kubeflow/katib/blob/master/examples/v1beta1/sdk/cmaes-and-resume-policies.ipynb

If you need to set a volume mount for your Trials, you can add it to the trial_spec field.
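A minimal sketch of what that could look like, assuming the trial is a `batch/v1` Job written as a plain Python dict (the style used in the linked notebook). The helper name `job_trial_spec_with_secret`, the secret name, the image, and the mount path are hypothetical and only for illustration; the resulting dict would then be used as the `trial_spec` of the trial template passed to `create_experiment()`.

```python
from typing import Any, Dict, List


def job_trial_spec_with_secret(
    image: str,
    command: List[str],
    secret_name: str,
    mount_path: str,
) -> Dict[str, Any]:
    """Build a batch/v1 Job dict whose pod mounts an existing Secret read-only.

    Hypothetical helper for illustration; it is not part of the Katib SDK.
    """
    volume_name = f"{secret_name}-volume"
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "training-container",
                            "image": image,
                            "command": command,
                            # Mount the volume declared below into the container.
                            "volumeMounts": [
                                {
                                    "name": volume_name,
                                    "mountPath": mount_path,
                                    "readOnly": True,
                                }
                            ],
                        }
                    ],
                    # Reference a Secret that already exists in the namespace.
                    "volumes": [
                        {"name": volume_name, "secret": {"secretName": secret_name}}
                    ],
                    "restartPolicy": "Never",
                }
            }
        },
    }


# Example values are assumptions, not taken from the issue.
trial_spec = job_trial_spec_with_secret(
    image="docker.io/example/train:latest",
    command=["python", "train.py", "--lr=${trialParameters.learningRate}"],
    secret_name="my-secret",
    mount_path="/etc/secrets",
)
```

The volume name in `volumes` and in `volumeMounts` must match; that pairing is what ties the Secret to the path inside the container.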

If we want to provide a simple argument in the tune function to add volume mounts, do you have any ideas on how we should do it? It should be very simple for users to understand without prior knowledge of Kubernetes Pod specs. For example, for Trial environment variables, users can set them via a dictionary or as a Kubernetes V1EnvFromSource parameter. cc @droctothorpe
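One possible shape for such an argument, sketched purely as an assumption: a dictionary mapping secret names to mount paths, which the SDK would expand into the Pod's `volumes` and the container's `volumeMounts`. The function name `secret_mounts_to_pod_fields` and the `secret_mounts` argument are hypothetical and do not exist in the Katib SDK.

```python
from typing import Any, Dict, List, Tuple


def secret_mounts_to_pod_fields(
    secret_mounts: Dict[str, str],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Translate {secret_name: mount_path} into Pod volumes and volumeMounts.

    Hypothetical sketch of how a simple user-facing dict could be expanded;
    it is not an existing Katib API.
    """
    volumes: List[Dict[str, Any]] = []
    volume_mounts: List[Dict[str, Any]] = []
    for secret_name, mount_path in secret_mounts.items():
        # One volume per secret; the same name links volume and mount.
        volume_name = f"secret-{secret_name}"
        volumes.append({"name": volume_name, "secret": {"secretName": secret_name}})
        volume_mounts.append(
            {"name": volume_name, "mountPath": mount_path, "readOnly": True}
        )
    return volumes, volume_mounts


# The user only supplies secret names and paths, no Pod spec knowledge needed.
volumes, volume_mounts = secret_mounts_to_pod_fields({"aws-creds": "/etc/aws"})
```

This mirrors the dictionary option that already exists for Trial environment variables: the user states what to mount and where, and the Kubernetes-specific structure is generated for them.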

bwhartlove commented 6 months ago

@andreyvelich Thanks for the reply and the information. I can't say I am familiar enough with the underlying architecture to provide a clear solution to the issue. I just imagined having another parameter to the tune function that allowed you to specify volumes for the trial pods. Kubeflow has an API for this with their pipelines, and it would be nice to have a similar function in Katib.

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

andreyvelich commented 3 months ago

Maybe it's worth reusing a similar storage_config, like we did in the train API. E.g. we could name it trial_storage_config. Any other suggestions are welcome!

Any thoughts @tenzen-y @droctothorpe @johnugeorge @bwhartlove ?

/help /good-first-issue

google-oss-prow[bot] commented 3 months ago

@andreyvelich: This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-good-first-issue command.

In response to [this](https://github.com/kubeflow/katib/issues/2247):

> Maybe it worths to re-use similar `storage_config` [like we did in `train` API.](https://github.com/kubeflow/training-operator/blob/bb8bba00ff0b48de922c523b0d3051f8b2d4ee74/sdk/python/kubeflow/training/api/training_client.py#L102-L106).
> E.g. we can name it as: `trial_storage_config`.
> Any other suggestions are welcome!
>
> Any thoughts @tenzen-y @droctothorpe @johnugeorge @bwhartlove ?
>
> /help
> /good-first-issue

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

andreyvelich commented 3 months ago

/area sdk

Electronic-Waste commented 3 months ago

I can help with this issue :)

Electronic-Waste commented 3 months ago

/assign

Electronic-Waste commented 1 week ago

> Maybe it's worth reusing a similar storage_config, like we did in the train API.

Since the train API always mounts a PVC to the pods and usually allocates 10GiB for datasets and the model, it seems that we can't simply reuse its volume mounting logic. WDYT 👀 @andreyvelich