kubeflow / katib

Automated Machine Learning on Kubernetes
https://www.kubeflow.org/docs/components/katib
Apache License 2.0

[GSoC] Create LLM Hyperparameters Optimization API Proposal #2333

Open helenxie-bit opened 1 month ago

helenxie-bit commented 1 month ago

What this PR does / why we need it: Lets users tune the hyperparameters of LLMs through simple Python SDK APIs.

Which issue(s) this PR fixes _(optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged)_: Fixes #

google-oss-prow[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

Once this PR has been reviewed and has the lgtm label, please assign johnugeorge for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:
- **[OWNERS](https://github.com/kubeflow/katib/blob/master/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

andreyvelich commented 1 month ago

/area gsoc

andreyvelich commented 1 month ago

Ref issue: https://github.com/kubeflow/katib/issues/2291

helenxie-bit commented 1 week ago

@andreyvelich I have removed the objective function and modified the API design based on our discussion.

In the updated version, users must provide parameters such as `model_provider_parameters`, `dataset_provider_parameters`, and `trainer_parameters` to the `tune` API. The hyperparameter search space is now defined within `trainer_parameters` for ease of use. The `tune` API will then download the pretrained models and datasets using `storage_initializer`, and create the experiment and trials automatically.
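
To make the shape of that design concrete, here is a minimal, self-contained sketch. It does not call the real Katib SDK: `DoubleRange`, `TuneRequest`, `plan_experiment`, the `hf://` URIs, and the dictionary keys are all hypothetical names chosen for illustration. The only names taken from the proposal are `model_provider_parameters`, `dataset_provider_parameters`, `trainer_parameters`, and `storage_initializer`. It shows one way a search space could be embedded in `trainer_parameters` and then separated from the fixed values.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class DoubleRange:
    """Hypothetical marker: tune this value as a float in [min, max]."""
    min: float
    max: float


@dataclass
class TuneRequest:
    """Hypothetical container for the three argument groups named in the proposal."""
    model_provider_parameters: Dict[str, Any]
    dataset_provider_parameters: Dict[str, Any]
    trainer_parameters: Dict[str, Any]


def split_trainer_parameters(
    trainer_parameters: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, DoubleRange]]:
    """Separate fixed trainer values from entries that describe a search space."""
    fixed = {k: v for k, v in trainer_parameters.items() if not isinstance(v, DoubleRange)}
    search = {k: v for k, v in trainer_parameters.items() if isinstance(v, DoubleRange)}
    return fixed, search


def plan_experiment(request: TuneRequest) -> Dict[str, Any]:
    """Build a plain-dict plan; a real implementation would create Kubernetes resources."""
    fixed, search = split_trainer_parameters(request.trainer_parameters)
    if not search:
        raise ValueError("trainer_parameters must define at least one hyperparameter to tune")
    return {
        # What the storage initializer would be asked to download.
        "storage_initializer": {
            "model": request.model_provider_parameters,
            "dataset": request.dataset_provider_parameters,
        },
        "fixed_trainer_args": fixed,
        "search_space": {name: (r.min, r.max) for name, r in search.items()},
    }


request = TuneRequest(
    model_provider_parameters={"model_uri": "hf://example-org/example-model"},  # hypothetical URI
    dataset_provider_parameters={"repo_id": "example-org/example-dataset"},     # hypothetical ID
    trainer_parameters={
        "num_train_epochs": 1,                      # fixed value
        "learning_rate": DoubleRange(1e-5, 1e-3),   # value to be tuned
    },
)
plan = plan_experiment(request)
print(plan["search_space"])
```

Keeping the search space inside `trainer_parameters` means the user declares each trainer argument once, either as a fixed value or as a range, instead of maintaining a separate parameter list alongside an objective function.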

Please review the changes and let me know if you have any feedback or suggestions!

johnugeorge commented 10 hours ago

Other than making the naming consistent, it looks good.