OpenAccess-AI-Collective / axolotl

Go ahead and axolotl questions
https://openaccess-ai-collective.github.io/axolotl/
Apache License 2.0

Hyperparameter optimization CLI #1356

Open casper-hansen opened 4 months ago

casper-hansen commented 4 months ago

⚠️ Please check that this feature request hasn't been suggested before.

🔖 Feature description

Create a new CLI that lets users run hyperparameter optimization to find the best-performing training configuration.

✔️ Solution

Using Optuna could be a solution. https://github.com/optuna/optuna
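For illustration, a minimal sketch of what an Optuna-driven search loop could look like. The `train_and_eval` helper is a hypothetical placeholder for launching an axolotl training run and returning its eval loss; it is not part of axolotl.

```python
import optuna


def train_and_eval(learning_rate: float) -> float:
    # Hypothetical stand-in for running one axolotl training job with the
    # sampled hyperparameters and returning the final eval loss. A toy
    # surrogate is used here so the sketch actually runs.
    return (learning_rate - 2e-3) ** 2


def objective(trial: optuna.Trial) -> float:
    # Sample a learning rate on a log scale and score it.
    lr = trial.suggest_float("learning_rate", 5e-4, 1e-2, log=True)
    return train_and_eval(lr)


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```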

❓ Alternatives

No response

📝 Additional Context

No response

Acknowledgements

ehartford commented 4 months ago

what would this look like in axolotl, specifically? new module? arguments? what would the output look like?

casper-hansen commented 4 months ago

@ehartford there are generally three things you can optimize:

  1. learning rate
  2. batch size
  3. scheduler

You would specify an axolotl config, plus extra arguments defining the search space to optimize within. For example, you could give a range of 0.0005 to 0.01 for the learning rate (see the sketch below).
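As a hedged sketch, the three dimensions above could be expressed as an Optuna search space roughly like this. The parameter names, ranges, and choices are purely illustrative and would presumably come from user-supplied CLI arguments; none of this is an existing axolotl interface.

```python
import optuna


def sample_space(trial: optuna.Trial) -> dict:
    # Example search space covering learning rate, batch size, and scheduler.
    # Values shown here are illustrative defaults, not axolotl settings.
    return {
        "learning_rate": trial.suggest_float("learning_rate", 5e-4, 1e-2, log=True),
        "micro_batch_size": trial.suggest_categorical("micro_batch_size", [1, 2, 4, 8]),
        "lr_scheduler": trial.suggest_categorical(
            "lr_scheduler", ["cosine", "linear", "constant"]
        ),
    }
```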

A good reference: https://shengdinghu.notion.site/MiniCPM-Unveiling-the-Potential-of-End-side-Large-Language-Models-d4d3a8c426424654a4e80e42a711cb20

ehartford commented 4 months ago

Yes - I was asking how you imagine the feature fitting into axolotl's architecture and workflow

ehartford commented 4 months ago

Let's imagine that Axolotl had the feature.

What would the command / arguments be to invoke it? Would it happen in the course of training or as a separately launched run? Would it use arguments or a config file? What changes would be needed to the config file schema?

etiennebonnafoux commented 3 months ago

Hi everybody, I guess there should be a flag 'use_optuna'. If it is set to True, you could put a range for the learning rate (instead of a single value) and a list for the learning rate scheduler, so everything would live in the config file.
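A rough sketch of how such config-declared ranges and lists might be turned into Optuna suggestions. The `use_optuna`, `learning_rate`, and `lr_scheduler` keys follow the idea in the comment above; none of this exists in axolotl today.

```python
import optuna

# Hypothetical excerpt of a parsed axolotl config where a range or a list
# marks a value to be searched rather than fixed.
cfg = {
    "use_optuna": True,
    "learning_rate": [5e-4, 1e-2],         # range -> search this interval
    "lr_scheduler": ["cosine", "linear"],   # list -> pick one of these
}


def suggest_from_config(trial: optuna.Trial, cfg: dict) -> dict:
    # Resolve searched entries into concrete values for one trial.
    resolved = dict(cfg)
    resolved["learning_rate"] = trial.suggest_float(
        "learning_rate", cfg["learning_rate"][0], cfg["learning_rate"][1], log=True
    )
    resolved["lr_scheduler"] = trial.suggest_categorical(
        "lr_scheduler", cfg["lr_scheduler"]
    )
    return resolved
```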