lucidrains / x-transformers

A concise but complete full-attention transformer with a set of promising experimental features from various papers
MIT License

Idea: hyperparameter searching #181

Open pfeatherstone opened 1 year ago

pfeatherstone commented 1 year ago

This library offers so many useful parameters for tweaking your architecture. However, although @lucidrains offers insights from the papers and from experience, what works and what doesn't ultimately depends on your data and your compute. Figuring out how to tune a model is a bit daunting. Rather than searching manually and blindly, a hyperparameter search algorithm might be a good idea, for example a genetic algorithm or something similar.
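To make the genetic-algorithm idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the search space keys are merely styled after x-transformers keyword arguments, the helper names are hypothetical, and `placeholder_loss` is a stand-in for a real objective such as a short training run that returns validation loss.

```python
import random

# Hypothetical search space. The keys are styled after x-transformers keyword
# arguments, but the names and choices are illustrative assumptions.
SEARCH_SPACE = {
    "depth": [2, 4, 6, 8],
    "heads": [4, 8, 16],
    "ff_glu": [False, True],
    "rotary_pos_emb": [False, True],
}

def random_config(rng):
    """Sample one configuration uniformly from the search space."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}

def mutate(config, rng, rate=0.25):
    """Resample each hyperparameter independently with probability `rate`."""
    return {
        name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
        for name, value in config.items()
    }

def crossover(a, b, rng):
    """Build a child by picking each hyperparameter from one of two parents."""
    return {name: rng.choice((a[name], b[name])) for name in SEARCH_SPACE}

def placeholder_loss(config):
    """Stand-in objective (an assumption, not a real measurement).

    In practice this would build a model from `config`, train it briefly,
    and return the validation loss.
    """
    loss = 0.1 * abs(config["depth"] - 6) + 0.01 * abs(config["heads"] - 8)
    loss += 0.0 if config["ff_glu"] else 0.05
    loss += 0.0 if config["rotary_pos_emb"] else 0.05
    return loss

def genetic_search(evaluate, population_size=12, generations=10, elite=4, seed=0):
    """Minimise `evaluate` with elitist selection, crossover and mutation."""
    rng = random.Random(seed)
    cache = {}

    def score(config):
        # Cache results so an expensive objective runs once per configuration.
        key = tuple(sorted(config.items()))
        if key not in cache:
            cache[key] = evaluate(config)
        return cache[key]

    population = [random_config(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=score)
        parents = population[:elite]  # the best configurations survive unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - elite)
        ]
        population = parents + children
    return min(population, key=score)

best = genetic_search(placeholder_loss)
print(best, placeholder_loss(best))
```

With a real objective, each evaluation is a training run, so the population size and generation count would have to be budgeted against available compute. The cache only helps when the same configuration reappears.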

pfeatherstone commented 1 year ago

Or maybe use something like https://github.com/optuna/optuna

pfeatherstone commented 1 year ago

This probably doesn't need to be added to the library, but an example snippet or Jupyter notebook would be cool. I might try it at some point. If I have some success, I'll submit a PR.