Evovest / EvoTrees.jl

Boosted trees in Julia
https://evovest.github.io/EvoTrees.jl/dev/
Apache License 2.0

Feature/Tutorial Request: Hyperparameter tuning #257

Open ParadaCarleton opened 1 year ago

ParadaCarleton commented 1 year ago

Grad student descent is definitely not fun, so it would be very nice to have a way to tune hyperparameters efficiently, and a tutorial on how to do this. (MLJTuning.jl lets you do it in theory, but only provides a handful of black-box optimizers like random or grid search.)
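
For reference, what MLJTuning offers today looks roughly like this (a sketch, untested; ranges and measure chosen arbitrarily for illustration):

```julia
using MLJ
EvoTreeRegressor = @load EvoTreeRegressor pkg=EvoTrees

X, y = make_regression(500, 10)  # synthetic data, illustration only
model = EvoTreeRegressor()

# log-scaled range for the continuous learning rate, linear range for depth
r_eta   = range(model, :eta, lower=1e-3, upper=0.3, scale=:log)
r_depth = range(model, :max_depth, lower=3, upper=9)

tuned = TunedModel(model=model, tuning=RandomSearch(), n=50,
                   resampling=CV(nfolds=5), range=[r_eta, r_depth],
                   measure=rmse)
mach = machine(tuned, X, y)
fit!(mach)
```

It works, but it treats the objective as a pure black box.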

jeremiedb commented 12 months ago

Are there specific hyper tuning methods you'd like to see covered? With regard to a demonstration with the internal EvoTrees API, I'd tend to recommend a simple random search. For more specialized tuning techniques, I'd tend to favor developing them in a mostly algo-agnostic way; MLJ seems like a good target in that regard. Did you have reasons in mind to build more elaborate hyper tuning within a specific algo?
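
For concreteness, a simple random search against the internal API could look roughly like this (a sketch on synthetic data; `fit_evotree` keywords as per the current docs, adjust to taste):

```julia
using EvoTrees, Random

Random.seed!(123)
x = randn(1_000, 5)
y = x[:, 1] .+ 0.5 .* randn(1_000)
x_train, y_train = x[1:800, :], y[1:800]
x_eval,  y_eval  = x[801:end, :], y[801:end]

function random_search(; n_iter = 50)
    best_score, best_config = Inf, nothing
    for _ in 1:n_iter
        config = EvoTreeRegressor(
            nrounds   = 500,
            eta       = exp10(-3 + 3 * rand()),   # log-uniform on [1e-3, 1]
            max_depth = rand(3:9),
            rowsample = 0.5 + 0.5 * rand(),
        )
        model = fit_evotree(config; x_train, y_train, x_eval, y_eval,
                            metric = :mse, early_stopping_rounds = 20)
        score = sum(abs2, y_eval .- model(x_eval)) / length(y_eval)
        if score < best_score
            best_score, best_config = score, config
        end
    end
    return best_config, best_score
end

best_config, best_score = random_search()
```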

ParadaCarleton commented 12 months ago

> Are there specific hyper tuning methods you'd like to see covered?

Mostly just a gradient method for the continuous parameters. Grid search should be fine for the discrete hyperparameters, given there are only one or two.
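
For the discrete part, something as simple as the sketch below would do (`evaluate_config` is a hypothetical stand-in for whatever fit-and-score helper is in play):

```julia
# exhaustive grid over the one or two discrete hyper-parameters
grid = Iterators.product(3:2:9, (1.0, 10.0, 100.0))   # max_depth × min_weight
scores = [(depth, mw) => evaluate_config(max_depth = depth, min_weight = mw)
          for (depth, mw) in grid]
best_pair = argmin(last, scores)  # pair with the lowest eval score
```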

jeremiedb commented 11 months ago

Could you clarify the nature of the hyper search you're envisioning? I'm not clear on how a gradient method could be applied here, as an EvoTrees loss function isn't differentiable with respect to its hyper-parameters. Perhaps you're referring to applying a gradient method to eval metric outcomes to inform the next hyper candidate to test? Other than random search, my understanding is that Bayesian search may be the other most useful approach, but I may well have blind spots in my picture of the hyper-tuning landscape.
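
For instance, the kind of metric-informed search I have in mind is what Hyperopt.jl's GP sampler does: fit a surrogate to the eval-metric outcomes observed so far and use it to propose the next candidate (a sketch following its README; `val_metric` is a hypothetical train-then-score helper):

```julia
using Hyperopt  # third-party package; GPSampler per Hyperopt.jl's README

ho = @hyperopt for i = 30,
        sampler   = GPSampler(Min),                # Gaussian-process surrogate, minimizing
        eta       = exp10.(LinRange(-3, 0, 100)),  # candidate learning rates
        max_depth = 3:9                            # candidate depths
    val_metric(eta, max_depth)                     # out-of-sample score to minimize
end
ho.minimizer, ho.minimum  # best (eta, max_depth) and its score
```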

ParadaCarleton commented 11 months ago

Whoops, this is supposed to be in EvoLinear.jl :sweat_smile:

(Although, I thought the loss was differentiable with respect to lambda? But I might be mixing that up with some other decision tree algorithm.)

jeremiedb commented 11 months ago

Even in the context of EvoLinear, I'm not understanding the applicability of a gradient method for hyper-param tuning. Would you have an example (package/paper) of what you're trying to achieve? Hyper-param tuning is typically about finding a hyper-param that leads to better generalisation on an out-of-sample dataset. In that context, I have difficulty seeing how the feedback from the out-of-sample metric could be used to infer an update to the hyper-param. Taking a minimal use case, linear regression with L2 regularization, how would the L2 penalty be updated from the out-of-sample metric?
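
To make the question concrete, here is that minimal setup as a toy sketch (synthetic data, illustration only):

```julia
using LinearAlgebra, Random

Random.seed!(1)
X = randn(200, 10); β = randn(10)
y = X * β .+ 0.5 .* randn(200)
Xtr, ytr = X[1:150, :], y[1:150]
Xva, yva = X[151:end, :], y[151:end]

# closed-form ridge fit as a function of the L2 penalty λ
ridge(λ) = (Xtr'Xtr + λ * I) \ (Xtr'ytr)

# the only feedback available: out-of-sample loss as a function of λ
val_loss(λ) = sum(abs2, yva .- Xva * ridge(λ)) / length(yva)

# the usual black-box answer: scan λ on a log grid
λs = exp10.(range(-3, 3, length = 25))
λ_best = λs[argmin(val_loss.(λs))]

# note: because ridge(λ) has a closed form here, val_loss is smooth in λ,
# so a finite-difference "hypergradient" step is at least well-defined;
# whether anything analogous carries over to trees is exactly my question
dval(λ; h = 1e-4) = (val_loss(λ + h) - val_loss(λ - h)) / (2h)
```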