online-ml / river

🌊 Online machine learning in Python
https://riverml.xyz
BSD 3-Clause "New" or "Revised" License

Hyperparameter Tuning #1352

Open robme-l opened 11 months ago

robme-l commented 11 months ago

I noticed the Hyperparameter Tuning portion of RiverML is marked ToDo. With the exception of SuccessiveHalving, does anyone know of any libraries that are easily compatible with River? I found SpotPython, but it takes a lot of setup to get going, and I ran into issues after trying to use FLAML's ChaCha model. These are realistically the only ones I found; however, aside from RiverML's own library, I am unfamiliar with the others and with hyperparameter tuning in general. Would appreciate any community guidance.

robme-l commented 11 months ago

I am adding on that, given River's well-crafted and easy-to-use nature, perhaps we can integrate with something like Optuna, which I think would cover a lot of River's hyperparameter tuning bases and allow people to get started.

MaxHalford commented 11 months ago

Hey there @robme-l :)

Yep, I need to finish that TODO. Basically, we have all the methods located in the model_selection module. The way hyperparameter tuning is done in River is to run several models with different hyperparameters in parallel. A meta-model decides which model is the best and which ones to train. There are different meta-models: greedy, successive halving, bandits.
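For example, racing a handful of learning rates with successive halving looks roughly like this (a minimal sketch from memory, with TrumpApproval as a placeholder dataset, so double-check the exact signatures against the docs):

```python
from river import datasets, evaluate, linear_model, metrics, model_selection, optim, preprocessing

# One candidate per learning rate; all of them learn online, side by side.
models = [
    preprocessing.StandardScaler() | linear_model.LinearRegression(optimizer=optim.SGD(lr))
    for lr in [1e-4, 1e-3, 1e-2, 1e-1]
]

# The meta-model progressively drops the worst candidates as the budget is spent.
sh = model_selection.SuccessiveHalvingRegressor(
    models,
    metric=metrics.MAE(),
    budget=2000,
    eta=2,
)

# Prequential (test-then-train) evaluation of the whole selection process.
evaluate.progressive_val_score(datasets.TrumpApproval(), sh, metrics.MAE())
```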

The other approach is actually modifying the hyperparameters on the fly. We don't do this in River, but there are two third parties I'm aware of.

First, there's SpotPython. The people behind it are very nice. I actually met them recently, and I know they are working on improving their UX. You can reach out to them if you want to.

Second, there's SSPT, which you can find here. You can ask @smastelini about it.

Sorry for the short answer, I'm rather busy! But feel free to ask more questions.

robme-l commented 11 months ago

@MaxHalford not at all, your feedback is very much appreciated; if there is something I can help with, let me know! As great as SpotPython is, it seems to be made for those who are already well accustomed to hyperparameter tuning beyond the basics (which some of us are still learning, I'm afraid). I took a look at SSPT and it seems promising, but I am not entirely sure how to use it appropriately; maybe @smastelini could shed some light.

Finally, I have looked around and Optuna actually seems like a great library to pair with River for Bayesian, genetic-algorithm, and other hyperparameter optimization techniques. It feels fairly approachable even to those still learning. The only drawback, of course, is that it's not as streamlined as River's beautifully crafted built-in model selection techniques in terms of interface (for example, SuccessiveHalving is practically plug and play with the way you specify hyperparameters).
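For instance, here is a rough sketch of how I imagine pairing the two (my own experiment, not an official integration; the dataset and search space are just placeholders), scoring each Optuna trial with a prequential evaluation:

```python
import optuna
from river import datasets, evaluate, linear_model, metrics, optim, preprocessing


def objective(trial):
    # Optuna proposes a learning rate; River scores it with test-then-train evaluation.
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    model = preprocessing.StandardScaler() | linear_model.LinearRegression(optimizer=optim.SGD(lr))
    mae = evaluate.progressive_val_score(datasets.TrumpApproval(), model, metrics.MAE())
    return mae.get()


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```

The caveat is that every trial replays the stream from the beginning, so this is really offline tuning rather than the online selection River does natively.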

The only additional question I have is whether we can specify hyperparameters as generators, or whether a plain Python list is required? I ask because larger search spaces might not fit into memory once you factor in everything else the computer has to handle. Thanks again!

smastelini commented 11 months ago

Hi @robme-l, thanks for reaching out! SSPT is a research product of Professor João Gama's team, led by @BrunoMVeloso. The great selling point is that the models are actually expected to evolve, so the best hyperparameter set is expected to change over time. This is different from performing, let's say, a race where the best hyperparameter set wins and that's it. SSPT is also designed to work with multiple learning tasks and river pipelines.

SSPT is currently fairly functional (last time I checked), and Bruno and his team are working on new and improved versions of the core hyperparameter search algorithm. Nonetheless, there is still work to do, mainly in two aspects:

  1. The interface: right now, you need to provide a dictionary with information about the parameters to tune. Not so bad, but if you think about nested hyperparameters this becomes a bit clunky. An example would be tuning a linear model whose optimizer also has hyperparameters to tune.
  2. (This one is more cosmetic and not directly visible to users) Handling parameter hierarchy: the way we traverse the class parameters is not optimal. Ideally, I would want to make it closer to what happens in compose.Pipeline, yet keep it generic. @MaxHalford could maybe give me some pointers on this front, as you wrote most of the pipeline handling stuff.

@robme-l, I am sure @BrunoMVeloso can give you lots of references for learning and maybe even some code examples, so I am pinging him here.

MaxHalford commented 11 months ago

@robme-l have you seen expand_param_grid? It gives you a pretty succinct way to specify hyperparams and instantiate several copies of a model.
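Something along these lines (sketched from memory, so double-check the exact grid format against the docs):

```python
from river import linear_model, optim, preprocessing, utils

model = preprocessing.StandardScaler() | linear_model.LinearRegression()

# Nested hyperparameters work too: each optimizer class is paired with its own sub-grid.
grid = {
    "LinearRegression": {
        "optimizer": [
            (optim.SGD, {"lr": [0.1, 0.01, 0.005]}),
            (optim.Adam, {"lr": [0.1, 0.01], "beta_1": [0.01, 0.001]}),
        ]
    }
}

# Should yield 3 + 4 = 7 model copies, ready to hand to a model_selection meta-model.
models = utils.expand_param_grid(model, grid)
```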

> The only additional question I have is whether we can specify hyperparameters as generators, or whether a plain Python list is required? I ask because larger search spaces might not fit into memory once you factor in everything else the computer has to handle.

Yes, you pretty much have to hold the models in memory. The reason is that we believe in online tuning: the goal is not to find the best parameters offline and then use those online. Instead, we want the tuning itself to be done online, and for that the models need to be always accessible. Of course, we could imagine some kind of system where the models are stored on disk; something like the shelve module could be good for this. I would be surprised, though, if you got into a situation where the models don't fit in memory. You must have a lot of hyperparameters to try out! Let me know if that's the case; if so, I can try to implement something quickly.
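For what it's worth, here is a crude sketch of the disk-backed idea using only the standard library's shelve module (nothing built into River; the key names and dataset are arbitrary):

```python
import itertools
import shelve

from river import datasets, linear_model, optim, preprocessing

# Keep every candidate model pickled on disk and hold only one in memory at a time.
with shelve.open("candidates") as db:
    for lr in [1e-4, 1e-3, 1e-2]:
        db[f"lr={lr}"] = (
            preprocessing.StandardScaler()
            | linear_model.LinearRegression(optimizer=optim.SGD(lr))
        )

    # For each incoming sample: load a candidate, update it, write it back.
    for x, y in itertools.islice(datasets.TrumpApproval(), 100):
        for key in list(db.keys()):
            model = db[key]        # unpickle from disk
            model.learn_one(x, y)  # online update
            db[key] = model        # pickle back to disk
```

The constant pickling makes this much slower than keeping everything in memory, which is why I'd only reach for it with a truly huge grid.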

robme-l commented 11 months ago

@MaxHalford not at all, I was just curious about the implementation specifics and the philosophy behind them. River has become my favorite ML library because of its online and flexible nature. However, I imagine knowing what you just mentioned will be helpful when making design decisions in the future.

I am also asking these questions because, for my use case, I created a collection of classes (dare I say a library) with the intent of automating model creation, hyperparameter tuning, model selection, and ensemble creation for a particular dataset. Sort of like the 'Tracks' shown in the 'Benchmarks' section, but applicable to multiple datasets as well. Although I originally wrote it to automate my own workflow, I think with a bit of refactoring it would be a very nice addition to River.

robme-l commented 11 months ago

Following up to see if @BrunoMVeloso has seen this thread; I would really appreciate any references related to SSPT and to see how integration with River works!

bartzbeielstein commented 11 months ago

Hi everybody interested in hyperparameter tuning. Based on feedback from @MaxHalford, I tried to simplify the spotRiver interface to make it more user-friendly. Please note: spotRiver is about offline tuning, not online tuning. I attached a very, very rough draft demo that shows the direction spotRiver is heading in. I would be happy to receive your feedback and add features accordingly. Best, Thomas.

https://github.com/online-ml/river/assets/32470350/51dd82f3-840b-4166-83bd-d14277849730

BrunoMVeloso commented 11 months ago

Hi @robme-l

We are currently working on some extra online optimization code... We developed a second version of our optimiser that uses micro evolutionary algorithms (the paper was accepted in the journal track of ECML PKDD 2023). I hope that in the next few weeks the paper will be published in the DAMI journal and we will merge the code into the riverml extras. Regarding SSPT, you can find a good description of the method at: https://www.sciencedirect.com/science/article/pii/S1566253521000841

The code was initially developed for the MOA framework, and our team (including @smastelini) tried to adapt the riverml code to work with our method. The code needs some adjustments/optimizations... but it works =)

If you have any questions please let me know.

Best Bruno