mlr-org / mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem
https://mlr3tuning.mlr-org.com/
GNU Lesser General Public License v3.0

Default search spaces #262

Closed KnutJaegersberg closed 3 years ago

KnutJaegersberg commented 4 years ago

Hi, I'm a big fan of hyperopt for old mlr. It gives me a function-driven approach to all the settings I might want to change, while still allowing quick prototyping with one-liners. The only tuning strategies I seem to need are grid search, Bayesian optimization, and Hyperband, though I like that mlr exposes a wide range of tuning strategies. Do you plan to integrate that package's default search spaces / community database here?

berndbischl commented 4 years ago

I think such an option should be available in mlr3tuning. I am not sure whether hyperopt wasn't overdoing it, and I never was a big fan of the name, as hyperopt is a different algorithm for tuning.

Give us some time to discuss this.

berndbischl commented 4 years ago

I mean, it basically depends on this question: do you/we REALLY need multiple DEFAULT search spaces per learner, or is one enough? Because if the answer is one (for people who do not want to bother with setting it up themselves, as they are non-experts), then we can simply add it to the learner.

The other option is to provide a dictionary of search spaces.

KnutJaegersberg commented 4 years ago

If it is computationally not too expensive, what about looping over (parts of) a dictionary of search spaces when you are looking for an easy way to derive good-enough hyperparameters, just to quickly exploit your algorithm ad hoc in a research notebook? I'm mostly doing text mining and social media mining, sometimes with some open data attached. I find the quick classifiers I build are often handy time savers for engagement mining, even if it's just for myself.

berndbischl commented 4 years ago

Sorry, I don't get what you are saying.

KnutJaegersberg commented 4 years ago

You can try several search spaces for the same algorithm, then pick whatever performed best, is what I meant.
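This "try several spaces, keep the best" idea can be sketched with mlr3tuning's AutoTuner, wrapping the same learner once per candidate space and comparing the wrapped learners with a benchmark. The two search spaces, their bounds, and the tuning budget below are purely illustrative assumptions, not recommendations:

```r
library(mlr3)
library(mlr3tuning)
library(paradox)

# Two illustrative search spaces for the same learner (rpart);
# the parameter bounds here are assumptions, not tuned defaults.
spaces = list(
  narrow = ps(cp = p_dbl(1e-3, 1e-1, logscale = TRUE)),
  wide = ps(
    cp = p_dbl(1e-4, 1, logscale = TRUE),
    minsplit = p_int(2, 64)
  )
)

# Wrap each space in an AutoTuner so every space gets its own tuning run
learners = lapply(spaces, function(space) {
  AutoTuner$new(
    learner = lrn("classif.rpart"),
    resampling = rsmp("cv", folds = 3),
    measure = msr("classif.ce"),
    search_space = space,
    terminator = trm("evals", n_evals = 20),
    tuner = tnr("random_search")
  )
})

# Benchmark the tuned learners against each other and keep the best space
design = benchmark_grid(tsk("sonar"), learners, rsmp("holdout"))
bmr = benchmark(design)
bmr$aggregate(msr("classif.ce"))
```

As the discussion below notes, this is computationally wasteful when the spaces overlap heavily, since overlapping regions get evaluated once per space.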

berndbischl commented 4 years ago

> You can try several search spaces for the same algorithm, then pick whatever performed best, is what I meant.

OK, I see, thanks. But isn't that a really inefficient approach? These spaces will nearly always overlap a lot.

KnutJaegersberg commented 4 years ago

Very true! I don't think that's a go-to approach. But if you're not satisfied with what you got from search space 1, you can try another one that is different from it. Estimating that difference should be feasible?

KnutJaegersberg commented 4 years ago

I see two ways: expert community-made search spaces, which you compare to one another, or automatically generated search spaces with varying narrowness. I'm just brainstorming a bit :) Surely there are smart approaches to automate search space generation and selection?

berndbischl commented 4 years ago

> Surely there are smart approaches to automate search space generation and selection?

Well, we even wrote a paper on this, and others have too, but I would say research here is really not very far advanced. A major problem is having enough experimental data to properly estimate the space (in addition to the problem of how best to do the construction).

KnutJaegersberg commented 4 years ago

I'm very much an AutoML guy, you know ;) I only tweak simple models myself.

This is an advanced mathematical topic; I can only play with it, not reason about it. If you generate both datasets and search spaces, can you simulate towards the best matches?

KnutJaegersberg commented 4 years ago

I see you wrote a paper about optimal defaults. Awesome!

be-marc commented 3 years ago

@KnutJaegersberg You might want to check out our new package mlr3tuningspaces. It contains search spaces from scientific papers. We are not going to implement community-made search spaces again.
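For reference, using that package looks roughly like the sketch below. The dictionary key `"classif.rpart.default"` and the `$get_learner()` accessor are assumptions based on the package's dictionary-style interface, so check the mlr3tuningspaces documentation for the exact keys:

```r
library(mlr3)
library(mlr3tuningspaces)

# Browse the dictionary of published, literature-based search spaces
as.data.table(mlr_tuning_spaces)

# Retrieve a default search space and attach it to the matching learner;
# the returned learner carries tune tokens for the space's parameters
tuning_space = lts("classif.rpart.default")
learner = tuning_space$get_learner()
learner$param_set$values
```

This replaces the hand-rolled dictionary discussed above: instead of community-contributed spaces, the package curates spaces taken from published papers.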