automl / amltk

A build-it-yourself AutoML Framework
https://automl.github.io/amltk/
BSD 3-Clause "New" or "Revised" License

[Decision] Behaviour of batched `ask()` #227

Open eddiebergman opened 9 months ago

eddiebergman commented 9 months ago

When asking for a batch of configs, as implemented in PR #224, there is no hard requirement that all n configs asked for be unique. In theory I don't see a use case where duplicates would be desirable, but uniqueness is often a much harder requirement for optimizers to meet.

Right now, asking for n configs is ambiguous w.r.t. the uniqueness property, although a brief test of SMAC and Optuna with a batch size of n=100, without any tell() in between, suggests that all 100 configs returned were unique (the search space was a single float in (1.0, 10.0)), i.e. they do not return duplicates.
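
A minimal sketch of this kind of uniqueness check, here using Optuna's ask-and-tell interface (the SMAC check would be analogous; the single float parameter "x" is just an illustration):

```python
import optuna

study = optuna.create_study()

# Ask for a batch of 100 configs without any tell() in between.
values = []
for _ in range(100):
    trial = study.ask()
    values.append(trial.suggest_float("x", 1.0, 10.0))

# If the sampler never repeats a config, all 100 sampled values are distinct.
assert len(set(values)) == len(values)
```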

So

@KEggensperger @mfeurer

Once a decision is made, this unblocks #226

LennartPurucker commented 9 months ago

My two cents: by default, a batch of ask should definitely be unique from a user perspective.

Otherwise, the user would need to implement caching and uniqueness checks and call ask in a while loop until reaching the desired number of configs. This is not something I would like to do.
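
For illustration, a rough sketch of the dedup loop users would otherwise have to write themselves; optimizer.ask() returning a single dict-like config is an assumption here, not amltk's actual API:

```python
def ask_unique(optimizer, n: int) -> list:
    """Keep asking until n distinct configs have been collected."""
    seen = set()
    batch = []
    while len(batch) < n:
        config = optimizer.ask()
        key = tuple(sorted(config.items()))  # assumes a dict-like config
        if key not in seen:
            seen.add(key)
            batch.append(config)
    return batch
```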

I think it should be treated as a special case if an optimizer wants to behave differently. We could support such optimizers by either returning some values twice or by filtering out the non-unique configs, but the latter breaks parallelization efficiency if you, for example, ask for 100 configs but only get 80 back.

LennartPurucker commented 9 months ago

One more comment: if the need for duplicates comes from seeding, maybe make seed/randomness a separate "thing". Maybe randomness should not be part of the search space/config in the first place...

eddiebergman commented 9 months ago

> mfeurer: This is really tricky, and whether it makes sense to allow this depends on the optimizer. Some optimizers can make use of the fact that a configuration was evaluated more than once, some cannot.

For the case of SMAC with repeats, and multi-fidelity algorithms where the config is the same but should be evaluated at a different fidelity, the Trial itself will differ and hence the trials are considered unique.

> My two cents: by default, a batch of ask should definitely be unique from a user perspective.
>
> Otherwise, the user would need to implement caching and uniqueness checks and call ask in a while loop until reaching the desired number of configs. This is not something I would like to do.
>
> I think it should be treated as a special case if an optimizer wants to behave differently. We could support such optimizers by either returning some values twice or by filtering out the non-unique configs, but the latter breaks parallelization efficiency if you, for example, ask for 100 configs but only get 80 back.

Yup, I've run into this pain before as well, hence why I'm leaning towards making uniqueness required.

> One more comment: if the need for duplicates comes from seeding, maybe make seed/randomness a separate "thing". Maybe randomness should not be part of the search space/config in the first place...

The benefit of requiring uniqueness is that we don't need a seed during sampling. What I mean by that is that the optimizer is already seeded at creation (usually), and in practice I don't think many optimizers support dynamic seeding during ask(). tl;dr: I'm not sure seeding needs to be considered here, but I could be wrong.


So one possible solution is to make the decision that:

LennartPurucker commented 9 months ago

> Note that this does not mean that [opt.ask(10), opt.ask(10)] is necessarily unique.

We do not necessarily need to check, although I would like this to be something I can rely on. IMO, it seems like a bug in the optimizer if [*opt.ask(10), *opt.ask(10)] is not unique.

eddiebergman commented 9 months ago

It depends; you could also argue that optimizer.ask(n) should be consistent given its current state and no new information. I do not see this as a bug but more as a design choice. However, it can be argued both ways, depending on whether you consider "what has already been asked" as part of the optimizer's state.

LennartPurucker commented 9 months ago

> It depends; you could also argue that optimizer.ask(n) should be consistent given its current state and no new information. I do not see this as a bug but more as a design choice. However, it can be argued both ways, depending on whether you consider "what has already been asked" as part of the optimizer's state.

Ah, my bad, absolutely. I somehow assumed we called tell in between.

What I would like to have is that the following is unique:

list_a = opt.ask(n=10)
opt.tell(eval(list_a))
list_b = opt.ask(n=10)

# list_a + list_b should contain no duplicates at this point.
aron-bram commented 9 months ago

Hey, I would also expect it to return unique configs.

1) However, could it make sense to add a replace parameter to ask(), which defaults to False, to control this behaviour?
2) And/or possibly an attribute on the class that controls sampling at the instance level, so that the above case can also be handled?

1:

list_a = opt.ask(n=10, replace=False)
opt.tell(eval(list_a))
list_b = opt.ask(n=10, replace=False)
# list_a + list_b CAN have duplicates, but both lists contain unique elements

2:

opt.replace = False
list_a = opt.ask(n=10)
opt.tell(eval(list_a))
list_b = opt.ask(n=10)
# list_a + list_b contain unique elements

Essentially, this allows sampling with or without replacement, either for just a single call to ask or across all calls to ask.

eddiebergman commented 9 months ago

I like the idea if we had full control over the optimizers, but the issue is that we are just wrapping existing optimizers, so we can't control uniqueness in terms of what they provide internally. For example, I don't know how I would implement replace=True for SMAC or the other optimizers, and it puts a lot more effort into integrating an optimizer to support both modes. However, replace=True/False makes a lot of sense when considering some implementation of RandomSearch. I've sidestepped this problem by not providing a RandomSearchOptimizer (for various other reasons, such as supporting various search space implementations).
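
To illustrate where replace does fit naturally, here is a hypothetical toy random search over a single float, roughly implementing the instance-level behaviour sketched above; this is not amltk's API, just an illustration of the semantics:

```python
import random

class ToyRandomSearch:
    """Toy random search over a single float with ask(n, replace=...)."""

    def __init__(self, low: float, high: float, seed: int = 0):
        self._rng = random.Random(seed)
        self._low, self._high = low, high
        self._asked: set[float] = set()  # everything asked so far

    def ask(self, n: int, replace: bool = False) -> list[float]:
        batch: list[float] = []
        while len(batch) < n:
            x = self._rng.uniform(self._low, self._high)
            if replace or x not in self._asked:
                self._asked.add(x)
                batch.append(x)
        return batch

opt = ToyRandomSearch(1.0, 10.0)
list_a = opt.ask(n=10)  # unique within the batch
list_b = opt.ask(n=10)  # with replace=False, also unique w.r.t. list_a
```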


What we're looking for is a contract for integrated optimizers, not functionality to control this behaviour: something we can put in a docstring to say "this is what you can expect", and then we make sure the existing optimizers respect that contract. I would ideally like to codify this in a test such that we assert this behaviour holds.

So far, it seems the existing optimizers respect uniqueness per batch, i.e. the equivalent of ask(n, replace=False), at least up to a batch size of 100 for a single continuous parameter.
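
A rough sketch of how such a contract test could look; the batched ask(n) returning trials with a dict-like .config is assumed for illustration and may not match the final amltk interface:

```python
import pytest

# To be filled with one factory per integrated optimizer (SMAC, Optuna, ...).
OPTIMIZER_FACTORIES = []

@pytest.mark.parametrize("make_optimizer", OPTIMIZER_FACTORIES)
def test_batched_ask_returns_unique_configs(make_optimizer):
    optimizer = make_optimizer()
    trials = optimizer.ask(n=100)
    # Key each config by its sorted items; assumes a dict-like config.
    keys = [tuple(sorted(trial.config.items())) for trial in trials]
    assert len(set(keys)) == len(keys), "ask(n) returned duplicate configs"
```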

aron-bram commented 9 months ago

Okay, I see what the problem is here. Thanks for the extra context!

eddiebergman commented 9 months ago

From what I gather, the expected behaviour is for unique trials to be suggested.


All integrated optimizers must be tested for this behaviour, and it must be documented. This contract does not extend to custom optimizers that people implement themselves; there it is dependent upon their implementation.