google / vizier

Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.
https://oss-vizier.readthedocs.io
Apache License 2.0

Dealing with duplicates and failed evaluations #1199

Open clementruhm opened 14 hours ago

clementruhm commented 14 hours ago

Hi!

For the default designers (GPUCBPEBandit and GPBandit), what is the recommended way to deal with duplicates, i.e. when the same set of parameters is suggested again? Should I: a) store metrics from previous runs and complete the duplicate trial using the cached metric, or b) sample another trial to save time?
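
For illustration, a minimal sketch of option (a) in a designer-driven loop. `evaluate`, `param_key`, and `num_iterations` are hypothetical stand-ins; the loop assumes the suggest/to_trial/update pattern used later in this thread:

```python
from vizier import algorithms as vza
from vizier import pyvizier as vz

def param_key(parameters):
  """Canonicalizes a suggestion's parameters into a hashable cache key."""
  return tuple(sorted((name, str(value)) for name, value in parameters.items()))

cache = {}  # param_key -> previously measured objective value

for i in range(num_iterations):  # num_iterations: hypothetical budget
  suggestion = designer.suggest(count=1)[0]
  key = param_key(suggestion.parameters)
  if key in cache:
    objective = cache[key]  # duplicate: reuse the cached metric
  else:
    objective = evaluate(suggestion.parameters)  # evaluate: hypothetical
    cache[key] = objective
  trial = suggestion.to_trial(i + 1)
  trial.complete(vz.Measurement(metrics={'objective': objective}))
  designer.update(vza.CompletedTrials([trial]), vza.ActiveTrials())
```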

Is there a way to break out of a cycle where the same duplicated set of parameters keeps getting suggested? For example, by dynamically increasing exploration?

Another question: for trials that fail, how do I report back that the set of parameters is invalid? I do:

```python
from vizier import algorithms as vza
from vizier import pyvizier as vz

trial.complete(vz.Measurement(), infeasibility_reason="invalid_config")
designer.update(vza.CompletedTrials([trial]), vza.ActiveTrials())
```

Is it a valid approach? Is there a better one?

Kind regards

xingyousong commented 13 hours ago

On duplicates: Can I ask what search space you have? Duplicates occurring might make sense if the search space is a small categorical space (with finitely many possibilities), but it would be shocking if this occurred with a continuous / DOUBLE search space - i.e. floating point parameter values being exactly the same.

On failed trials: Yes, you would mark the trial as infeasible with infeasibility_reason. But from your code snippet, you seem to be using the designer in a custom loop (rather than using our client API) - can I ask the reason for this?
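
For comparison, a hedged sketch of the same infeasibility report through the client API. `is_valid` and `evaluate` are hypothetical, `study_config` is assumed to be a `vz.StudyConfig` built elsewhere, and it assumes the trial client's `complete` accepts the same `infeasibility_reason` keyword as `pyvizier.Trial.complete`:

```python
from vizier import pyvizier as vz
from vizier.service import clients

# study_config: a vz.StudyConfig assumed to be defined elsewhere.
study = clients.Study.from_study_config(
    study_config, owner='owner', study_id='example_study')

for suggestion in study.suggest(count=1):
  if not is_valid(suggestion.parameters):  # is_valid: hypothetical check
    suggestion.complete(vz.Measurement(), infeasibility_reason='invalid_config')
  else:
    objective = evaluate(suggestion.parameters)  # evaluate: hypothetical
    suggestion.complete(vz.Measurement(metrics={'objective': objective}))
```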

clementruhm commented 13 hours ago

Yes, it's multiple categorical or discrete parameters. The number of combinations is rather large (up to 500k). Nevertheless, it seems to converge pretty fast and tends to start giving duplicates. Is there a way to dynamically increase exploration?

I started with the client API, then switched to the designer for no particular reason; it seems to be a bit faster. I do need a custom loop though, because measurement will be async: I create a trial and at some point in the future the result arrives. For now I am testing it in a sync setup, so it should not really matter for this question.
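
For what it's worth, a rough sketch of such an async custom loop under the designer interface: pending trials are passed as `vza.ActiveTrials` on every update so the designer knows which suggestions are still outstanding. `start_evaluation` and the result callback wiring are hypothetical:

```python
from vizier import algorithms as vza
from vizier import pyvizier as vz

pending = {}  # trial id -> vz.Trial still waiting for its measurement

def launch(designer, next_id):
  """Requests one suggestion and kicks off its asynchronous evaluation."""
  suggestion = designer.suggest(count=1)[0]
  trial = suggestion.to_trial(next_id)
  pending[trial.id] = trial
  start_evaluation(trial.parameters)  # start_evaluation: hypothetical
  return trial

def on_result(designer, trial_id, objective):
  """Called whenever a result arrives for an earlier suggestion."""
  trial = pending.pop(trial_id)
  trial.complete(vz.Measurement(metrics={'objective': objective}))
  designer.update(
      vza.CompletedTrials([trial]),
      vza.ActiveTrials(list(pending.values())))
```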