Game4Move78 closed this issue 2 years ago
The ask and tell functions together form one interface; they are not alternatives to each other. Points are given by ask, and the user reports them (with the corresponding function values) with tell. For the one-shot methods, tell is not very important, because the user could just remember the best points themselves; no feedback is required. The low-discrepancy sequences are provided by ask.
Thanks for the answer. There are two alternatives I was thinking of, and (1) seems to be the current implementation based on what you're saying, since tell is ignored by one-shot methods.

(1) The low-discrepancy sequence is given by ask, and points reported with tell are ignored (although they are used for suggested points).
(2) The sequence is given by ask, and the sequence is aware of points reported with tell. For instance, if a point is asked but no feedback is provided, a point in that region will be asked again, so that region doesn't end up with lower density in the space of points where the target objective has been evaluated.

For example, if my use case is machine learning, I could combine a one-shot optimiser with the vanilla Hyperband algorithm. The optimiser would simply be used to get configurations for each bracket of Hyperband. If it turns out that a bunch of configurations are stopped at low fidelity (e.g. 1 epoch), but a configuration with good terminal validation loss is among them, we would hope those configurations could still be sampled again for other brackets.

Is there currently a way to implement (2), so that points given by ask that aren't used to provide feedback with tell are repeated in the sequence? (Even if not directly supported by the API.)
I don't totally understand. Hyperband-type optimization isn't a direct fit for nevergrad. I think you could implement your own optimizer in nevergrad which works something like you are suggesting. A custom optimizer just needs to implement ask and tell, and you could source values to ask from an existing one-shot optimizer.
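As a rough sketch of that suggestion: the wrapper below sources candidates from an inner ask/tell sampler and re-asks any point that never received feedback, so untold regions are not under-represented. All names here (`RepeatingWrapper`, `CounterSampler`, `requeue_untold`) are hypothetical, not nevergrad API; a real implementation would subclass nevergrad's optimizer base class instead.

```python
from collections import deque

class RepeatingWrapper:
    """Wraps any object with ask()/tell(point, loss); points asked but
    never told can be re-queued (e.g. between Hyperband brackets)."""
    def __init__(self, inner):
        self.inner = inner
        self.pending = deque()  # points queued for re-asking
        self.awaiting = []      # asked points still awaiting feedback

    def ask(self):
        # Prefer re-asking a point whose evaluation was dropped
        # (e.g. a configuration stopped early at low fidelity).
        point = self.pending.popleft() if self.pending else self.inner.ask()
        self.awaiting.append(point)
        return point

    def tell(self, point, loss):
        if point in self.awaiting:
            self.awaiting.remove(point)
        self.inner.tell(point, loss)

    def requeue_untold(self):
        # Call between brackets: untold points go back into the queue.
        self.pending.extend(self.awaiting)
        self.awaiting.clear()

class CounterSampler:
    """Trivial stand-in for a one-shot sampler."""
    def __init__(self):
        self.i = 0
    def ask(self):
        self.i += 1
        return self.i
    def tell(self, point, loss):
        pass

w = RepeatingWrapper(CounterSampler())
first_bracket = [w.ask() for _ in range(3)]  # points 1, 2, 3
w.tell(2, 0.1)          # only point 2 gets feedback
w.requeue_untold()      # points 1 and 3 will be asked again
```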
Thanks for your help.
Do the one-shot nevergrad.optimizers.SamplingSearch methods provide more evenly distributed, low-discrepancy sequences among the points given by the ask interface, or among those provided to the tell interface? In the latter case, regions of low density among the told points would be asked repeatedly until feedback is provided with a tell.
I was hoping to have some kind of uniformity in the told points, even if points need to be asked many times before feedback is given to the optimiser by a tell.