Shakesbeery opened 4 months ago
Description
NOTE: This may be related to #1088 and #1086, but if so the triggering mechanism is slightly different. Fixing one may fix the others though, or at least provide a temporary patch.
In various use cases I have a lot of prior experimentation data that can be used to warm start the BO process. I do this by starting with a series of optimizer.tell() calls that provide the configurations and outcomes of the previous experiments. Once this initial configuration-space probing is done, I resume a normal ask-and-tell workflow. Unfortunately, when attempting to call optimizer.ask(), the following error occurs:
Steps/Code to Reproduce
The following is a bit contrived because I can't share my exact code, but it still reproduces the same error.
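In outline it looks like the sketch below. The quadratic objective, the two hyperparameters, and the exact Scenario settings are stand-ins for my real setup; the np.random.randint(-10, 11, (18, 2)) warm-start data is the part that matters:

```python
import numpy as np
from ConfigSpace import Configuration, ConfigurationSpace, Float
from smac import HyperparameterOptimizationFacade, Scenario
from smac.runhistory.dataclasses import TrialInfo, TrialValue

def quadratic(config: Configuration, seed: int = 0) -> float:
    # Stand-in objective; never actually executed in a pure ask-and-tell workflow.
    return config["x"] ** 2 + config["y"] ** 2

cs = ConfigurationSpace()
cs.add_hyperparameters([Float("x", (-10, 10)), Float("y", (-10, 10))])

scenario = Scenario(cs, deterministic=True, n_trials=200)
optimizer = HyperparameterOptimizationFacade(scenario, quadratic, overwrite=True)

# Warm start: report 18 previously evaluated configurations via tell()
# before ever calling ask().
for x, y in np.random.randint(-10, 11, (18, 2)):
    config = Configuration(cs, values={"x": float(x), "y": float(y)})
    optimizer.tell(TrialInfo(config, seed=0), TrialValue(cost=float(x**2 + y**2)))

# With 18 warm-started trials this call fails; with 17 it succeeds.
info = optimizer.ask()
```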
Expected Results
The expectation is that calling optimizer.ask() provides a trial info object such as:
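For example (illustrative values, assuming the two-float space from the sketch above):

```python
# Illustrative repr; the actual values depend on the sampled configuration.
TrialInfo(config=Configuration(values={'x': 1.5, 'y': -3.0}), instance=None, seed=0, budget=None)
```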
Actual Results
We get the above-mentioned error (with a little more context):
Solutions?
The root of the problem, I think, lies in the internal mechanics of the tell() function. It's designed for the typical ask-and-tell workflow, not for initial configuration-space probing. As such, smbo.tell() adds trials to the runhistory, but those trials have never been intensified or been part of a running trial. This causes the intensifier._queue size to suddenly balloon in the intensifier.__iter__ function. The intensifier then enters the main loop:
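Condensed, my reading of that loop is roughly the following (a paraphrase, not the verbatim SMAC source):

```python
# Paraphrase of the failure counting in the intensifier's __iter__,
# condensed from my reading of smac 2.0.1 -- not the verbatim source.
def __iter__(self):
    fails = -1
    while True:
        fails += 1
        if fails > self._retries:  # self._retries is hard-coded to 16
            # The generator gives up here, which is what optimizer.ask()
            # ultimately surfaces as the error above.
            return
        # ...otherwise: drain self._queue, or fall into the `else` branch
        # (line 236 in my version) that samples a new challenger. That
        # branch always `break`s out of its inner loop (line 296), landing
        # control back at the top of this while loop -- so each
        # warm-started trial in the queue costs one "fail".
```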
Here's where the problem starts. fails is incremented inside the while loop, and we always enter the else portion of the iterator (line 236 in my version) to find a challenger. Because that branch always breaks at line 296, we land back at the top of the while loop and fails += 1 runs again. At some point, seemingly arbitrarily, intensifier._retries was set to 16. This means that, because fails starts at -1, we have 17 chances to probe the configuration space before we accidentally trigger the retry failure. This is shown by changing np.random.randint(-10, 11, (18, 2)) to np.random.randint(-10, 11, (17, 2)) in my sample code, after which it works without issue.

Since it doesn't make sense to truncate previous knowledge before optimizing, I've bypassed this problem by setting optimizer.intensifier._retries = len(payload["previous_trials"]) + 1 before I call optimizer.ask(), as in the sketch below. This allows the intensifier to run its loop without triggering the retry failure. I can also use optimizer.intensifier.reset() after the series of optimizer.tell() calls, but I don't know what else that may affect, so I avoid it.
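Concretely (payload is my own data structure holding the warm-start trials, not part of SMAC):

```python
# Widen the retry budget before the first ask(): one extra retry per
# warm-started trial, so the intensifier can work through the pre-filled
# queue without hitting the failure threshold.
optimizer.intensifier._retries = len(payload["previous_trials"]) + 1
info = optimizer.ask()  # now returns a TrialInfo instead of erroring out
```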
There is currently no way to pass retries to the optimizer.get_intensifier() function call, which complicates setting that limit dynamically at configuration time. Exposing it at a higher level might at least alleviate the issue for some users.

Probably the more durable solution is to be more specific about what counts as a failure in the intensification loop. Right now every iteration is considered a failure, but that's not necessarily true. What was the original intent of checking if fails > self._retries, and how do we avoid it colliding with configuration-space probing prior to optimization?

Versions
'2.0.1'
Reply
Hi Shakesbeery,

unfortunately I was not able to reproduce your error.

Originally, fails was introduced because previously tried configurations might be sampled again, and instead of returning those you want to sample a new one. In our experiments the retry counts are mostly low, but with longer runs it is possible to see a higher number of retries.

Also, in your example the initial design is still called. You can bypass it by passing initial_design = optimizer.get_initial_design(scenario=scenario, n_configs=0) to the facade. Does this maybe help?
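For reference, wiring that into the reproduction sketch above would look roughly like this (the scenario and quadratic names come from that sketch, not from the original report):

```python
# Skip the initial design entirely (n_configs=0) so the warm-start data
# drives the surrogate from the very first ask().
initial_design = HyperparameterOptimizationFacade.get_initial_design(
    scenario, n_configs=0
)
optimizer = HyperparameterOptimizationFacade(
    scenario,
    quadratic,
    initial_design=initial_design,
    overwrite=True,
)
```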