Just as a temporary local fix, I am adding a catch-all except in the DOE loop and breaking to return the incomplete results:
```python
while k_iteration < limit:
    # Get the next recommendations and corresponding measurements
    try:
        measured = campaign.recommend(batch_size=batch_size)
    except Exception:  # catch-all: bail out and return the incomplete results
        break
```
Hi @brandon-holt 👋🏼 One part that can be easily answered: your suggestion of providing access to partial simulation results sounds absolutely reasonable, and I think I can confidently say that we'll incorporate some appropriate mechanism into the refactored module (so far, we simply haven't had the need for it because the simulations always succeeded). The small challenge I see here is that a clean handling requires more than just returning the incomplete dataframe (your workaround) or passing through the exception (current logic) because:
- Also, the mechanism needs to be compatible with all simulation layers we offer (i.e., simulating a single campaign vs. simulating multiple campaigns, etc.).

However, I think I already have some good ideas how this can be accomplished.
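To make the idea concrete, here is a rough sketch of what such a mechanism could look like when driving the loop manually. This is not the built-in simulation code; `lookup_fn` is a placeholder for however target values get attached to the recommendations:

```python
# Rough sketch, not the library's simulation routine: collect the results of
# each iteration and return whatever has accumulated if the model fit fails.
import pandas as pd


def simulate_with_partial_results(campaign, lookup_fn, batch_size, limit):
    collected = []
    for k_iteration in range(limit):
        try:
            recommended = campaign.recommend(batch_size=batch_size)
        except Exception as err:  # e.g. "All attempts to fit the model have failed."
            print(f"Stopping early at iteration {k_iteration}: {err}")
            break
        measured = lookup_fn(recommended)    # attach target values (placeholder)
        campaign.add_measurements(measured)  # feed the data back to the campaign
        collected.append(measured.assign(iteration=k_iteration))
    return pd.concat(collected, ignore_index=True) if collected else pd.DataFrame()
```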
That said, I've nothing against providing a quick workaround to unblock you, as long as the changes do not cause backward compatibility issues later. Let me draft a quick PR and see what my colleagues think about it 👍🏼 Will tag you there.
Now, the more worrisome part. So far, I haven't experienced any of the problems you describe. While I can offer to try debugging/investigating the botorch internals if we can come up with a minimal reproducing example, I would only do that as a last resort and first see if we can get a better understanding of what's going on.
So here are a few things we should consider first:

- Set `allow_recommending_already_measured` to `False`, which all our "pure" recommenders support. That way, we know for certain that no duplicates can appear in the training data throughout the simulation (a minimal configuration sketch follows this list).
- Have a look at the `comp_df` of your searchspace to see if there is anything suspicious ...
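For completeness, a minimal configuration sketch for the first point; the class names follow recent BayBE releases and may differ in your version, and `searchspace`/`objective` are assumed to already be defined:

```python
# Minimal sketch (class names may differ between BayBE versions): wrap a pure
# recommender with the flag set inside a meta recommender and hand it to the
# campaign, so that already measured points are never suggested again.
from baybe import Campaign
from baybe.recommenders import (
    RandomRecommender,
    SequentialGreedyRecommender,
    TwoPhaseMetaRecommender,
)

recommender = TwoPhaseMetaRecommender(
    initial_recommender=RandomRecommender(),
    recommender=SequentialGreedyRecommender(
        allow_recommending_already_measured=False,  # the flag discussed above
    ),
)

campaign = Campaign(
    searchspace=searchspace,  # assumed to exist already
    objective=objective,      # assumed to exist already
    recommender=recommender,
)
```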
Heyo! These are good points, I will look into them and let you know what I find!!
@AdrianSosic Okay, so after some quick and dirty initial testing, it looks like `allow_recommending_already_measured` actually causes it to fail sooner, which is potentially interesting! But I am running more extensive tests to make meaningful comparisons, which should be done in a day or so, and I'll let you know what I find.
In the meantime, I'm looking into the features in my `comp_df` for each parameter in my search space to see if any features are highly correlated. Attaching here if you're curious!
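In case it's useful to others, here is a small generic pandas helper for that kind of check; it is not part of BayBE, and the threshold is an arbitrary choice:

```python
import numpy as np
import pandas as pd


def find_correlated_features(comp_df: pd.DataFrame, threshold: float = 0.95) -> pd.Series:
    """Return feature pairs whose absolute Pearson correlation exceeds the threshold."""
    corr = comp_df.corr().abs()
    # Keep only the upper triangle so each pair is reported once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    return upper.stack().loc[lambda s: s > threshold].sort_values(ascending=False)
```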
You mean it fails sooner if you set the attribute to `False`? That's indeed a bit surprising. Curious what's going on here... 🤔 I'll let you first finish your tests and then we can have a look 👍🏼
@AdrianSosic Yes, so it definitely appears that setting `allow_recommending_already_measured=False` causes the model to reach a failure point sooner. In my tests with my dataset, which would take ~1000 iterations to test every datapoint, the model fails in:

| Setting | Result |
|---|---|
| `allow_recommending_already_measured=True` | >200 iterations |
| `allow_recommending_already_measured=False` | >100 iterations |
This could make sense because when the model isn't allowed to pick 'repeated' measurements, it is more likely to reach the model-breaking outliers/datapoints faster. However, our hypothesis was that the 'repeated' measurements that have the same features but disparate target values were in fact the ones that were breaking the model.
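A direct way to test that hypothesis would be to pull exactly those groups out of the measurement data, along these lines (a generic pandas sketch; the column names are placeholders for your parameter and target columns):

```python
import pandas as pd


def find_conflicting_duplicates(
    measurements: pd.DataFrame, feature_cols: list[str], target_col: str
) -> pd.DataFrame:
    """Rows that share identical feature values but have differing target values."""
    n_unique_targets = measurements.groupby(feature_cols)[target_col].transform("nunique")
    return measurements[n_unique_targets > 1].sort_values(feature_cols)
```

Dropping or aggregating those rows (e.g., averaging their targets) before fitting would then show whether they are really the culprit.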
@AdrianSosic Hey, just adding a repro for you in case it helps. Just a heads up: running it as is will take ~200-300 GB of RAM. If that's problematic, you could bump `percent_discretize` up to 20-50 to reduce the amount of memory required. Just note that I think there may be a point above which the issue disappears, so keep that in mind if you go to higher discretization factors.
The only concern is that by replacing my molecules' SMILES with random ones, it may change what solves the issue, but after running some initial tests on my end, the behavior looks similar to the results shown in the table above.
Hi @brandon-holt, thanks for sharing. I would like to have a look but need to postpone this until next week. Currently, we are a bit swamped with open PRs and features to be merged + need to release 0.9.0 asap, which will keep me busy for a while. Will let you know once I've had a chance to look 🙃
@AdrianSosic No worries, thanks for the heads up!
Hi! I was wondering if it would be possible to add a feature where simulations will still return the results compiled up to the point of an error?
The situation I'm running into when running on larger datasets is a botorch error to the tune of:

`All attempts to fit the model have failed.`
I am in the process of troubleshooting what about the dataset is causing the failure, but in the meantime it would be nice to see the results up to that point, which should include dozens of batches of experiments.
Also, if you have any experience with what might be causing an error like this, that would be helpful!
Referring to this comment in a botorch thread (https://github.com/pytorch/botorch/issues/1226#issuecomment-1213539656), I initially wondered if this could be my issue, but baybe should prevent it from being a problem since it identifies duplicate parameter values and randomly picks one.
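For what it's worth, the simpler flavor of that botorch issue (exact duplicate parameter configurations in the training data) is easy to check directly on the collected measurements. A generic sketch, assuming `measurements` holds the data and `param_cols` lists the parameter columns:

```python
import pandas as pd


def count_duplicate_configurations(measurements: pd.DataFrame, param_cols: list[str]) -> int:
    """Number of rows whose parameter values exactly duplicate an earlier row."""
    return int(measurements.duplicated(subset=param_cols).sum())
```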