Update: After further investigation, I now believe my problem was caused by the dtol setting in EIStrategy(). The default value of dtol is 10^-3 * norm(ub - lb); as I understand it, candidate points that fall within dtol of an already evaluated point are rejected. It makes sense that generating new points starts to slow down once that minimum-distance requirement becomes overly strict relative to how densely the space has already been sampled. I solved my problem by changing to dtol = 5*10^-5 * norm(ub - lb), which has kept things running smoothly for at least 7000 points. I imagine that if it starts to slow down again, I can decrease the scaling on dtol further.
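For anyone who finds this later, here is roughly what the override looks like. This is just a minimal sketch assuming a pySOT 0.3-style API, with the built-in Ackley test problem standing in for my actual 6-d objective; constructor arguments and defaults may differ slightly between versions, so check the signatures in your installed copy.

```python
import numpy as np
from poap.controller import ThreadController, BasicWorkerThread
from pySOT.experimental_design import SymmetricLatinHypercube
from pySOT.strategy import EIStrategy
from pySOT.surrogate import GPRegressor
from pySOT.optimization_problems import Ackley

# Placeholder 6-d problem; swap in your own OptimizationProblem subclass.
prob = Ackley(dim=6)

surrogate = GPRegressor(dim=prob.dim, lb=prob.lb, ub=prob.ub)
exp_design = SymmetricLatinHypercube(dim=prob.dim, num_pts=2 * (prob.dim + 1))

# Default dtol is 1e-3 * norm(ub - lb); loosen it so candidate generation
# does not stall once many points have already been evaluated.
dtol = 5e-5 * np.linalg.norm(prob.ub - prob.lb)

controller = ThreadController()
controller.strategy = EIStrategy(
    max_evals=7000,
    opt_prob=prob,
    exp_design=exp_design,
    surrogate=surrogate,
    asynchronous=True,
    dtol=dtol,
)

# 10 worker threads, each evaluating the objective directly.
for _ in range(10):
    controller.launch_worker(BasicWorkerThread(controller, prob.eval))

result = controller.run()
print("Best value found:", result.value)
```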
Closing the issue, but wanted to leave this comment here in case anyone else has similar troubles.
I am using pySOT with EIStrategy and a GPRegressor surrogate model for a 6-dimensional optimization problem (all variables are continuous), with 10 worker threads, running Python 3.11.1 on Windows 10.
In recent runs, I have observed that after a little more than 1000 data points, generating new points starts to slow WAY down. Before this occurs, most of the time is spent evaluating the objective function, with worker threads being assigned a new task within seconds (or even milliseconds) of finishing the previous evaluation. However, at a little over 1000 data points, I noticed that almost all of the time is spent waiting for new assignments.
This is confirmed by looking at CPU usage. Prior to ~1000 points, the objective evaluations running in the worker threads take up all available CPU. After that, the main Python process takes ~30% and the worker threads rarely take any at all. In fact, only one worker thread is ever actually evaluating the objective at a time, because it finishes its evaluation before the next point has been generated.
This leads me to suspect that the process of generating new points slows WAY down around 1k points.
Edit: this Stack Exchange question about Gaussian process regression for large datasets seems relevant: https://stats.stackexchange.com/questions/326446/gaussian-process-regression-for-large-datasets
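For context on why that thread is relevant: refitting an exact GP has a cost that grows roughly cubically with the number of training points, so the surrogate itself also becomes a bottleneck as the archive grows. A quick, purely illustrative timing sketch (plain scikit-learn here, not pySOT's GPRegressor wrapper):

```python
import time
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Rough illustration of how exact GP fitting cost grows with the number
# of training points (roughly cubic); numbers will vary by machine.
rng = np.random.default_rng(0)
for n in (250, 500, 1000, 2000):
    X = rng.uniform(size=(n, 6))
    y = np.sum(X ** 2, axis=1)
    t0 = time.perf_counter()
    GaussianProcessRegressor().fit(X, y)
    print(f"n={n:5d}  fit time: {time.perf_counter() - t0:.2f} s")
```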