Closed dflemin3 closed 4 years ago
Specifically, I really need to avoid copying the full GP. What would likely be better is to pass everything the GP needs, e.g. theta, y, and the current hyperparameter vector, and instantiate the GP within each function call for each process. Even for GPs with thousands of data points, initialization, including the compute call, should be of order 1 second (see the george docs), so re-initializing a GP in each process should be cheaper than serializing the full GP object.
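The pattern above can be sketched as follows. This is a minimal illustration, not approxposterior's actual code: a hand-rolled squared-exponential kernel stands in for george's GP so the example is self-contained, and `rebuild_and_score` is a hypothetical worker function. The point is that each process receives only plain ndarrays (cheap to pickle) and reconstructs the GP locally.

```python
import numpy as np

def rebuild_and_score(theta, y, hyperparams):
    """Re-initialize a GP from raw arrays and return its log-likelihood.

    In the real code this would build a george GP and call gp.compute();
    here a squared-exponential kernel stands in so the sketch runs as-is.
    """
    amp, scale = hyperparams
    # Covariance matrix: the expensive "compute" step, with a small
    # jitter term on the diagonal for numerical stability.
    d2 = (theta[:, None] - theta[None, :]) ** 2
    K = amp * np.exp(-0.5 * d2 / scale**2) + 1e-8 * np.eye(len(theta))
    # Cholesky-based Gaussian log-likelihood.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * len(y) * np.log(2 * np.pi))

# Each multiprocessing worker would call this with plain arrays, e.g.
# via pool.starmap(rebuild_and_score, tasks), instead of receiving a
# pickled GP object.
theta = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * theta)
ll = rebuild_and_score(theta, y, hyperparams=(1.0, 0.3))
```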
I've removed multiprocessing for now as its current overhead is prohibitively slow and fixing it will require a substantial rewrite.
Currently, the multiprocessing implementation for parallelizing GP optimizations and new design point selection is slow, presumably because spinning up new processes is expensive: each one requires pickling the GP and sending it to the process, which is costly given the GP's non-trivial structure and large-ish covariance matrix.
Potential fixes include sharing the data with each process using a scheme like the one documented here.
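One such scheme, sketched under the assumption that the training arrays are what need sharing, uses Python's `multiprocessing.shared_memory` (3.8+) so the data is copied once into a named block and workers attach by name rather than receiving a pickled copy. The array names here are illustrative, not approxposterior's.

```python
import numpy as np
from multiprocessing import shared_memory

theta = np.random.rand(1000, 3)  # stand-in for the GP's design points

# Parent process: place the array in a named shared-memory block.
shm = shared_memory.SharedMemory(create=True, size=theta.nbytes)
shared = np.ndarray(theta.shape, dtype=theta.dtype, buffer=shm.buf)
shared[:] = theta  # one copy, visible to all attaching processes

# Worker side: attach by name (only the short name string would be
# pickled and sent), then wrap the buffer with the known shape/dtype.
attached = shared_memory.SharedMemory(name=shm.name)
view = np.ndarray(theta.shape, dtype=theta.dtype, buffer=attached.buf)
same = np.array_equal(view, theta)

# Cleanup: drop array views before closing, then every attachment
# closes and the creator unlinks the block.
del view, shared
attached.close()
shm.close()
shm.unlink()
```

In a real worker the attach/wrap step would run inside the worker function, with the block name, shape, and dtype passed as cheap arguments.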