Closed pabloprf closed 2 years ago
Why do you need multiprocessing? BoTorch / GPyTorch already utilize parallel processing via MKL or GPU libraries. My understanding is that, in general, PyTorch does not play well with multiprocessing. If you are trying to do many function evaluations in parallel, you may want to take a look at the CMA-ES tutorial, if you haven't already: https://botorch.org/tutorials/optimize_with_cmaes
On Dec 23, 2019, at 2:56 PM, Pablo Rodriguez-Fernandez notifications@github.com wrote:
I have implemented a genetic algorithm that calls forward evaluations of a model that I have fitted previously (in particular, a FixedNoiseGP). However, I am having problems parallelizing it with multiprocessing.
I have tried to use multiprocessing.Pool(), but then I get an error with pickling certain torch functions: Can't pickle <built-in function softplus>: import of module 'torch._C._nn' failed
I have also tried with multiprocessing.Process(), but it hangs in the forward evaluation of the FixedNoiseGP. Interestingly, if I put a print statement inside the forward method of FixedNoiseGP, I can see that the MultivariateNormal is indeed being evaluated, but for some reason it is not passed back to the process.
Any idea of how to solve this? Or other options to use a botorch model inside a parallel framework?
Thanks for the reply. My algorithm does indeed send a batch of evaluations that takes advantage of the parallel processing that BoTorch has. However, what I am looking for is to run several optimization algorithms in parallel, for example several independent CMA-ES optimizations on the same model. This is useful in practice when you have a heuristic method such as a genetic algorithm and need to run several instances of it with different parameters.
Here is an example of my problem. If I use the same script as in the BoTorch tutorial (https://botorch.org/tutorials/fit_model_with_torch_optimizer):
import math
import torch
import numpy as np
# use a GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
dtype = torch.float
# use regular spaced points on the interval [0, 1]
train_X = torch.linspace(0, 1, 15, dtype=dtype, device=device)
# training data needs to be explicitly multi-dimensional
train_X = train_X.unsqueeze(1)
# sample observed values and add some synthetic noise
train_Y = torch.sin(train_X * (2 * math.pi)) + 0.15 * torch.randn_like(train_X)
from botorch.models import SingleTaskGP
from gpytorch.constraints import GreaterThan
model = SingleTaskGP(train_X=train_X, train_Y=train_Y)
model.likelihood.noise_covar.register_constraint("raw_noise", GreaterThan(1e-5))
from gpytorch.mlls import ExactMarginalLogLikelihood
mll = ExactMarginalLogLikelihood(likelihood=model.likelihood, model=model)
# set mll and all submodules to the specified dtype and device
mll = mll.to(train_X)
from torch.optim import SGD
optimizer = SGD([{'params': model.parameters()}], lr=0.1)
NUM_EPOCHS = 150
model.train()
for epoch in range(NUM_EPOCHS):
    # clear gradients
    optimizer.zero_grad()
    # forward pass through the model to obtain the output MultivariateNormal
    output = model(train_X)
    # Compute negative marginal log likelihood
    loss = - mll(output, model.train_targets)
    # back prop gradients
    loss.backward()
    # print every 10 iterations
    if (epoch + 1) % 10 == 0:
        print(
            f"Epoch {epoch+1:>3}/{NUM_EPOCHS} - Loss: {loss.item():>4.3f} "
            f"lengthscale: {model.covar_module.base_kernel.lengthscale.item():>4.3f} "
            f"noise: {model.likelihood.noise.item():>4.3f}"
        )
    optimizer.step()
# set model (and likelihood) to evaluation mode
model.eval()
I can do quick evaluations like:
x = torch.from_numpy(np.expand_dims([0.5], axis=1)).float()
print(model(x))
which gives me MultivariateNormal(loc: tensor([0.0027], grad_fn=<ViewBackward>))
However, to parallelize evaluations like that one, this does not work:
import torch.multiprocessing as multiprocessing
def funcParallel(x):
    print(x)
    x = torch.from_numpy(np.expand_dims([x], axis=1)).float()
    print(model(x))

processes = []
X = [[0.0], [0.5]]
for i, x in enumerate(X):
    p = multiprocessing.Process(target=funcParallel, args=(x,))
    p.start()
    processes.append(p)
for p in processes:
    p.join()
because it just waits forever to evaluate model(x)
Hopefully you can reproduce the same behavior. This is just a quick example. In this case it could be solved by sending all the points together with x = torch.from_numpy(np.asarray(X)).float(); print(model(x)). However, I cannot do that in the real case, because the evaluations belong to different optimization workflows.
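One option that sidesteps both the pickling error and the hanging child processes is to run the independent optimizations in threads rather than processes, since threads share the fitted model in memory. The sketch below is not from this thread: it uses only the standard library, and `surrogate_mean`, `random_search`, and `run_all` are hypothetical names, with `surrogate_mean` standing in for a call to the fitted model. Whether concurrent calls into a GPyTorch model are thread-safe, and whether they give a real speedup, are assumptions that would need checking.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def surrogate_mean(x):
    # Hypothetical stand-in for the fitted model's posterior mean;
    # in the real case this would wrap a call to the model itself.
    return math.sin(2 * math.pi * x)


def random_search(seed, step, n_iter=200):
    """One independent optimization run: simple hill climbing on [0, 1]."""
    rng = random.Random(seed)  # per-run RNG, so runs do not share state
    best_x = rng.random()
    best_y = surrogate_mean(best_x)
    for _ in range(n_iter):
        # propose a nearby point, clipped to the search interval
        cand = min(1.0, max(0.0, best_x + rng.gauss(0.0, step)))
        y = surrogate_mean(cand)
        if y > best_y:
            best_x, best_y = cand, y
    return {"seed": seed, "step": step, "x": best_x, "y": best_y}


def run_all(configs, max_workers=4):
    # Threads share the surrogate in memory, so nothing has to be pickled.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(random_search, **cfg) for cfg in configs]
        # collect in submission order so results line up with configs
        return [f.result() for f in futures]


configs = [{"seed": s, "step": st} for s, st in [(0, 0.05), (1, 0.1), (2, 0.2)]]
results = run_all(configs)
for r in results:
    print(r)
```

Each run gets its own seed and step size, which mirrors the use case of running several optimizers with different parameters against one model.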
I haven't done much multiprocessing with torch, but I assume the same gotchas (and more) as in regular python apply, which is probably what is happening here. I usually use the Pool approach when doing multiprocessing in python, so I'd try to figure out the pickling error as a first step. This may be related to https://github.com/cornellius-gp/gpytorch/issues/907
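As a starting point for tracking down a pickling error like the one above, a small helper can walk an object and report which nested attribute fails to pickle. This is a generic, standard-library sketch, not a BoTorch or PyTorch utility: `find_unpicklable` is a hypothetical name, and the `SimpleNamespace` object is only a stand-in for a fitted model. Real torch modules keep submodules and parameters in internal dictionaries, which is why the helper also descends into plain dicts; whether that is enough to pinpoint the offending torch function is an assumption.

```python
import pickle
from types import SimpleNamespace


def find_unpicklable(obj, path="obj", _seen=None):
    """Return (path, error message) pairs for nested values that fail to pickle."""
    _seen = set() if _seen is None else _seen
    if id(obj) in _seen:
        return []  # already visited, avoids infinite loops on cycles
    _seen.add(id(obj))
    try:
        pickle.dumps(obj)
        return []  # this object pickles fine, no need to look inside
    except Exception as exc:
        message = f"{type(exc).__name__}: {exc}"
    # descend into dict entries or instance attributes, if there are any
    if isinstance(obj, dict):
        children = {f"{path}[{k!r}]": v for k, v in obj.items()}
    else:
        attrs = getattr(obj, "__dict__", None)
        attrs = attrs if isinstance(attrs, dict) else {}
        children = {f"{path}.{k}": v for k, v in attrs.items()}
    problems = []
    for child_path, value in children.items():
        problems += find_unpicklable(value, child_path, _seen)
    # if no child explains the failure, blame the object itself
    return problems or [(path, message)]


# Stand-in "model": the lambda is the only part that cannot be pickled.
fake_model = SimpleNamespace(
    train_x=[0.0, 0.5, 1.0],
    kernel=SimpleNamespace(lengthscale=0.3, transform=lambda v: v),
)
problems = find_unpicklable(fake_model)
for where, why in problems:
    print(where, "->", why)
```

Running it on the object that is passed to the Pool workers (the model, or the function that closes over it) narrows the search to one attribute path instead of a single opaque error.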
Closing this since it has been inactive for 2+ years