Closed yubobao27 closed 1 year ago
@yubobao27 Thanks for your suggestion. I'd be open to a PR but unfortunately I don't have the time to design/implement this myself right now.
Sure. How would one go about it if it can be done? If you describe a way maybe we can contribute.
Hi @yubobao27 - I think the best place to implement this is actually on the user side. Users can build GPU parallelism into their objective function in different ways.
For example, suppose your objective is to minimize the 2-norm in the feature space of a neural net, and your input x is a minibatch:
import torch
import torch.nn as nn
from torchmin import minimize

net = nn.Sequential(
    nn.Linear(200, 512),
    nn.Tanh(),
    nn.Linear(512, 512)
).cuda()

x0 = torch.randn(80, 200).cuda()

def obj(x):
    y = net(x)
    return y.norm(dim=1).mean()

result = minimize(obj, x0, method='bfgs')
To parallelize computations across the minibatch, users can make a simple modification:
def obj(x):
    # data_parallel scatters the minibatch across available GPUs
    # and gathers the outputs back onto the default device
    y = nn.parallel.data_parallel(net, x)
    return y.norm(dim=1).mean()
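Relatedly, if the concern is a single GPU running out of memory on a very large batch (rather than throughput), one option on the user side is to evaluate the objective chunk by chunk with activation checkpointing, so peak activation memory scales with the chunk size instead of the full batch. This is a minimal sketch, not part of torchmin's API; the network, sizes, and chunk_size are illustrative, and it runs on CPU here for simplicity (add .cuda() as in the examples above):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Illustrative stand-in for the net above
net = nn.Sequential(
    nn.Linear(200, 512),
    nn.Tanh(),
    nn.Linear(512, 512)
)

def obj(x, chunk_size=16):
    # checkpoint() stores only each chunk's input during the forward pass
    # and recomputes that chunk's activations during backward, so at most
    # one chunk's worth of activations is live at a time.
    norms = [checkpoint(net, c, use_reentrant=False).norm(dim=1)
             for c in x.split(chunk_size)]
    return torch.cat(norms).mean()

x0 = torch.randn(80, 200, requires_grad=True)
loss = obj(x0)
loss.backward()  # gradients flow through the checkpointed chunks
```

The trade-off is extra compute (each chunk's forward pass runs twice), and for Hessian-vector-product-based methods like trust_ncg the memory savings may be smaller, since second-order backward passes through checkpointed regions still recompute and store intermediate state.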
If you have other thoughts, or other use cases in mind, I'd be curious to hear.
Can multiple GPUs be utilized, given that neither the input nor the model can currently be parallelized? Running trust_ncg on a model with a multi-million-row input exceeds single-GPU memory.