gdlg / pytorch_compact_bilinear_pooling

Compact Bilinear Pooling for PyTorch
BSD 3-Clause "New" or "Revised" License

Multi GPU support #4

Closed YanWang2014 closed 6 years ago

YanWang2014 commented 6 years ago

I modified

    class CompactBilinearPooling(nn.Module):
        def forward(self, x, y):
            return CompactBilinearPoolingFn.apply(
                self.sketch1.h, self.sketch1.s,
                self.sketch2.h, self.sketch2.s,
                self.output_size, x, y)

to

    def forward(self, x):
        x = x.permute(0, 2, 3, 1)  # NCHW to NHWC
        y = Variable(x.data.clone())
        out = CompactBilinearPoolingFn.apply(
            self.sketch1.h, self.sketch1.s,
            self.sketch2.h, self.sketch2.s,
            self.output_size, x, y).permute(0, 3, 1, 2)  # back to NCHW
        out = nn.functional.adaptive_avg_pool2d(out, 1)  # N, C, 1, 1
        # element-wise signed square root and instance-wise L2 normalization
        out = (torch.sqrt(nn.functional.relu(out))
               - torch.sqrt(nn.functional.relu(-out))) / torch.norm(out, 2, 1, True)
        return out

This makes the compact pooling layer easier to plug into PyTorch CNNs:

    model.avgpool = CompactBilinearPooling(input_C, input_C, bilinear['dim'])
    model.fc = nn.Linear(int(model.fc.in_features / input_C * bilinear['dim']), num_classes)
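
For context, a fuller version of that wiring might look like this. This is only a sketch: it assumes a torchvision ResNet-50 backbone (so input_C = 2048), the import path of this repo's module, and placeholder values for bilinear['dim'] and num_classes.

    import torch.nn as nn
    from torchvision import models
    from compact_bilinear_pooling import CompactBilinearPooling  # import path assumed

    # Assumed settings for illustration
    input_C = 2048              # channels coming out of the ResNet-50 backbone
    bilinear = {'dim': 8192}    # compact bilinear output dimension
    num_classes = 200           # placeholder

    model = models.resnet50(pretrained=True)
    # Replace global average pooling with the (modified) compact bilinear layer...
    model.avgpool = CompactBilinearPooling(input_C, input_C, bilinear['dim'])
    # ...and resize the classifier to match the new feature dimension.
    model.fc = nn.Linear(int(model.fc.in_features / input_C * bilinear['dim']), num_classes)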

However, when I run this on multiple GPUs, I get the following error:

    Traceback (most recent call last):
      File "train3_bilinear_pooling.py", line 400, in <module>
        run()
      File "train3_bilinear_pooling.py", line 219, in run
        train(train_loader, model, criterion, optimizer, epoch)
      File "train3_bilinear_pooling.py", line 326, in train
        return _each_epoch('train', train_loader, model, criterion, optimizer, epoch)
      File "train3_bilinear_pooling.py", line 270, in _each_epoch
        output = model(input_var)
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 319, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 67, in forward
        replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 72, in replicate
        return replicate(module, device_ids)
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/nn/parallel/replicate.py", line 19, in replicate
        buffer_copies = comm.broadcast_coalesced(buffers, devices)
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/cuda/comm.py", line 55, in broadcast_coalesced
        for chunk in _take_tensors(tensors, buffer_size):
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/_utils.py", line 232, in _take_tensors
        if tensor.is_sparse:
      File "/home/member/fuwang/opt/anaconda/lib/python3.6/site-packages/torch/autograd/variable.py", line 68, in __getattr__
        return object.__getattribute__(self, name)
    AttributeError: 'Variable' object has no attribute 'is_sparse'

Do you have any ideas?

YanWang2014 commented 6 years ago

Addendum: it runs normally when using a single GPU.

gdlg commented 6 years ago

Thank you very much for reporting this bug.

I have fixed the multi-GPU support. It was a small conflict between DataParallel and the h, s parameters, which were stored as Variable buffers.
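
For readers hitting the same traceback in their own layers, the general pattern that keeps DataParallel.replicate happy is to register h (random indices) and s (random signs) as plain tensor buffers. A minimal sketch with a hypothetical class name, not the actual commit:

    import torch
    import torch.nn as nn

    class CountSketchExample(nn.Module):
        """Illustrative only: store the sketch's h and s as plain tensor buffers."""
        def __init__(self, input_size, output_size):
            super(CountSketchExample, self).__init__()
            # Plain tensors registered as buffers are broadcast to every replica
            # by nn.DataParallel without tripping _take_tensors.
            self.register_buffer('h', torch.randint(0, output_size, (input_size,)))
            self.register_buffer('s', torch.randint(0, 2, (input_size,)).float() * 2 - 1)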

I have changed the prototype of forward to make y an optional argument.
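
The idea looks roughly like this (a sketch, not the exact diff): when y is omitted, the layer pools x with itself.

    def forward(self, x, y=None):
        # Pool x with itself when no second input is given.
        if y is None:
            y = x
        return CompactBilinearPoolingFn.apply(
            self.sketch1.h, self.sketch1.s,
            self.sketch2.h, self.sketch2.s,
            self.output_size, x, y)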

I can also add a dim option to the constructor; however, we can't get rid of the permutation, because the FFT requires the channels to be the last dimension of a contiguous tensor.
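
A hypothetical dim option (not in the library as-is) would therefore still permute internally, e.g.:

    def forward(self, x, y=None):
        if y is None:
            y = x
        # Hypothetical dim handling: the FFT still needs the channels as the
        # last dimension of a contiguous tensor, so move them there and back.
        x = x.transpose(self.dim, -1).contiguous()
        y = y.transpose(self.dim, -1).contiguous()
        out = CompactBilinearPoolingFn.apply(
            self.sketch1.h, self.sketch1.s, self.sketch2.h, self.sketch2.s,
            self.output_size, x, y)
        return out.transpose(self.dim, -1).contiguous()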

I am not too keen on adding the activation and normalization directly inside the CBP layer because, unlike in TensorFlow, none of the layers in torch.nn work that way. I think it is cleaner to put them in a separate module and plug everything together with nn.Sequential.
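
For instance (a sketch with a made-up module name and placeholder dimensions, assuming the pooled features are channel-first), the post-processing could live in its own module and be chained with nn.Sequential:

    import torch.nn as nn

    class SignedSqrtL2Norm(nn.Module):
        """Element-wise signed square root, then instance-wise L2 normalization."""
        def forward(self, x):
            x = x.sign() * x.abs().sqrt()
            # Normalize over the channel/feature dimension (dim=1).
            return x / (x.norm(p=2, dim=1, keepdim=True) + 1e-10)

    # Hypothetical wiring; input_C and 8192 are placeholders.
    pooling = nn.Sequential(
        CompactBilinearPooling(input_C, input_C, 8192),
        SignedSqrtL2Norm(),
    )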

YanWang2014 commented 6 years ago

Great! Thank you!