pytorch / contrib

Implementations of ideas from recent papers

swa, type mismatch #19

Closed: ddeeppnneett closed this 5 years ago

ddeeppnneett commented 5 years ago

I am using swa.py as shown below. Is this the proper way to use it?

optimizer0 = optim.SGD(model.parameters(), lr=1e-4, momentum=0.9,
                       weight_decay=weight_decay)
optimizer = SWA(optimizer0)

for epoch in range(num_epoch):
    train(train_loader, model, criterion, optimizer, epoch)
    ###########
    # validate with the averaged weights, then restore the SGD weights
    optimizer.swap_swa_sgd()
    avg_loss, avg_acc = validate(val_loader, model, criterion)
    optimizer.swap_swa_sgd()
    ###########
    if epoch == 0:
        optimizer.bn_update(train_loader, model, device='cuda')
    if get_learning_rate(optimizer) < 5e-7 or is_lowest_loss:
        if epoch < 8:
            torch.save(state, './model/checkpoint_%s_%s.pth.tar' % (file_name, epoch))
        else:
            optimizer.bn_update(train_loader, model, device='cuda')
            optimizer.update_swa()
            optimizer.swap_swa_sgd()
            torch.save(state, './model/checkpoint_%s_%s.pth.tar' % (file_name, epoch))
            optimizer.swap_swa_sgd()
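For comparison, the manual-mode pattern in this repo's README looks roughly like the sketch below (the loop bounds, the i > 10 / i % 5 schedule, and names such as loss_fn, input, target, and train_loader are placeholders):

import torch
from torchcontrib.optim import SWA

base_opt = torch.optim.SGD(model.parameters(), lr=0.1)
opt = SWA(base_opt)                      # manual mode: no swa_start/swa_freq/swa_lr

for i in range(100):
    opt.zero_grad()
    loss_fn(model(input), target).backward()
    opt.step()
    if i > 10 and i % 5 == 0:
        opt.update_swa()                 # fold current weights into the running average

opt.swap_swa_sgd()                       # put the averaged weights into the model, once
opt.bn_update(train_loader, model, device='cuda')  # recompute BatchNorm running stats

The notable differences from the loop above: swap_swa_sgd() runs once at the end rather than every epoch, and bn_update() runs after the swap, so the BatchNorm statistics are recomputed for the averaged weights that will actually be used.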

And when calling bn_update(), I get this traceback:

   swa.py", line 302
    model(input)
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ts/code05_bin_shuff/snetv27.py", line 118, in forward
    x = self.conv1(x)
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/anaconda3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same

I have not used PyTorch in a long time, and this is the first time I have seen torch.cuda.DoubleTensor.
Calling model(input.type(torch.cuda.FloatTensor)) fixes the error, but how does it happen?

PS: There are also lots of warnings during evaluation in model.eval(), like:

SWA wasn't applied to param {}; skipping it

Should I fix this, or can I ignore it?
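If these come from torchcontrib's swap_swa_sgd(), the warning fires for every parameter that has no SWA buffer yet, i.e. whenever the weights are swapped before the first update_swa() call, which the loop above does on early epochs. A minimal sketch that reproduces it (illustrative only):

import torch
from torchcontrib.optim import SWA

model = torch.nn.Linear(2, 2)
opt = SWA(torch.optim.SGD(model.parameters(), lr=0.1))

# No update_swa() has been called, so no parameter has an SWA buffer yet:
opt.swap_swa_sgd()   # warns "SWA wasn't applied to param ...; skipping it" per param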

soumith commented 5 years ago

The problem is not with SWA. I think your input itself is a Double. Maybe you converted a tensor from a NumPy array with dtype float64?
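A common way this happens: NumPy arrays default to float64, and torch.from_numpy preserves that dtype, while module weights default to float32. A minimal sketch reproducing the mismatch (names are illustrative):

import numpy as np
import torch

arr = np.random.rand(1, 3, 8, 8)              # NumPy defaults to float64
x = torch.from_numpy(arr)                     # dtype preserved -> torch.float64
                                              # (.cuda() makes it torch.cuda.DoubleTensor)

model = torch.nn.Conv2d(3, 4, kernel_size=3)  # weights are float32 by default
# model(x) raises: Input type (torch.DoubleTensor) and weight type
#                  (torch.FloatTensor) should be the same
y = model(x.float())                          # cast the input to float32 to fix it

Rather than casting at call time, it is usually cleaner to convert once when building the dataset, e.g. arr.astype(np.float32) or torch.from_numpy(arr).float().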