thomasjpfan / pytorch_refinenet

Pytorch Implementation of Refinenet

ValueError: optimizing a parameter that doesn't require gradients #3

Open cltdevelop opened 6 years ago

cltdevelop commented 6 years ago

Hello, thank you for sharing your code! When I run the RefineNet model as follows:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = '0'

import torch
from torch import optim
from torch.autograd import Variable
from pytorch_refinenet import RefineNet4Cascade

net = RefineNet4Cascade((3, 224), num_classes=24)
optimizer = optim.Adam(net.parameters(), lr=1e-4, weight_decay=1e-5)  # lr=1e-4

x_var = Variable(torch.randn(1, 3, 224, 224))
y = net(x_var)
print(y.size())
print(y)

I have encountered a bug like this:

Traceback (most recent call last):
  File "/home/yzl/cltdevelop/tian_landmark/V2/round2_landmark_v3/models/refinenet_4cascade.py", line 186, in <module>
    optimizer = optim.Adam(net.parameters(), lr=1e-4, weight_decay=1e-5)  # lr=1e-4
  File "/home/yzl/anaconda2/lib/python2.7/site-packages/torch/optim/adam.py", line 28, in __init__
    super(Adam, self).__init__(params, defaults)
  File "/home/yzl/anaconda2/lib/python2.7/site-packages/torch/optim/optimizer.py", line 61, in __init__
    raise ValueError("optimizing a parameter that doesn't "
ValueError: optimizing a parameter that doesn't require gradients
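
For reference, printing requires_grad for each parameter (assuming net is the model constructed above) shows which parameters the optimizer is complaining about:

for name, p in net.named_parameters():
    if not p.requires_grad:
        print(name)  # the frozen parameters that trip the optimizer check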

can you give me some advice? Thank you!

thomasjpfan commented 6 years ago

That error happens because the resnet backbone has frozen parameters, i.e. they do not change during training. I just updated this package to use PyTorch 0.4 and made the parameters() method return only the parameters that require gradients. This code snippet should work now:

net = RefineNet4Cascade((3, 224), num_classes=10)
opt = optim.Adam(net.parameters())

x = torch.randn(1, 3, 224, 224)
y = net(x)
...

In PyTorch 0.4, the Variable object is not needed anymore.
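
For anyone on an older version of this package (or of PyTorch), a common workaround, not specific to this repo, is to hand the optimizer only the trainable parameters yourself:

trainable_params = [p for p in net.parameters() if p.requires_grad]
optimizer = optim.Adam(trainable_params, lr=1e-4, weight_decay=1e-5)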

pravn commented 6 years ago

I wonder if we could unfreeze the params before using them, to get around this:

for p in module.parameters():
    p.requires_grad = True

thomasjpfan commented 6 years ago

In the __init__ method of RefineNet4Cascade, the Resnet layer is always frozen. As a fix, I override named_parameters so that it only yields trainable parameters, which makes it possible to pass net.parameters() directly to an optimizer.
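
A minimal, self-contained sketch of the same idea (TinySegNet here is a hypothetical stand-in, not the actual RefineNet4Cascade code):

import torch
import torch.nn as nn
from torch import optim


class TinySegNet(nn.Module):
    # hypothetical toy model standing in for RefineNet4Cascade
    def __init__(self):
        super(TinySegNet, self).__init__()
        self.backbone = nn.Conv2d(3, 8, 3, padding=1)  # plays the role of the resnet backbone
        self.head = nn.Conv2d(8, 10, 1)
        for p in self.backbone.parameters():
            p.requires_grad = False  # freeze the backbone, as RefineNet4Cascade does

    def named_parameters(self, *args, **kwargs):
        # parameters() iterates over named_parameters() under the hood, so filtering
        # here means net.parameters() only yields the trainable parameters
        for name, param in super(TinySegNet, self).named_parameters(*args, **kwargs):
            if param.requires_grad:
                yield name, param

    def forward(self, x):
        return self.head(self.backbone(x))


net = TinySegNet()
opt = optim.Adam(net.parameters())  # frozen backbone params are never handed to the optimizer
y = net(torch.randn(1, 3, 224, 224))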

If you want to unfreeze the Resnet layer, then the following is needed:

for p in module.parameters():  # module being the frozen resnet submodule
    p.requires_grad = True
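
After that, net.parameters() will include the resnet parameters again (they now require gradients), so the optimizer can simply be rebuilt the same way:

optimizer = optim.Adam(net.parameters(), lr=1e-4, weight_decay=1e-5)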

Another option would have been not to freeze the Resnet layer at all and to let users freeze it themselves.