dontLoveBugs opened 5 years ago
You're right! I'll fix it.
Have you solved this problem yet?
@dontLoveBugs Hello, could you review my issue? I think the bilinear kernel is wrong.
You're right! I'll fix it.
A 'tuple' object cannot be modified. Your code just creates a generator.
I have searched online; grad_output cannot be modified. If you want to modify the grad of the input, you need to return the modified grad_input from the hook, like:

    def _set_lr(module, grad_input, grad_output):
        return (grad_input[i] * 0.1 for i in range(len(grad_input)))

You can try it. My question is: why change the p_conv gradients? Is it to avoid affecting the learning of the other feature extraction branch?
@XinZhangNLPR the error is because the backward hook expects a tuple, not a generator.
Your suggestion still returns a generator, not a tuple.
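For reference, the suggestion only needs a tuple() wrapper to satisfy the hook. A minimal sketch (the None guard is my assumption, since grad_input entries can be None):

```python
def _set_lr(module, grad_input, grad_output):
    # The hook must return a tuple, so wrap the generator in tuple().
    # Entries of grad_input can be None; pass those through unchanged.
    return tuple(g * 0.1 if g is not None else None for g in grad_input)
```

It would presumably be registered the same way as before, e.g. p_conv.register_backward_hook(_set_lr).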
You're right! I'll fix it.
It seems this bug has not been fixed yet.
You set the gradients of p_conv and m_conv to 0.1 times those of the other layers, but I find the gradients are unchanged after backward. I used the following code to test.
The gradient of p_conv is the same as grad_input, but I think the gradient of p_conv should be 0.1 times grad_input. Am I wrong?
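The original test snippet did not survive in the thread. A minimal stand-in sketch, using a plain nn.Conv2d in place of p_conv and the non-deprecated register_full_backward_hook API (the names conv and _set_lr here follow the discussion above, not the repo's exact code):

```python
import torch
import torch.nn as nn

def _set_lr(module, grad_input, grad_output):
    # Return a tuple (not a generator) to replace grad_input.
    return tuple(g * 0.1 if g is not None else None for g in grad_input)

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # stand-in for p_conv
handle = conv.register_full_backward_hook(_set_lr)

x = torch.randn(1, 3, 16, 16, requires_grad=True)
conv(x).sum().backward()
g_hooked = x.grad.clone()
w_hooked = conv.weight.grad.clone()

handle.remove()
x.grad = None
conv.weight.grad = None
conv(x).sum().backward()

print(torch.allclose(g_hooked, 0.1 * x.grad))      # True: input grad is rescaled
print(torch.allclose(w_hooked, conv.weight.grad))  # True: weight grad is unchanged
```

If that matches what you see, an unchanged weight gradient is expected: with a full backward hook, grad_input only covers the gradients with respect to the module's inputs, so the hook rescales what flows to earlier layers, not p_conv's own parameter gradients. Scaling the parameter update itself would instead be done with a separate parameter group in the optimizer.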