oeway / pytorch-deform-conv

PyTorch implementation of Deformable Convolution
MIT License

indexing with a detached variable #1

Closed ncullen93 closed 6 years ago

ncullen93 commented 7 years ago

Hi oeway,

Any chance you can help me understand your code? On this line, you index the input with a detached variable, so I'm wondering how the gradient propagates backward through vals_lt, etc. It seems like mapped_vals would not have any parent nodes that require gradients. Does that make sense? When I try to do a similar thing here for a spatial transformer network, I get a "no nodes require gradients" error.

Do you get around this by freezing the entire network? I feel like you would get the same error if the network wasn't frozen. Any insight you can provide into this would be appreciated.

EDIT: Ok, I get that the gradient propagates through the coords_offset_lt value... can you describe where you got this interpolation algorithm from? Thanks :)

oeway commented 7 years ago

Hi, as far as I understand, the gradient comes from the bilinear operator: mapped_vals --> coords_offset_lt --> coords_lt and coords. You can find the following sentence in the paper:

The gradients enforced on the deformable convolution module can be back-propagated through the bilinear operations in Eq. (3) and Eq. (4).
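
For readers without the paper at hand, the bilinear operations that the quoted sentence refers to can be sketched as follows. The notation is an assumption based on my reading of the deformable convolution paper and may differ in detail from the original: p is a fractional sampling location, q ranges over integer grid locations, and x is the input feature map.

```latex
\begin{aligned}
x(p) &= \sum_{q} G(q, p)\, x(q), \\
G(q, p) &= g(q_x, p_x)\, g(q_y, p_y), \qquad g(a, b) = \max\bigl(0,\ 1 - |a - b|\bigr), \\
\frac{\partial x(p)}{\partial p_x} &= \sum_{q} \frac{\partial g(q_x, p_x)}{\partial p_x}\, g(q_y, p_y)\, x(q),
\qquad
\frac{\partial g(a, b)}{\partial b} =
\begin{cases}
\operatorname{sign}(a - b), & 0 < |a - b| < 1, \\
0, & |a - b| > 1.
\end{cases}
\end{aligned}
```

Only the four grid neighbours of p have non-zero weight, and the dependence on p sits entirely in the weights g, not in the choice of which q is read. That is why the integer indices can be detached without cutting off the gradient.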

Actually, you can't get the gradient w.r.t. the indices for index_select in PyTorch.
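
To make the point above concrete, here is a minimal pure-Python sketch (deliberately not using PyTorch, and not the repository's actual code). The function name `bilinear_sample` and its variable names are hypothetical and only loosely mirror `vals_lt` and `coords_offset_lt` from the thread. The integer indices come from `floor` and carry no gradient, while the derivative with respect to the sampling coordinates comes from the fractional offsets used as interpolation weights.

```python
import math


def bilinear_sample(img, y, x):
    """Bilinearly sample img (a list of rows) at fractional (y, x).

    Returns (value, d_value/dy, d_value/dx). Assumes (y, x) lies strictly
    inside the image so that all four neighbours exist.
    """
    # Integer corner indices: piecewise constant in (y, x), so they
    # contribute nothing to the gradient (the "detached" part).
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = y0 + 1, x0 + 1

    # Fractional offsets from the top-left corner: this is where the
    # differentiable dependence on (y, x) lives.
    wy, wx = y - y0, x - x0

    # Values gathered at the four corners (left/right, top/bottom).
    v_lt, v_rt = img[y0][x0], img[y0][x1]
    v_lb, v_rb = img[y1][x0], img[y1][x1]

    # Interpolate along x on the top and bottom rows, then along y.
    top = v_lt + (v_rt - v_lt) * wx
    bottom = v_lb + (v_rb - v_lb) * wx
    value = top + (bottom - top) * wy

    # Analytic derivatives with respect to the sampling coordinates.
    d_dy = bottom - top
    d_dx = (v_rt - v_lt) * (1 - wy) + (v_rb - v_lb) * wy
    return value, d_dy, d_dx


img = [[0.0, 1.0], [2.0, 5.0]]
value, d_dy, d_dx = bilinear_sample(img, 0.25, 0.5)
print(value, d_dy, d_dx)  # 1.25 3.0 1.5
```

In an autograd framework the same structure applies: the gathered corner values are constants with respect to the coordinates, and the gradient reaches the offsets through the multiplications by `wx` and `wy`.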