NVlabs / PWC-Net

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume, CVPR 2018 (Oral)

optical flow feature and optical flow upsampling #50

Closed xiayizhan2017 closed 5 years ago

xiayizhan2017 commented 5 years ago

This is a very elegant work. I have some doubts, and I hope to get help.

1. The highest-resolution features used in the program are at 1/4 scale. Why not use features at the original scale?
2. Why does the optical flow upsampling use nn.ConvTranspose2d rather than linear interpolation? Since that layer has learnable weights, won't the upsampled flow change as those weights are trained?

Thanks!

jrenzhile commented 5 years ago

@xiayizhan2017 For 1: I think otherwise the number of parameters in the network would become unnecessarily huge, and thus harder to train. For 2: I do not fully understand the question, but nn.ConvTranspose2d() is a standard way of doing deconvolution in an encoder-decoder network.
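For reference, the output spatial size of a transposed convolution follows `out = (in - 1) * stride - 2 * padding + kernel + output_padding` (the formula documented for nn.ConvTranspose2d). With kernel = 4, stride = 2, padding = 1 (a common 2x-upsampling configuration, used here purely for illustration), the output is exactly twice the input. A quick sanity check:

```python
def conv_transpose2d_out(size, kernel, stride, padding, output_padding=0):
    """Output spatial size of a transposed convolution (PyTorch formula)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

# kernel=4, stride=2, padding=1 doubles the resolution exactly,
# including for odd input sizes
for s in (4, 7, 16, 112):
    assert conv_transpose2d_out(s, kernel=4, stride=2, padding=1) == 2 * s

print(conv_transpose2d_out(16, 4, 2, 1))  # 32
```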

sniklaus commented 5 years ago

As for 2, different types of upsampling can be implemented using transposed convolutions, one just needs to initialize the kernel appropriately and fix its weights. For a possible kernel that performs bilinear upsampling, you can have a look at: http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1BilinearFiller.html
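To make that concrete, here is a NumPy sketch (not code from PWC-Net) that builds the Caffe-style bilinear kernel for factor-2 upsampling and applies it as a 1-D transposed convolution by hand. In the interior of the signal it reproduces linear interpolation exactly; only the borders differ, because the transposed convolution zero-pads while an interpolation routine replicates the edge.

```python
import numpy as np

def bilinear_kernel_1d(kernel_size):
    """1-D bilinear upsampling weights, following Caffe's BilinearFiller."""
    factor = (kernel_size + 1) // 2
    center = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
    return 1.0 - np.abs(np.arange(kernel_size) - center) / factor

def deconv1d(x, weight, stride=2, padding=1):
    """Transposed convolution: zero-stuffing + full convolution + crop."""
    stuffed = np.zeros((len(x) - 1) * stride + 1)
    stuffed[::stride] = x
    full = np.convolve(stuffed, weight)           # zero-padded 'full' conv
    return full[padding:len(full) - padding]

w = bilinear_kernel_1d(4)                # [0.25, 0.75, 0.75, 0.25]
y = deconv1d(np.array([1.0, 3.0]), w)    # upsample [1, 3] by a factor of 2
print(w)   # [0.25 0.75 0.75 0.25]
print(y)   # [0.75 1.5  2.5  2.25] -- interior values (1.5, 2.5) match linear interpolation
```

The 2-D kernel is the outer product `w[:, None] * w[None, :]`; copying it into a ConvTranspose2d weight (one kernel per channel) and freezing it gives fixed bilinear upsampling, while leaving it trainable recovers the learnable variant.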

deqings commented 5 years ago

For 2, we tested both and found that using fixed bilinear upsampling, the training converges faster in the beginning. However, with enough iterations, ConvTranspose2d catches up and is slightly better. My guess is that making the weights learnable results in a slightly larger-capacity model.