NVlabs / PWC-Net

PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume, CVPR 2018 (Oral)

Thanks for sharing and some little doubts #1

Closed RanhaoKang closed 6 years ago

RanhaoKang commented 6 years ago

Hi all,

Thanks for sharing your code! I've been trying to understand your work for a couple of weeks and have just read the released code. Could you please answer the following questions? I'm not familiar with Caffe, so I haven't read your Caffe code yet. Sorry if some of these details are already clarified in the Caffe version.

  1. Why do you divide the ground truth by 20? Dividing the GT at each level by that level's scaling factor seems to make more sense. Dividing by 20 makes the problem look more like plain regression than something grounded in geometry, doesn't it?

  2. What's the difference between PWC-Net and PWC-Net_ROB?

deqings commented 6 years ago

Hi RanhaoKang,

Thank you for your interest in our work.

1. Why do you divide the ground truth by 20? We followed FlowNet's practice of dividing the GT used as the supervision signal, so that we could reuse their learning rate schedules instead of searching for new ones. Note that the optical flow used for warping at each level is rescaled to match the spatial resolution of that particular pyramid level.

2. What's the difference between PWC-Net and PWC-Net_ROB? There are two major differences. First, PWC-Net_ROB is PWC-Net with a larger feature pyramid extractor (PWC-Net-feature-uparrow, second row in Table 5(a) of our CVPR 2018 paper). Second, its model parameters have been fine-tuned using the Sintel, KITTI, and HD1K data.
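To make the first point concrete, here is a minimal sketch of the scaling, not the released implementation. It assumes that pyramid level l has 1/2^l of the full resolution and that the network predicts flow in "GT divided by 20" units at every level; the names `DIV_FLOW`, `scaled_target`, and `warp_flow` are hypothetical.

```python
DIV_FLOW = 20.0  # divisor applied to the ground-truth flow, as described above


def scaled_target(gt_flow_px):
    """Scale a ground-truth flow vector (full-resolution pixels) for supervision."""
    return tuple(c / DIV_FLOW for c in gt_flow_px)


def warp_flow(pred_flow, level):
    """Convert a prediction in GT/20 units into pixels at pyramid `level`.

    Assumption: level l has 1/2**l of the full resolution, so a full-resolution
    displacement shrinks by the same factor at that level.
    """
    scale = DIV_FLOW / (2 ** level)
    return tuple(c * scale for c in pred_flow)


gt = (40.0, -10.0)           # displacement in full-resolution pixels
target = scaled_target(gt)   # what the loss would see under this assumption
for level in (5, 4, 3, 2):
    print(level, warp_flow(target, level))
```

Under these assumptions, the division by 20 only changes the units of the supervision signal. The flow that is actually used for warping is converted back to pixels at the resolution of each level.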

Best regards, Deqing