yzcjtr / GeoNet

Code for GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose (CVPR 2018)
MIT License

Input of flownet #19

Closed by SeokjuLee 6 years ago

SeokjuLee commented 6 years ago

Hi, thank you for publishing the GeoNet code. I have a minor question about the input of ResFlowNet. In the code, the input size of the flownet is (4*b) x h x w x 12, where 'b' is the batch size and 12 is the number of channels (tgt, src, wrp, flow, err, stacked). The batch size is multiplied by 4, I think because of the forward and backward directions. Is there a specific reason why you didn't concatenate everything along the channel dimension only? Wouldn't a 21-channel input (i.e., b x h x w x 21) be enough for the flownet?

21 channels: I_t, I_{t-1}, I_{t+1}, W_{t-1}, W_{t+1}, F_{t,t-1}, F_{t,t+1}, E_{t-1}, E_{t+1}, where I: images (3ch), W: warped images (3ch), F: flows (2ch), E: similarity errors (1ch)
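The two channel counts above can be checked with a little arithmetic. This is only a sketch in plain Python: the per-tensor channel widths come from the question, while the function names `pairwise_channels` and `stacked_channels` are hypothetical and do not appear in the GeoNet code.

```python
# Channel widths as listed in the question: images and warped images have
# 3 channels, flows have 2, similarity-error maps have 1.
CHANNELS = {"image": 3, "warped": 3, "flow": 2, "error": 1}


def pairwise_channels():
    """Channels for one (target, source) pair: tgt, src, wrp, flow, err."""
    return (CHANNELS["image"]      # target image
            + CHANNELS["image"]    # source image
            + CHANNELS["warped"]   # source warped towards the target
            + CHANNELS["flow"]     # flow between the pair
            + CHANNELS["error"])   # similarity error of the warp


def stacked_channels(num_source):
    """Channels when the target and every source share one channel axis."""
    # Each source contributes its image, warped image, flow and error map.
    per_source = sum(CHANNELS.values())
    return CHANNELS["image"] + num_source * per_source


print(pairwise_channels())   # 12
print(stacked_channels(2))   # 21
```

With two source frames (t-1 and t+1) the stacked layout gives the 21 channels proposed here, while the per-pair layout gives the 12 channels seen in the code.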

yzcjtr commented 6 years ago

Hi, good point! But in that case you would have to feed 3 frames into the ResFlowNet, which is not flexible if seq_len is anything other than 3. We wanted an architecture that is invariant to the seq_len setting.
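A rough way to see the flexibility argument is to compare how the two input layouts change with seq_len. The sketch below is plain Python and rests on two assumptions that are not confirmed in this thread: that the batch factor is 2 directions times (seq_len - 1) source frames, which gives the factor 4 for seq_len = 3, and that the channel-stacked proposal would generalize as 3 + 9 * (seq_len - 1) channels. The function names and the values of b, h and w are illustrative only.

```python
def pairwise_input_shape(b, h, w, seq_len):
    """Per-pair layout: pairs are folded into the batch axis."""
    num_source = seq_len - 1   # every frame except the target
    directions = 2             # forward and backward (assumed)
    # tgt (3) + src (3) + wrp (3) + flow (2) + err (1) = 12 channels
    return (directions * num_source * b, h, w, 12)


def stacked_input_shape(b, h, w, seq_len):
    """Channel-stacked layout: all frames share the channel axis."""
    num_source = seq_len - 1
    # target image (3) + per source: image, warped, flow, error (9)
    return (b, h, w, 3 + 9 * num_source)


for seq_len in (3, 5):
    print(seq_len,
          pairwise_input_shape(4, 128, 416, seq_len),
          stacked_input_shape(4, 128, 416, seq_len))
```

Under these assumptions the per-pair layout always presents 12 channels to the first convolution, so the same weights work for any seq_len and only the batch axis grows. The channel-stacked layout changes the channel count with seq_len (21 for 3 frames, 39 for 5), so the first layer would have to be rebuilt for each setting.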

SeokjuLee commented 6 years ago

I got it now. Thanks!