baowenbo / DAIN

Depth-Aware Video Frame Interpolation (CVPR 2019)
https://sites.google.com/view/wenbobao/dain
MIT License

Questions about the input for frame synthesis network #34

Closed asheroin closed 4 years ago

asheroin commented 4 years ago

As described in Figure 3 of the CVPR paper, the input to the frame synthesis network consists of five components: the raw interpolation kernels, the projected flows, the warped depth maps, the warped frames, and the warped context features. However, in lines 177 to 181 of DAIN_slowmotion.py, the input to rectifyNet does not seem to match that description:

rectify_input = torch.cat((cur_output_temp, ref0, ref2, cur_offset_output[0], cur_offset_output[1], cur_filter_output[0], cur_filter_output[1], ctx0, ctx2), dim=1)

It seems that the actual input to the frame synthesis network does not include the warped depth maps, and instead uses a blended result of the warped frames.

So which one is the correct setting for the proposed method? Could you please give a numerical comparison of these two settings?

laomao0 commented 4 years ago

See https://github.com/baowenbo/DAIN/blob/master/networks/DAIN.py, lines 133-138: the context features already include the depth information.
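To make the answer concrete, here is a minimal sketch of the channel layout being discussed. The exact channel counts (context width, kernel size) are hypothetical placeholders, not the values from the paper; the point is only to show that when the depth map is concatenated onto the context features before warping, `ctx0`/`ctx2` already carry depth, so the rectifyNet input needs no separate warped-depth tensor:

```python
import torch

B, H, W = 1, 8, 8
frame0 = torch.randn(B, 3, H, W)      # warped frame from I0
frame2 = torch.randn(B, 3, H, W)      # warped frame from I1
ctx_feat = torch.randn(B, 64, H, W)   # raw context features (64 ch is an assumption)
log_depth = torch.randn(B, 1, H, W)   # per-pixel (log) depth

# Depth is fused into the context tensor, so ctx0/ctx2 have 64 + 1 channels:
ctx0 = torch.cat((ctx_feat, log_depth), dim=1)
ctx2 = torch.cat((ctx_feat, log_depth), dim=1)
print(ctx0.shape)  # torch.Size([1, 65, 8, 8])

# Channel-wise concatenation mirroring the rectify_input line quoted above:
# blended frame, two warped frames, two flows, two kernel maps, two contexts.
blended = 0.5 * frame0 + 0.5 * frame2  # stand-in for cur_output_temp
flow0 = torch.randn(B, 2, H, W)
flow2 = torch.randn(B, 2, H, W)
kernel0 = torch.randn(B, 16, H, W)     # hypothetical interpolation-kernel channels
kernel2 = torch.randn(B, 16, H, W)

rectify_input = torch.cat(
    (blended, frame0, frame2, flow0, flow2, kernel0, kernel2, ctx0, ctx2), dim=1
)
print(rectify_input.shape)  # torch.Size([1, 175, 8, 8])
```

With these assumed widths the input is 3 + 3 + 3 + 2 + 2 + 16 + 16 + 65 + 65 = 175 channels; the depth contribution rides along inside the two 65-channel context tensors.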