JihyongOh / XVFI

[ICCV 2021, Oral 3%] Official repository of XVFI

some questions #5

Open KevenLee opened 3 years ago

KevenLee commented 3 years ago

Thanks for your wonderful work!

I have some questions:

  1. Have you compared it with RIFE?
  2. Does the method use explicit optical flow supervision?
  3. Some bad cases, from running "python main.py --gpu 0 --phase 'test_custom' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 5 --multiple 2 --custom_path ./test_img_dir/test4". Input image 1: 0111; input image 2: 0112; result: 0111_000.

rsjjdesj commented 3 years ago

The motion of the left hand is too large for an interpolation scheme to handle. Have you tried RIFE or SoftSplat on these input images? I believe they will have similar issues.

KevenLee commented 3 years ago

Yes, RIFE and SoftSplat have similar issues. I now want to interpolate ultra-low-frame-rate 1080p video (fewer than 10 frames per second). These videos are characterized by very large motion. Do you have any good ideas?

rsjjdesj commented 3 years ago

I think you would have to train on such a dataset and see how inference goes. @sniklaus (SoftSplat author), @JihyongOh, and @hzwer (RIFE author) can comment more on how to solve this.

sniklaus commented 3 years ago

The main challenge in video frame interpolation is dealing with large inter-frame motion (motion magnitude in terms of pixels). The motion can have a large magnitude for various reasons: high-resolution footage, fast motion, a low input frame rate, and others. Note that a low input frame rate alone doesn't mean much: if the motion is small as well, it is still perfectly doable to interpolate low-frame-rate footage. Having said this, you (@KevenLee) may just be reaching the limit of what current state-of-the-art video frame interpolation can do, because the inter-frame motion is too large.
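
To make the point about pixel magnitude concrete, here is a rough back-of-the-envelope sketch. The helper name `inter_frame_motion_px` and the idea of expressing speed as "frame widths per second" are illustrative assumptions, not something from any of the methods discussed here; the sketch only shows how resolution, speed, and input frame rate combine into a per-frame displacement.

```python
def inter_frame_motion_px(speed_widths_per_s: float, width_px: int, fps: float) -> float:
    """Pixels an object moves between two consecutive input frames.

    speed_widths_per_s: hypothetical object speed, in frame widths per second.
    width_px:           horizontal resolution of the footage.
    fps:                frame rate of the input video.
    """
    return speed_widths_per_s * width_px / fps


# An object that crosses half the frame width every second:
print(inter_frame_motion_px(0.5, 1920, 30))  # 1080p at 30 fps -> 32.0 px
print(inter_frame_motion_px(0.5, 1920, 8))   # 1080p at 8 fps  -> 120.0 px
print(inter_frame_motion_px(0.5, 3840, 30))  # 4K at 30 fps    -> 64.0 px

# A slow object stays easy even at a low frame rate (a few pixels per frame):
print(inter_frame_motion_px(0.02, 1920, 8))
```

The same scene therefore gets harder when the resolution goes up or the frame rate goes down, while slow content remains tractable at a low frame rate.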

sniklaus commented 3 years ago

Just for completeness, you can find the result of SoftSplat on the provided images below.

https://user-images.githubusercontent.com/1238034/121745086-c0a89900-cab8-11eb-966f-df47c1736820.mp4

JihyongOh commented 3 years ago

@KevenLee

  1. Unfortunately, we have not.
  2. No. Our XVFI-Net itself induces optical flows to warp the input frames to the intermediate time t, without any optical flow supervision. Further details are described in the paper.
  3. We agree with what @rsjjdesj and @sniklaus said. The given input frames are very challenging for frame interpolation. Besides the motion magnitude, the complexity of the motion, the blurriness, and the brightness (color) change across the frames make the task challenging. State-of-the-art methods are still trying to bridge the gap between experiments in the lab and real-world applications. Many VFI methods are based on the estimation of optical flow, and it also seems difficult to obtain accurate optical flow from the given input frames. You could first look into robust optical flow methods for real-world applications such as your challenging input frames. Nevertheless, these kinds of discussions will motivate us to pursue further challenging studies. Thank you.
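
As an illustration of what "warping input frames to intermediate time t" with optical flow means in general, here is a minimal pure-Python sketch of backward warping on a tiny grayscale image. It is not XVFI-Net's implementation: the function names are hypothetical, the flows are written by hand (a real method estimates them with a network), and plain lists are used for readability rather than speed.

```python
import math


def bilinear_sample(img, x, y):
    """Sample a grayscale image (list of rows) at real-valued (x, y), clamped to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x1]
    bottom = (1 - ax) * img[y1][x0] + ax * img[y1][x1]
    return (1 - ay) * top + ay * bottom


def backward_warp(img, flow):
    """out[y][x] = img sampled at (x + dx, y + dy), where flow[y][x] = (dx, dy)."""
    h, w = len(img), len(img[0])
    return [
        [bilinear_sample(img, x + flow[y][x][0], y + flow[y][x][1]) for x in range(w)]
        for y in range(h)
    ]


# Toy example: a bright pixel moves 4 px to the right between frame 0 and frame 1.
frame0 = [[0, 0, 10, 0, 0, 0, 0, 0]]
frame1 = [[0, 0, 0, 0, 0, 0, 10, 0]]
t = 0.5

# Hand-written flows from the intermediate time t back to each input frame.
flow_t0 = [[(-2.0, 0.0)] * 8]  # where to look in frame 0
flow_t1 = [[(+2.0, 0.0)] * 8]  # where to look in frame 1

warped0 = backward_warp(frame0, flow_t0)
warped1 = backward_warp(frame1, flow_t1)

# Simple time-weighted blend of the two warped frames.
frame_t = [
    [(1 - t) * a + t * b for a, b in zip(row0, row1)]
    for row0, row1 in zip(warped0, warped1)
]
print(frame_t)  # the bright pixel lands halfway, at x = 4
```

When the estimated flow is wrong, as can happen with very large or complex motion, both warped frames sample the wrong locations and the blend shows ghosting or tearing, which is consistent with the failure case discussed in this thread.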