tarun005 / FLAVR

Code for FLAVR: A fast and efficient frame interpolation technique.
Apache License 2.0

Question on PSNR evaluation on 8x and 4x (Table. 2 and Table. 3) #13

Closed mrchizx closed 3 years ago

mrchizx commented 3 years ago

Hi,

I have a question about the evaluation of the 8x and 4x cases in Table 2 and Table 3 on the Adobe dataset in the paper. The 4x case seems to have a much higher PSNR than the 8x case.

Let's say the 7 intermediate frames are denoted as t1, t2, t3, t4, t5, t6, t7. To my understanding, the PSNR values are normally ordered as: (t1 close to t7) > (t2 close to t6) > (t3 close to t5) > t4. At least, this is what I have observed for DAIN, SuperSloMo and QVI. This is expected: as the temporal distance to the nearest input frame increases, the interpolated quality decreases (lower PSNR).

For 4x, you would only have t2, t4, t6, so the average PSNR should be expected to be lower than for 8x.

However, for 4x in Table 3, FLAVR is 5.62dB higher compared to 8x in Table 2. The other methods (DAIN, QVI and SuperSloMo) also show a much higher PSNR. To my understanding, 5.62dB is a huge increase.
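To put the 5.62dB figure in perspective, here is a minimal sketch that converts a PSNR gap into the corresponding ratio of mean squared errors. It assumes the standard definition PSNR = 10 * log10(MAX^2 / MSE) with the same peak value MAX on both sides; the function name `mse_ratio_from_psnr_gap` is made up for illustration and is not part of the FLAVR code.

```python
def mse_ratio_from_psnr_gap(gap_db: float) -> float:
    """Return MSE_low / MSE_high for two results whose PSNR differs by gap_db.

    Assumes PSNR = 10 * log10(MAX^2 / MSE) with an identical MAX for both,
    so a gap of gap_db decibels corresponds to an MSE ratio of 10^(gap_db / 10).
    """
    return 10 ** (gap_db / 10)


if __name__ == "__main__":
    # 5.62 dB is the 4x vs. 8x gap quoted above for FLAVR.
    ratio = mse_ratio_from_psnr_gap(5.62)
    print(f"A 5.62 dB PSNR gap means the MSE differs by a factor of {ratio:.2f}")
```

Under that definition, a 5.62dB gap means the lower-PSNR result has between three and four times the mean squared error of the higher one.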

The expected trend should be similar to Table 3 in the BMBC paper: https://arxiv.org/pdf/2007.12622.pdf where PSNR(2x) < PSNR(4x) < PSNR(8x).

I am wondering if there is anything I missed for the evaluation that causes my confusion?

Thanks

tarun005 commented 3 years ago

Our evaluation and sampling pattern is different, and it is explained in detail in our paper.

For 8x interpolation, we sample 25 consecutive frames from the video (F1 - F25). Inputs are (F1,F9,F17,F25) and ground truth intermediate frames are (F10-F16). In our GoPro dataset, this amounts to converting 30FPS videos to 240FPS.

Similarly, for 4x interpolation, we sample 13 consecutive frames from any video (F1 - F13). Inputs are (F1,F5,F9,F13) and ground truth intermediate frames are (F6-F8). In our GoPro dataset, this amounts to converting 60FPS videos to 240FPS. Clearly, (60->240) is easier than (30->240), and hence the higher PSNR.
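The sampling pattern described above can be sketched as follows. This is only an illustration of the frame indices given in this comment, not the repository's actual data loader; the function name `sampling_pattern` and its parameters are hypothetical, and the 240FPS source rate is taken from the GoPro example above.

```python
def sampling_pattern(factor: int, num_inputs: int = 4):
    """Return (window, inputs, targets) as 1-based frame indices.

    window  : number of consecutive frames sampled from the video
    inputs  : indices of the frames fed to the model, spaced `factor` apart
    targets : ground-truth frames between the two middle inputs
    """
    window = factor * (num_inputs - 1) + 1
    inputs = [1 + factor * i for i in range(num_inputs)]
    mid = num_inputs // 2
    left, right = inputs[mid - 1], inputs[mid]
    targets = list(range(left + 1, right))
    return window, inputs, targets


if __name__ == "__main__":
    source_fps = 240  # assumed from the GoPro example in the comment above
    for factor in (8, 4):
        window, inputs, targets = sampling_pattern(factor)
        print(
            f"{factor}x: sample {window} frames, inputs {inputs}, "
            f"targets {targets}, {source_fps // factor}FPS -> {source_fps}FPS"
        )
```

For a factor of 8 this gives a 25-frame window with inputs F1, F9, F17, F25 and targets F10 to F16; for a factor of 4 it gives a 13-frame window with inputs F1, F5, F9, F13 and targets F6 to F8. The input frames in the 4x case are half as far apart in time, which is why that setting is easier.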

mrchizx commented 3 years ago

Oh, I thought 4x is interpolating from 30fps to 120fps.

In the case of 60fps->240fps, it's easier.

Thanks for answering.