tarun005 / FLAVR

Code for FLAVR: A fast and efficient frame interpolation technique.
Apache License 2.0

UCF101 testing dataset #18

Closed tkkcc closed 3 years ago

tkkcc commented 3 years ago

Hi, I found the original UCF101 dataset in AVI format and the UCF101 triplet dataset in PNG format, but no 5-frame dataset is available. Could you explain how to generate the UCF101 testing dataset for FLAVR?

tarun005 commented 3 years ago

Hi, you can download it from here: https://sites.google.com/view/xiangyuxu/qvi_nips19

tkkcc commented 3 years ago

Thanks! So for the 2-frame input methods, you also use this UCF101 set extracted by QVI instead of the one from DVF?

tarun005 commented 3 years ago

Yes.

tkkcc commented 3 years ago

Hi, I got 31.38 dB on UCF101 (provided by QVI) using the pretrained 2x interpolation weights, which is not on par with the other methods. All results below are reproduced using public model weights.

    method            venue       input_frame_num  TrainingDataset    Vimeo90K(PSNR/SSIM)  UCF101(PSNR/SSIM)  MiddleBury(IE)
    QVI               NIPS2019    4                REDS               34.72/0.954          32.61/0.949        2.680
    EQVI              ECCV2020    4                REDS               34.05/0.949          32.08/0.942        2.750
    SAVFI_sepconv     CVPR2020    4                Vimeo90K           34.18/0.949          32.50/0.949        2.160
    GDConv            TMM2021     4                Vimeo90K           35.59/0.957          33.11/0.950        2.080
    FLAVR             CVPR2021    4                Vimeo90K           36.31/0.961          31.38/0.937        2.460

Could you share your testing script for UCF101?

tarun005 commented 3 years ago

For UCF-101, we used a center crop of size 224x224 on all 100 quadruplets. All other parameters should be the same. Were you able to reproduce the results on Vimeo-90K? Can you share which software versions you are using?
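For anyone following along, a minimal sketch of what a 224x224 center crop does to a 225x225 frame. This is not the repository's evaluation code; `center_crop` is a hypothetical helper, and the choice of offset for odd sizes (rounding down here) is an assumption.

```python
import numpy as np


def center_crop(frame: np.ndarray, size: int = 224) -> np.ndarray:
    """Return the central size x size window of an H x W x C frame."""
    h, w = frame.shape[:2]
    if h < size or w < size:
        raise ValueError(f"frame {h}x{w} is smaller than crop {size}x{size}")
    # Integer offsets; for 225 -> 224 both offsets are 0 (assumed rounding).
    top = (h - size) // 2
    left = (w - size) // 2
    return frame[top:top + size, left:left + size]


# Dummy 225x225 RGB frame standing in for one UCF101 image.
frame = np.zeros((225, 225, 3), dtype=np.uint8)
cropped = center_crop(frame)
print(cropped.shape)
```

The crop only drops one row and one column; the remaining pixel values are untouched.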

tkkcc commented 3 years ago

As you can see in the table, 36.31 on Vimeo90K matches the paper's result, and 32.61 on UCF101 for QVI matches its paper too. Only 31.38 is abnormal. I am using PyTorch 1.1.0 and evaluating at 225x225 resolution.
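The figures being compared are PSNR values in dB. As a reference point, here is a sketch of the standard definition, assuming 8-bit images with a peak value of 255. The repository's own script may work on tensors in [0, 1] and may average per image or per batch differently, so treat `psnr` below as illustrative only.

```python
import numpy as np


def psnr(pred, target, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)


# Two dummy frames that differ by exactly one gray level everywhere (MSE = 1).
a = np.full((224, 224, 3), 100, dtype=np.uint8)
b = np.full((224, 224, 3), 101, dtype=np.uint8)
print(round(psnr(a, b), 2))
```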

tarun005 commented 3 years ago

Added the evaluation script for UCF-101. Let me know if you still can't reproduce the results. Try to use the recommended dependencies: PyTorch>=1.4.0, torchvision>=0.5.0, cudatoolkit==10.1

tkkcc commented 3 years ago

Thanks, I will try tomorrow.

tkkcc commented 3 years ago

I can get 33.33 dB on UCF101 now, still using PyTorch 1.1.0. The key was the 224x224 crop. Before, I used the full resolution (225x225) and applied the same strategy your code uses for Middlebury, that is, resizing to a multiple of 8. I'm still confused about why this leads to a 2 dB degradation.
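To make the two preprocessing paths concrete, here is a small sketch of the size arithmetic for "a multiple of 8". The thread does not say whether the Middlebury path rounds up or down, so both helpers are shown, and both names are hypothetical.

```python
def floor_to_multiple(n: int, m: int = 8) -> int:
    """Largest multiple of m that is <= n."""
    return (n // m) * m


def ceil_to_multiple(n: int, m: int = 8) -> int:
    """Smallest multiple of m that is >= n."""
    return -(-n // m) * m


side = 225  # UCF101 frame side length mentioned above
print(floor_to_multiple(side), ceil_to_multiple(side))
```

Either target size is reached by resampling every pixel when resizing, whereas a 224x224 crop keeps the original pixel values. The thread does not establish whether that difference explains the gap.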

Thank you for sharing your code and this great work!

tarun005 commented 3 years ago

@tkkcc Hi, may I ask what procedure you followed to report FLAVR numbers on the Middlebury dataset? Which split did you use? The results seem different from the ones we include in our report.

tkkcc commented 3 years ago

https://vision.middlebury.edu/flow/data/

I use all 10 samples from other-color-allframes.zip that contain at least 4 frames (frame09.png through frame12.png) as input, and other-gt-interp.zip as ground truth.
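A sketch of that selection step, assuming the archive unpacks to one subdirectory per scene with files named `frameNN.png`. The directory layout and the helper names are assumptions for illustration, not something confirmed in this thread.

```python
from pathlib import Path

# The four input frames named above: frame09.png through frame12.png.
REQUIRED = [f"frame{i:02d}.png" for i in range(9, 13)]


def has_required_frames(filenames) -> bool:
    """True if every required input frame is present in the scene."""
    return set(REQUIRED).issubset(filenames)


def usable_scenes(root):
    """List scene directories under root that contain all required frames."""
    root = Path(root)
    return sorted(
        d.name
        for d in root.iterdir()
        if d.is_dir() and has_required_frames({p.name for p in d.iterdir()})
    )


print(REQUIRED)
```

Calling `usable_scenes` on the unpacked folder of other-color-allframes.zip would then return the scene names to evaluate.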