neu-vi / SportsSloMo

SportsSloMo: A New Benchmark and Baseline Models for Human-centric Video Frame Interpolation, CVPR 2024 (https://arxiv.org/abs/2308.16876)
https://neu-vi.github.io/SportsSlomo/

Is There a Mistake in Line 59 of dataset.py: base_idx = 9 * (index // 9)? Shouldn't It Be base_idx = 9 * (index // 7)? #8

Open amaguri0408 opened 8 months ago

amaguri0408 commented 8 months ago

I am working with the dataset in SportsSloMo_EBME/core/dataset.py and I believe there is an error in the definition of base_idx. I suggest the following change:

base_idx = 9 * (index // 9)
↓
base_idx = 9 * (index // 7)

With the original code, for example, when index is 9 through 17, base_idx stays at 9 (the second clip) for all nine values, while target_idx takes the values 2, 3, 4, 5, 6, 0, 1, 2, 3. Notice that target_idx = 2 and 3 each appear twice. Because of this duplication, the number of distinct samples is smaller than the nominal dataset size.

This suggests that only 7/9 of the entire dataset is actually being used for training and testing. For the test data, out of a total of 135072 images, about 30016 (135072 × 2/9) appear to go unused. 🧐💻🔍
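
To make the duplication easier to check, here is a minimal sketch of the indexing. It is not the repository's code: `sample_key` and `count_unique` are hypothetical helpers, `target_idx = index % 7` is inferred from the target_idx values listed above, and the dataset length is assumed to be 7 samples per 9-frame clip.

```python
FRAMES_PER_CLIP = 9   # frames stored per clip
TARGETS_PER_CLIP = 7  # intermediate frames per clip (assumption)


def sample_key(index, divisor):
    """Map a flat dataset index to (base_idx, target_idx).

    divisor=9 mimics the current line, divisor=7 the proposed fix.
    target_idx = index % 7 is an assumption inferred from the issue text.
    """
    base_idx = FRAMES_PER_CLIP * (index // divisor)
    target_idx = index % TARGETS_PER_CLIP
    return base_idx, target_idx


def count_unique(num_clips, divisor):
    """Count distinct (base_idx, target_idx) pairs over the whole dataset."""
    length = num_clips * TARGETS_PER_CLIP  # assumed dataset length
    return len({sample_key(i, divisor) for i in range(length)})


num_clips = 9
print([sample_key(i, 9) for i in range(9, 18)])
print(count_unique(num_clips, 9), "unique samples with index // 9")  # 49
print(count_unique(num_clips, 7), "unique samples with index // 7")  # 63
```

Under these assumptions, 9 clips give 63 nominal samples, but only 49 distinct ones with `index // 9`, which is the 7/9 ratio mentioned above.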

Thank you

amaguri0408 commented 6 months ago

Following the modification mentioned above and ensuring all datasets were included, I evaluated the provided pre-trained model. The results were as follows: 📊

PSNR: 31.00, SSIM: 0.949, IE: 4.10

These results are better than those reported in the paper (PSNR: 30.48, SSIM: 0.944, IE: 4.40). I believe SportsSloMo is an excellent dataset. However, I see a problem: which frames serve as inputs and which frames are to be predicted is defined only in the code. 🤔

To provide accurate benchmarks, it would be necessary to distribute a file that lists, for every sample, the two input frames, the intermediate frame, and the timestep of that intermediate frame. 🏆
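
As an illustration of what such a file could look like, here is a rough sketch that writes one CSV row per sample. Everything in it is hypothetical: the column names, the `manifest_rows` helper, the frame file names, and the assumption that frames 0 and 8 of each 9-frame clip are the inputs while frames 1 to 7 are the targets at t = k/8.

```python
import csv
import sys

FRAMES_PER_CLIP = 9  # assumption: frames 0 and 8 are inputs, 1..7 are targets


def manifest_rows(clip_id, frame_names):
    """Yield one row per intermediate frame of a single clip."""
    if len(frame_names) != FRAMES_PER_CLIP:
        raise ValueError(f"expected {FRAMES_PER_CLIP} frames, got {len(frame_names)}")
    span = FRAMES_PER_CLIP - 1
    for k in range(1, span):
        yield {
            "clip": clip_id,
            "input0": frame_names[0],
            "input1": frame_names[-1],
            "target": frame_names[k],
            "t": k / span,  # normalized time of the target between the inputs
        }


# Hypothetical clip and frame names, for illustration only.
frames = [f"frame_{i:04d}.png" for i in range(FRAMES_PER_CLIP)]
rows = list(manifest_rows("clip_0000", frames))

writer = csv.DictWriter(sys.stdout, fieldnames=["clip", "input0", "input1", "target", "t"])
writer.writeheader()
writer.writerows(rows)
```

With a list like this, an evaluation would no longer depend on how a particular dataset class computes its indices.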

Thank you for your attention to this matter. I look forward to any feedback or suggestions you might have! 😊🙏

hexiaoyi95 commented 3 weeks ago

@amaguri0408 I agree with you. Did you download the split txt files? If so, could you share them here? The link in the README is invalid, and the files provided on the project page do NOT match the dataset pipeline (which expects 9 frames for each clip).