yeliudev / R2-Tuning

🌀 R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding (ECCV 2024)
http://arxiv.org/abs/2404.00801
BSD 3-Clause "New" or "Revised" License

Performance issues #12

Closed: LinZhekai closed this issue 3 months ago

LinZhekai commented 3 months ago

Thank you for your wonderful work! I have a question: when I tried to reproduce the results (training with the QVHighlights config and dataset), I could not reach the performance reported in the paper after many attempts, and the gap was large. Could you give me some advice on what might cause this? One of my training logs is attached: 20240801234037.log

yeliudev commented 3 months ago

Hi @LinZhekai, thanks for your interest in our work!

Did you modify any part of the code from this repo? I think best_MR-full-mAP: 46.59 and best_HL-min-VeryGood-mAP: 38.54 are within the normal range, as we also observed slight performance fluctuation (~1 mAP) on QVHighlights. However, it might indicate a problem if you have trained many times (say, 5 times?) and still cannot get good results.

yeliudev commented 3 months ago

We also uploaded our training log.

LinZhekai commented 3 months ago

Thanks for your reply!

I have seen the training log you uploaded, which is why I am confused about my own results. I didn't make any changes to the repository's code; to be sure, I also cloned the repository again for the experiments. In addition, I trained several times (≥ 5) and never got much better performance than in the run whose log I attached to this issue.

yeliudev commented 3 months ago

I double-checked your log and noticed that you are using 2 GPUs with a per-GPU batch size of 64, which might affect the batch-wise contrastive losses during training. I'm not sure whether a single 4090 has enough memory, but could you try training the model on 1 GPU and see if the results are better?
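
To illustrate the batch-size concern above, here is a minimal sketch that is not taken from the repository. It assumes an InfoNCE-style loss whose negatives come only from the local per-GPU batch (no cross-GPU feature gathering), and it compares a hypothetical single-process batch of 128 with two shards of 64. The function `info_nce`, the feature dimension, the temperature, and the synthetic features are all illustrative assumptions; the thread does not confirm how the actual loss is implemented.

```python
import math
import random


def info_nce(queries, keys, temperature=0.07):
    """Mean InfoNCE loss: queries[i] matches keys[i]; every other key
    in the same batch acts as a negative. (Hypothetical helper.)"""
    losses = []
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


def unit(v):
    """Scale a vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


random.seed(0)
dim, total = 16, 128  # assumed sizes, for illustration only

# Synthetic paired features: each query is a noisy copy of its key.
keys = [unit([random.gauss(0, 1) for _ in range(dim)]) for _ in range(total)]
queries = [unit([x + 0.5 * random.gauss(0, 1) for x in k]) for k in keys]

# One process with all 128 samples: 127 negatives per query.
loss_single = info_nce(queries, keys)

# Two processes with 64 samples each and no gathering: 63 negatives per query.
half = total // 2
loss_sharded = 0.5 * (
    info_nce(queries[:half], keys[:half]) + info_nce(queries[half:], keys[half:])
)

print(f"single batch of {total}: {loss_single:.4f}")
print(f"two shards of {half}:   {loss_sharded:.4f}")
```

Under these assumptions the sharded loss is always lower than the single-batch loss, because each query's softmax denominator contains only a subset of the negatives. The objective being optimized is therefore different even though the total number of samples per step is the same, which is one plausible way a multi-GPU setup could shift the results.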

LinZhekai commented 3 months ago

Thank you for your suggestion! A single 4090 does not have enough memory, which is why I used 2 GPUs. I will try training on a single GPU if I get the opportunity in the future.