Open · yangbang18 opened 1 year ago
With the same kinetics700 validation set (34824 videos) as that of mmaction, the checkpoint vit_b_clip_32frame_k700.pth you provided, and 32 x 3 x 3 testing views, my evaluation result on the kinetics700 validation set is 75.78 (acc@1), which is lower than your result given in README.md, i.e., 76.9 (acc@1). Is there any possible reason for the gap?

Hi @yangbang18, thanks for your interest in our work.
Sorry, I can't visit your kinetics400 link (even with VPN).
BTW, I have made some new findings recently.

With the same kinetics400 validation set (19796 videos) as that of mmaction, I reproduced the training process on 8 V100s with configs/recognition/vit/vitclip_base_k400.py, which produced 83.36 (acc@1) and 96.41 (acc@5) under 32 x 3 x 1 views. These results are close to those of the checkpoint vit_b_clip_32frame_k400.pth you provided.
With your acc@1 (84.9% according to the paper) reported on 19404 videos, the performance range of the model on my validation set (19796 videos) would be [(19404 * 84.9% + 392 * 0%) / 19796 = 83.2%, (19404 * 84.9% + 392 * 100%) / 19796 = 85.2%].

Given that my reproduced 83.36 is close to the lower bound (83.2), I suspect the 392 (19796 - 19404) videos missing from your validation set are hard for the model to classify.
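In code, the same arithmetic (a minimal sketch of the bound computation above, using the video counts quoted in this thread):

```python
# Sanity check of the range above: 19404 shared videos at the paper's
# 84.9% acc@1, plus 392 extra videos whose accuracy is unknown
# (anywhere from 0% to 100%), averaged over my 19796-video set.
shared, extra, total = 19404, 392, 19796
paper_acc1 = 0.849

lower = (shared * paper_acc1 + extra * 0.0) / total
upper = (shared * paper_acc1 + extra * 1.0) / total
print(f'acc@1 range: [{lower:.1%}, {upper:.1%}]')  # -> [83.2%, 85.2%]
```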
Regarding the claim that the gap may be caused by differences in environment and device: I also tested this. I evaluated the released vit_b_clip_32frame_k400.pth checkpoint on both a V100 and a 4090, and both devices gave identical results.
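A comparison along these lines rules hardware out (a minimal sketch; the .npy file names are hypothetical, standing in for logits dumped by the test script on each machine):

```python
# Hypothetical consistency check between two machines' saved outputs.
import numpy as np

logits_v100 = np.load('logits_v100.npy')   # (num_videos, num_classes)
logits_4090 = np.load('logits_4090.npy')
labels = np.load('labels.npy')             # (num_videos,)

for name, logits in (('V100', logits_v100), ('4090', logits_4090)):
    acc1 = (logits.argmax(axis=1) == labels).mean()
    print(f'{name}: acc@1 = {acc1:.4f}')
print('max |logit diff|:', np.abs(logits_v100 - logits_4090).max())
```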
Hi, the link is from Academic Torrents and is provided in MMAction2. You may try another VPN. I will check the results on Diving48.
I downloaded Kinetics-400 from https://opendatalab.com/OpenMMLab/Kinetics-400, the same data as MMAction2 (i.e., the same number of training/validation videos), and did the same for Kinetics-700.

I can reproduce the Diving48 results by training, so you can overlook that part.
Hello @yangbang18, I've been trying to reproduce the Diving48 results by training recently, but I can't obtain the reported results. Could you kindly provide your settings, configuration, or log? Thank you.
Thanks for your great work. I have two questions:

1) With the same kinetics400 validation set (19796 videos) as that of mmaction, the same setting as your configs/recognition/vit/vitclip_base_k400.py (32 x 3 x 1 views during testing), and the checkpoint vit_b_clip_32frame_k400.pth you provided, my evaluation results on the kinetics400 validation set are 83.34 (acc@1) and 96.45 (acc@5), which are lower than the results given in README.md, i.e., 84.7 (acc@1) and 96.7 (acc@5). Is there any possible reason for the gap (e.g., do you have a smaller kinetics400 validation set due to expired links)?

2) The checkpoint vit_b_clip_32frame_diving48.pth you provided is tested on 32 x 1 x 1 views, according to README.md. But the views in configs/recognition/vit/vitclip_base_diving48.py are 32 x 1 x 3. My evaluation results are 88.43 (acc@1, 32 x 1 x 3) and 88.32 (acc@1, 32 x 1 x 1), which are lower than your result given in README.md, i.e., 88.9 (acc@1, 32 x 1 x 1). Is there any possible reason for the gap?

I am also confused about the following mismatch: the checkpoint vit_b_clip_32frame_k700.pth you provided is tested on 32 x 3 x 3 views, according to README.md, but the views in configs/recognition/vit/vitclip_base_k700.py are 8 x 3 x 3.
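For context on where these view numbers come from: in mmaction-style configs, the frame count is SampleFrames' clip_len, the temporal clips come from its num_clips, and the spatial crops come from the crop transform (CenterCrop vs ThreeCrop). So the 32-vs-8 mismatch for kinetics700 would stem from clip_len, and the 1-vs-3 mismatches from num_clips or the crop op. A minimal sketch, assuming mmaction2's standard pipeline ops (values are illustrative, not copied from the repo's actual configs):

```python
# Illustrative mmaction2-style test pipeline (NOT the repo's real config).
# Under the "frames x crops x clips" reading used in many READMEs, this
# gives 32 x 3 x 1 views; check the README for its exact ordering.
test_pipeline = [
    dict(type='DecordInit'),
    dict(
        type='SampleFrames',
        clip_len=32,        # frames per clip -> the "32"
        frame_interval=2,
        num_clips=1,        # temporal clips sampled per video
        test_mode=True),
    dict(type='DecordDecode'),
    dict(type='Resize', scale=(-1, 224)),
    dict(type='ThreeCrop', crop_size=224),    # 3 spatial crops
    # dict(type='CenterCrop', crop_size=224), # 1 crop -> 32 x 1 x 1 views
    dict(type='FormatShape', input_format='NCTHW'),
    dict(type='Collect', keys=['imgs', 'label'], meta_keys=[]),
    dict(type='ToTensor', keys=['imgs'])
]
```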