kkahatapitiya / X3D-Multigrid

PyTorch implementation of X3D models with Multigrid training.
MIT License

Performance Comparison #5

Closed WUSHUANGPPP closed 3 years ago

WUSHUANGPPP commented 3 years ago

Hi @kkahatapitiya, thanks for your clear reproduction. I have two questions from testing your code:

  1. What is the specific performance on Kinetics-400? You said that it achieves 62.62% Top-1 accuracy (3-view) on Kinetics-400 when trained for ~200k iterations from scratch, but I do not know which version of X3D got this result. How many epochs did you train to get this result?

  2. According to the figure below from the original paper, X3D-M has 4.73G FLOPs, but when I tested the X3D-M in this code I got 3.76G FLOPs. Could you please explain this?

image

kkahatapitiya commented 3 years ago
  1. We provide two pretrained models. One is trained from scratch for ~200k iterations (~120 epochs), a shorter schedule, and gives 62.62% Top-1 accuracy (3-view). The other is ported from the original FAIR repo and gives 71.48% Top-1 accuracy (3-view). Both models use the X3D-M configuration.

  2. There can be small differences between the numbers reported in the paper and the actual implementation. For instance, in the paper conv1 is 1x3^2, 3x1, whereas in the implementation it is 1x3^2, 5x1. We follow the implementation (here). Since the ported weights work on our model, it should contain the same number of parameters as the FAIR implementation. When calculating FLOPs, make sure your input clip is the same size as reported in the paper and that you are evaluating in the 1-view setting.

Hope this clears things up. Let me know if you have more questions.
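
As a rough illustration of the FLOPs point above, here is a minimal sketch (not taken from this repo) of how the multiply-accumulate count of a single 3D convolution depends on the temporal kernel size and on the clip size. The helper `conv3d_macs`, the channel count, and the feature-map shapes are all hypothetical values chosen for illustration; they are not the exact X3D-M settings. Note also that FLOP counters differ in whether one multiply-add counts as one operation or two, which can by itself explain a mismatch with a reported number.

```python
def conv3d_macs(c_in, c_out, kernel, out_shape, groups=1):
    """Multiply-accumulates of one 3D convolution (bias ignored).

    kernel    -- (kt, kh, kw) kernel size
    out_shape -- (T, H, W) size of the output feature map
    """
    kt, kh, kw = kernel
    t, h, w = out_shape
    return (c_in // groups) * c_out * kt * kh * kw * t * h * w


# Hypothetical values, for illustration only (not the real X3D-M config).
CHANNELS = 24
OUT_SHAPE = (16, 112, 112)   # frames x height x width after the stem
HALF_SHAPE = (16, 56, 56)    # same clip length, half the spatial size

# Channel-wise temporal conv with a 3x1 kernel (paper) vs 5x1 (implementation).
macs_3x1 = conv3d_macs(CHANNELS, CHANNELS, (3, 1, 1), OUT_SHAPE, groups=CHANNELS)
macs_5x1 = conv3d_macs(CHANNELS, CHANNELS, (5, 1, 1), OUT_SHAPE, groups=CHANNELS)

# The same 5x1 layer evaluated on a spatially smaller clip.
macs_small = conv3d_macs(CHANNELS, CHANNELS, (5, 1, 1), HALF_SHAPE, groups=CHANNELS)

print(f"3x1 temporal conv: {macs_3x1:,} MACs")
print(f"5x1 temporal conv: {macs_5x1:,} MACs")
print(f"5x1 on half-size clip: {macs_small:,} MACs")
```

The count scales linearly with the kernel length and with the number of output positions, so halving the spatial side of the clip cuts this layer's cost to a quarter. This is why the input clip size has to match the paper before comparing totals.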

WUSHUANGPPP commented 3 years ago

Thanks for your immediate reply! I am sorry for my late feedback.

  1. As you commented, the performance is 71.48% for the model ported from the original repo. I wonder why it differs from the 74.6% reported for X3D-M in the paper.

  2. What if you train the first pretrained model (~120 epochs) for more epochs? Would the 62.62% Top-1 accuracy improve?