open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0
4.03k stars, 1.2k forks

Modelzoo Requests #487

Open innerlee opened 3 years ago

innerlee commented 3 years ago

We have provided 133 checkpoints by now (Dec 25, 2020). It is already a big number, but we know the one you need is not included.

Please post your requests for checkpoints by replying below. Describe each request in as much detail as possible, including the motivation/significance, any differences from similar checkpoints already in the modelzoo, and the config. We will periodically pick valuable configs, train them, and add them to the modelzoo.

Happy research!

tchang1997 commented 3 years ago

Request: CSN fine-tuned on Kinetics400, varying ResNet backbone sizes. Motivation: I'm studying video action recognition robustness under certain conditions, and for one of my ablations, I'd like to analyze the effect of backbone size. More generally, having checkpoints with varying backbone sizes can be useful: I noticed that not all of the checkpoints provided have the same backbone size (most are R50; some, like CSNs, are R152), which can be a confounding variable for comparison across models.

Config: Almost the same as the current IR-CSN-152 configs, except that the depth parameter in the backbone config would vary (depth = {18, 34, 50, 101}). In theory, no other changes should be necessary.
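A minimal sketch of the depth sweep described above, using plain Python dicts so it runs without the toolbox installed. The field names in `BASE_BACKBONE` (`type`, `depth`, `bottleneck_mode`) and the helper `make_depth_variants` are assumptions for illustration, not copied from the repository's IR-CSN-152 config; only the list of depths comes from the request itself.

```python
import copy

# Assumed baseline backbone settings: illustrative field names and values,
# not taken verbatim from the repository's config files.
BASE_BACKBONE = dict(
    type='ResNet3dCSN',
    depth=152,
    bottleneck_mode='ir',
)


def make_depth_variants(base, depths):
    """Return {depth: backbone_cfg}; each cfg copies `base` and changes only `depth`."""
    variants = {}
    for d in depths:
        cfg = copy.deepcopy(base)  # avoid mutating the shared baseline
        cfg['depth'] = d
        variants[d] = cfg
    return variants


# Depths listed in the request.
variants = make_depth_variants(BASE_BACKBONE, [18, 34, 50, 101])
for depth, cfg in variants.items():
    print(depth, cfg)
```

Standard ResNets at depths 18 and 34 use basic blocks rather than bottleneck blocks, so it is worth checking that the backbone implementation accepts those depths before training.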

The difficulty here might be the pretraining step on IG65M; I'm not sure whether that's publicly released. I believe this repo has an R2+1D-34 pretrained on IG65M, and Facebook's model zoo has some Caffe checkpoints (R2+1D, not R3D, depth 50) along with conversion scripts.

In any case, having pretrained models of varying backbone sizes would be extremely valuable for studies that examine backbone size as a variable. To be fair, that would be a lot of models to train.

Thanks for all the hard work! I've found this model zoo already extremely valuable in my own work.

Deep-learning999 commented 3 years ago

Request: Please provide SlowOnly and SlowFast checkpoints pretrained with different frame lengths on AVA-Kinetics, as well as TimeSformer checkpoints pretrained on AVA v2.1 and AVA-Kinetics. I think both real-time performance and mAP should improve. @innerlee

Deep-learning999 commented 3 years ago

I hope pretrained models for the Kinetics-TPS, FineAction, and MultiSports datasets can be added.