Holmes-GU closed this issue 2 years ago
For video classification, the number after the model-size letter is the number of frames sampled per clip. For Kinetics, we use dense sampling; for Sth-Sth, we use uniform (sparse) sampling.
uniformer_b16x4_k400 refers to training UniFormer-Base on Kinetics-400; it samples 16 frames with a frame stride of 4.
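To make the dense setting concrete, here is a minimal sketch of how 16 frames with a frame stride of 4 could be picked from a decoded video. It is an illustration only, not the repository's actual data loader: the function name `dense_sample_indices`, the `start` parameter, and the clamping of out-of-range indices to the last frame are all assumptions made for this example.

```python
def dense_sample_indices(num_video_frames, num_frames=16, stride=4, start=0):
    """Return indices of `num_frames` frames spaced `stride` apart.

    Hypothetical helper: `start` is the first frame of the clip (during
    training it would typically be chosen at random). Indices that run
    past the end of the video are clamped to the last frame, which is an
    assumption of this sketch.
    """
    indices = [start + i * stride for i in range(num_frames)]
    last = num_video_frames - 1
    return [min(idx, last) for idx in indices]


if __name__ == "__main__":
    # A 16x4 clip covers a window of (16 - 1) * 4 + 1 = 61 consecutive frames.
    print(dense_sample_indices(num_video_frames=300))
```

With the defaults, the clip spans frames 0 through 60 of the video, so dense sampling looks at one short, contiguous window rather than the whole video.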
uniformer_b16_sthv2_prek400 refers to training UniFormer-Base on Sth-SthV2, with the model pre-trained on Kinetics-400. It uniformly samples 16 frames.
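For comparison, uniform (sparse) sampling spreads the 16 frames over the whole video instead of one short window. The sketch below splits the video into equal segments and takes the middle frame of each one. This is one common way to do it and is only an assumption here; the name `uniform_sample_indices` is hypothetical, and the repository's loader may pick frames within each segment differently (for example, at random during training).

```python
def uniform_sample_indices(num_video_frames, num_frames=16):
    """Return one frame index per equal-length segment of the video.

    Hypothetical helper: the video is divided into `num_frames` segments
    and the middle frame of each segment is taken. Picking the middle
    frame is an assumption of this sketch.
    """
    segment = num_video_frames / num_frames
    last = num_video_frames - 1
    return [min(int(segment * i + segment / 2), last) for i in range(num_frames)]


if __name__ == "__main__":
    # For a 160-frame video, each segment is 10 frames long.
    print(uniform_sample_indices(num_video_frames=160))
```

For a 160-frame video this yields indices 5, 15, 25, and so on up to 155, so no frame stride is needed in the folder name: the spacing follows from the video length.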
You can find the corresponding folder for each experiment in the README. I have set the URLs for the different experiments; please see the corresponding run.sh/config.yaml.
As there is no more activity, I am closing the issue. Don't hesitate to reopen it if necessary.
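Hello, the exp folder contains folders named in the form uniformer_n1_n2_n3. The n1 part takes the forms b8x8, b16, s8x8, and s16. I assume the s and b here stand for the model size? Then what do the 16 and 8x8 that follow mean?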