OpenGVLab / unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
https://arxiv.org/abs/2303.16058
MIT License

Model with resolution of 384 and clip size of 16 #31

Closed: mr17m closed this issue 3 months ago

mr17m commented 5 months ago

Hello,

I need a pretrained model with a resolution of 384 and a clip size of 16 frames, and the model in the last row of the MiT V1 table (Model Zoo page) appears to match. However, the table lists 16 frames, while both run.sh and the checkpoint file name indicate 8 frames. Do you have other pretrained models with the above specifications that have not yet been uploaded? If so, please let me know. Thank you.

Andy1621 commented 5 months ago

Hi! That is a mistake in the MiT table.

I only used 8 frames. Training with 16 frames runs more slowly and performs similarly.
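
If in doubt, the clip length a checkpoint was trained with can usually be read off the shapes of its temporal or positional embedding tensors. Below is a minimal sketch assuming a standard PyTorch checkpoint; the file name and the parameter-name filters are placeholders, not the repository's actual names.

```python
import torch

# Hypothetical path: substitute the actual MiT V1 checkpoint file from the Model Zoo.
ckpt_path = "umt_mitv1_res384_f8.pth"

ckpt = torch.load(ckpt_path, map_location="cpu")
# Checkpoints are often wrapped under a "model" or "module" key; unwrap if present.
state = ckpt.get("model", ckpt.get("module", ckpt)) if isinstance(ckpt, dict) else ckpt

# Embedding shapes typically reflect the number of input frames the model expects.
for name, tensor in state.items():
    if "pos_embed" in name or "patch_embed" in name:
        print(f"{name}: {tuple(tensor.shape)}")
```

A temporal dimension consistent with 8 frames (rather than 16) in these shapes would confirm the checkpoint matches run.sh and the file name rather than the table entry.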