OpenGVLab / unmasked_teacher

[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
https://arxiv.org/abs/2303.16058
MIT License
CLIP as teacher #3

Closed: farewellthree closed this issue 1 year ago

farewellthree commented 1 year ago

Hi, congratulations on the great results! I'm curious about the results of using vanilla CLIP as the teacher. Is there any ablation on this? Also, it seems that CLIP-ViP does not have an L/14 version. Thanks.

Andy1621 commented 1 year ago

Sorry for the late response.

I'm not sure I understand your question about vanilla CLIP. In our experiments, we do use vanilla CLIP as the teacher: CLIP-B/16 for UMT-B/16 and CLIP-L/14 for UMT-L/14.