taoyang1122 / adapt-image-models

[ICLR'23] AIM: Adapting Image Models for Efficient Video Action Recognition
Apache License 2.0

Do you have pretrained model weights like ViT-S/16 or ViT-T/16? #4

Closed: RayTang88 closed this issue 1 year ago

RayTang88 commented 1 year ago

Hi, thanks for the cool work. I am currently using Video-Swin-Transformer, but Swin-S, Swin-B, and Swin-L are all too big for my device, so I want to switch to your work. Do you have pretrained model weights for smaller backbones like ViT-S/16 or ViT-T/16?

I am looking forward to hearing from you.

taoyang1122 commented 1 year ago

Hi, thanks for your interest in our work. I don't have AIM-pretrained ViT-S or ViT-T weights. But you should be able to take an ImageNet-pretrained ViT-S or ViT-T and tune it with AIM on your tasks.
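
For a rough sense of how much smaller those backbones are, here is a minimal sketch that counts the parameters of a plain image ViT from its embedding width. It is not code from this repo. It assumes the standard ViT/16 layout (12 blocks, MLP ratio 4, 224x224 input, a 1000-class ImageNet head, biases on all linear layers), and the function name `vit_param_count` is made up for illustration. An AIM-tuned model adds adapter layers and a task-specific head, so its totals will differ somewhat.

```python
def vit_param_count(embed_dim, depth=12, patch_size=16, image_size=224,
                    mlp_ratio=4, num_classes=1000):
    """Count parameters of a plain ViT (assumed standard layout, not AIM itself)."""
    d = embed_dim
    hidden = mlp_ratio * d
    num_patches = (image_size // patch_size) ** 2

    # Patch embedding: a linear projection of each 3 x P x P patch, plus bias.
    patch_embed = (3 * patch_size * patch_size) * d + d
    # Class token plus learned position embeddings (patches + class token).
    cls_and_pos = d + (num_patches + 1) * d

    # One transformer block: QKV and output projections, two-layer MLP, two LayerNorms.
    attention = (3 * d * d + 3 * d) + (d * d + d)
    mlp = (d * hidden + hidden) + (hidden * d + d)
    layer_norms = 2 * (2 * d)
    block = attention + mlp + layer_norms

    final_norm = 2 * d
    head = d * num_classes + num_classes
    return patch_embed + cls_and_pos + depth * block + final_norm + head


# Embedding widths commonly used for ViT-T, ViT-S, and ViT-B.
for name, width in [("ViT-T/16", 192), ("ViT-S/16", 384), ("ViT-B/16", 768)]:
    print(f"{name}: {vit_param_count(width) / 1e6:.1f}M parameters")
# ViT-T/16: 5.7M parameters
# ViT-S/16: 22.1M parameters
# ViT-B/16: 86.6M parameters
```

Under these assumptions ViT-S is roughly a quarter the size of ViT-B and ViT-T roughly a fifteenth, which is why they are natural candidates when the larger backbones do not fit on the device.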