huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0

[FEATURE] Add MobileVit #1038

Closed: Kirk300 closed this issue 2 years ago

Kirk300 commented 2 years ago

MobileViT was proposed by Apple. Link: https://github.com/apple/ml-cvnets

rwightman commented 2 years ago

@asibelieve I've got some bits and pieces of that model underway, but it's a bit of work to clean up given the way the original was implemented, and I have some other higher-priority tasks on the go...