huggingface / pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
https://huggingface.co/docs/timm
Apache License 2.0
32.16k stars 4.75k forks

Video dataset support #650

Open Epiphqny opened 3 years ago

Epiphqny commented 3 years ago

Thanks for your awesome work. Is there any plan to support video datasets such as Kinetics, or further video tasks?

rwightman commented 3 years ago

@Epiphqny yes, video is going to become a focus soon. I'm working on collecting some datasets and will start building/experimenting with model architectures and data loading/augmentation pipelines soonish. Before I can do that, though, I have to finish preparing some training primitives that work well on both GPUs and TPUs.

Epiphqny commented 3 years ago

@rwightman thanks for your quick reply. Looking forward to seeing the exciting work soon!

HashmatShadab commented 2 years ago

Hi, is there any update regarding this?

opencomvis commented 1 year ago

Thank you for your work, @rwightman. Are there any updates regarding video support?