MViTv2 on UCF101 and HMDB51

Thanks for your outstanding work! Both MViT (with pooling) and Swin (with windowed attention) reduce network complexity, which gives me hope of running them on my own machine. I prefer MViT for its simplicity, but I am still having great difficulty with my limited GPUs. Could you please help with the following:

An efficient speedup strategy. (MViTv2_S is still too large for me; I mean any strategy for efficient video training, if one exists.)
Configs and models for UCF101 and HMDB51. (Perhaps transformers and other new architectures do not work on these tiny datasets, but they are my last hope. BTW, both fine-tuning and training from scratch are crucial for me. The low accuracy on UCF makes me want to cry, but it is still much better than waiting several months for K400.)
UCF and HMDB dataloaders. (I wrote ucf.py and hmdb.py by reusing almost the same code as kinetics.py, so a more advanced dataloader implementation would also be very helpful.)
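
To make the dataloader point concrete, here is a minimal sketch of how the official UCF101 split files could be turned into Kinetics-style annotation rows. It assumes that the Kinetics-style loader reads one `path label` pair per line with 0-based labels, which I have not confirmed for this repo. The function names and the `/data/ucf101` root are hypothetical and only for illustration.

```python
def parse_class_index(lines):
    """Build {class_name: 0-based label} from classInd.txt lines such as '1 ApplyEyeMakeup'."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        idx, name = line.split()
        mapping[name] = int(idx) - 1  # UCF101 class indices start at 1
    return mapping


def to_path_label_rows(split_lines, class_to_label, video_root=""):
    """Convert UCF101 split lines into 'path label' rows.

    Train lists look like 'Class/v_Class_g08_c01.avi 1' and test lists omit the
    label, so the label is always derived from the class folder name.
    The 'path label' output format is an assumption about the Kinetics-style loader.
    """
    rows = []
    for line in split_lines:
        line = line.strip()
        if not line:
            continue
        rel_path = line.split()[0]
        class_name = rel_path.split("/")[0]
        label = class_to_label[class_name]
        full_path = f"{video_root.rstrip('/')}/{rel_path}" if video_root else rel_path
        rows.append(f"{full_path} {label}")
    return rows


# Example with two in-memory lines instead of the real split files.
class_to_label = parse_class_index(["1 ApplyEyeMakeup", "2 ApplyLipstick"])
rows = to_path_label_rows(
    ["ApplyEyeMakeup/v_ApplyEyeMakeup_g08_c01.avi 1",
     "ApplyLipstick/v_ApplyLipstick_g01_c01.avi"],
    class_to_label,
    video_root="/data/ucf101",  # hypothetical dataset root
)
```

Deriving the label from the folder name keeps the train and test lists on the same code path. HMDB51 ships its splits in a different per-class format, so it would need its own parser.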

I realize this is a lot of questions, and addressing them may take considerable resources. However, if you have ideas about any of them, please contact me at 1009440681@qq.com. Looking forward to your reply. LOL @lyttonhao @haooooooqi
