Closed Sun-Fan closed 5 years ago
Thanks. Sure, it can be used for video classification tasks, but some pieces of the code need to be revised.
Backbone:
[Batch, C, T, H, W] into [B, CTHW] for video, as easily as in the 2D way for image-based tasks.
Data
Training & Testing
fc layer is recommended on the large datasets.
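To illustrate the Backbone note about going from [Batch, C, T, H, W] to [B, CTHW], here is a minimal sketch of the shape bookkeeping only. It is not code from the repository; the function names and example sizes are made up for illustration, and in a tensor library the same step would normally be a single reshape call.

```python
from math import prod


def flatten_video_shape(shape):
    """Map a 5D video feature shape [B, C, T, H, W] to [B, C*T*H*W]."""
    if len(shape) != 5:
        raise ValueError("expected a 5D shape [B, C, T, H, W]")
    batch, *rest = shape
    return (batch, prod(rest))


def flatten_image_shape(shape):
    """The 2D (image) counterpart: [B, C, H, W] to [B, C*H*W]."""
    if len(shape) != 4:
        raise ValueError("expected a 4D shape [B, C, H, W]")
    batch, *rest = shape
    return (batch, prod(rest))


# Hypothetical feature-map sizes, chosen only for the example.
video_shape = flatten_video_shape((2, 64, 8, 14, 14))
image_shape = flatten_image_shape((2, 64, 14, 14))
print(video_shape)
print(image_shape)
```

The only difference between the two cases is the extra T axis that gets folded into the flattened dimension, which is why the video case is described as being as easy as the 2D one.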
Hello, Mr. Yue, can you share a test_video? Thanks.
@595448755 Hi, what do you mean by a test_video?
Hi! What was your experiment time on ActivityNet with ResNet50/ResNet101 as the backbone? @KaiyuYue
Hi, I have not run any experiments on ActivityNet, but I remember the time cost on ImageNet. Adding 1 CGNL block makes training about half an hour longer than training the plain ResNet-152 on ImageNet.
Hi! Thanks for the reply. What is the time cost on Mini-Kinetics (and what is your hardware)? Also, what is the training time on ImageNet?
Training ResNet-152 on ImageNet takes about 2.5 days in total on 8 Nvidia Tesla V100 GPUs (16 GB memory). Training ResNet-152 with 1 CGNL block takes roughly 30 to 60 minutes longer than that.
Hi! Thanks again for your reply. What is the time cost with other backbones, like ResNet-50, ResNet-101, or a lighter network such as MobileNet?
I forget the training time for ResNet-50 on ImageNet, and I have not run experiments with ResNet-101, MobileNet, ShuffleNet, etc. as the backbone on ImageNet. Sorry about that.
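To put those numbers in proportion, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above (about 2.5 days of baseline training, 30 to 60 extra minutes with 1 CGNL block) and treats them as rough estimates, not measurements.

```python
# Figures quoted in the thread; all approximate.
BASELINE_DAYS = 2.5          # ResNet-152 on ImageNet, 8 GPUs
EXTRA_MINUTES = (30, 60)     # added by 1 CGNL block

baseline_minutes = BASELINE_DAYS * 24 * 60

# Relative overhead of the CGNL block, in percent of the baseline time.
overheads = [extra / baseline_minutes * 100 for extra in EXTRA_MINUTES]

for extra, pct in zip(EXTRA_MINUTES, overheads):
    print(f"+{extra} min is about {pct:.2f}% longer than the baseline")
```

So on these figures the extra block costs on the order of 1 to 2 percent of the total training time.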
@KaiyuYue
I mean that there's no webcam demo script for running a test_video.mp4.
How can I run a *.mp4 video?
@595448755
Sorry for the delayed response. I have no specific scripts for running an inference demo in the wild, so you will need to write the code yourself. It's easy, just like using a model trained on ImageNet to classify objects in the wild. Keep the dataloader flow the same for inference, and make the inference part of the code portable so that it outputs the recognition terms.
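As a rough illustration of that advice, here is a minimal sketch of what such an inference flow could look like. It is not a script from the repository: `sample_frame_indices`, `top_k`, `classify_video`, the dummy model, and the label names are all hypothetical placeholders. Decoding frames from the *.mp4 file and the real preprocessing are left out and should match whatever the training dataloader does.

```python
def sample_frame_indices(num_frames, clip_len):
    """Pick clip_len frame indices spread evenly over the video."""
    if num_frames <= 0 or clip_len <= 0:
        raise ValueError("num_frames and clip_len must be positive")
    step = num_frames / clip_len
    # Take the middle of each of the clip_len equal segments.
    return [min(num_frames - 1, int(step * i + step / 2)) for i in range(clip_len)]


def top_k(scores, labels, k=1):
    """Return the k highest-scoring (label, score) pairs."""
    ranked = sorted(zip(scores, labels), reverse=True)
    return [(label, score) for score, label in ranked[:k]]


def classify_video(frames, model, labels, clip_len=8, k=1):
    """Sample a clip, run the model on it, and map scores to label names."""
    indices = sample_frame_indices(len(frames), clip_len)
    clip = [frames[i] for i in indices]
    scores = model(clip)  # assumed to return one score per class
    return top_k(scores, labels, k)


# Stand-ins for decoded frames and a trained network (placeholders only).
def dummy_model(clip):
    return [0.1, 0.7, 0.2]


frames = list(range(32))
prediction = classify_video(frames, dummy_model, ["archery", "bowling", "surfing"])
print(prediction)
```

In a real demo, `frames` would come from a video decoder and `dummy_model` would be replaced by the trained network in evaluation mode, while the sampling and label lookup stay the same.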
@KaiyuYue Thanks for sharing.
Thanks for the code. I want to ask whether it can be used for video classification?