kenshohara / 3D-ResNets-PyTorch

3D ResNets for Action Recognition (CVPR 2018)
MIT License
3.84k stars 930 forks

Image Resolution 112*112 #272

Open darshvirbelandis opened 1 year ago

darshvirbelandis commented 1 year ago

I want to be able to input larger image resolutions. However, when I use an input image size of 480*480, it takes almost 10 minutes to process a tiny 10-second clip.

It seems that when I increase the image size, the model's inference run-time becomes exponentially greater.
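As a rough sanity check on that slowdown, here is a back-of-envelope sketch. It assumes the clip length and the architecture stay fixed, and that convolutional work is proportional to the number of spatial positions (height times width). The helper `conv_cost_ratio` is hypothetical and not part of this repository, and the estimate ignores memory pressure, swapping, and data-loading overhead, any of which could make the measured time worse than the ratio suggests.

```python
def conv_cost_ratio(new_size: int, base_size: int = 112) -> float:
    """Approximate ratio of convolutional work for square inputs.

    Assumption: cost is proportional to H * W, with the number of
    frames and the network itself unchanged.
    """
    return (new_size / base_size) ** 2


if __name__ == "__main__":
    # Compare a few square resolutions against the 112*112 baseline.
    for size in (112, 224, 480):
        ratio = conv_cost_ratio(size)
        print(f"{size}*{size}: ~{ratio:.1f}x the work of 112*112")
```

Under these assumptions, going from 112*112 to 480*480 multiplies the convolutional work by about 18, which is quadratic in the side length rather than exponential. A much larger measured slowdown may point to something else, such as running out of GPU memory or falling back to the CPU.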

Crucial motion information is being lost when I downscale my images to 112*112, and it is affecting the precision of the model on my test sets.

Is there an alternative model, or a method that would let me use larger image resolutions with the 3D-ResNet model?

Is it practical to use a 3D CNN with 480*480 input images for video classification tasks?

87003697 commented 1 year ago

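This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply to you as soon as possible after my vacation ends.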