X3D: Expanding Architectures for Efficient Video Recognition model

facebookresearch / SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Apache License 2.0

6.5k stars 1.2k forks

X3D: Expanding Architectures for Efficient Video Recognition model #201

Open windspirit95 opened 4 years ago

windspirit95 commented 4 years ago

Hi, thank you for open-sourcing your work on the SlowFast model. Would you mind updating video_model_builder.py and the other layer helper files to support the X3D configuration, following your recent report: https://arxiv.org/pdf/2004.04730.pdf I am interested in the improvements in this model and am trying to reproduce the results from your paper, but I seem to have missed something when reading it: my own implementation of X3D-M still does not reach the expected results, which should be close to those of the SlowFast 8x8-R50 model. Thank you.

Kewenjing1020 commented 4 years ago

Hi, I'm also very interested in this paper 'X3D'. When will the related code be released in this repo?

windspirit95 commented 4 years ago

I have tested the SlowFast model (action classification, R50 8x8, num_classes = 13) on my PC, and it took around 1.8 s to make one prediction. I am using only one GPU (RTX 2080 SUPER), so if your X3D model is more lightweight at the reported accuracy, a processing time of about 1 s would be the perfect scenario for me :D
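For anyone who wants to compare per-prediction times like the one above, here is a minimal timing sketch. It is not part of PySlowFast: `measure_latency` and `dummy_predict` are hypothetical names, and `dummy_predict` is only a CPU stand-in for a real forward pass on one clip.

```python
import statistics
import time
from typing import Callable, List


def measure_latency(predict: Callable[[], object],
                    warmup: int = 3,
                    runs: int = 10) -> List[float]:
    """Return the wall-clock time in seconds of each of `runs` calls to `predict`."""
    # Warm-up calls are discarded so one-time setup costs do not skew the numbers.
    for _ in range(warmup):
        predict()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        timings.append(time.perf_counter() - start)
    return timings


def dummy_predict() -> int:
    # Hypothetical stand-in for running the model on a single clip.
    return sum(i * i for i in range(10_000))


timings = measure_latency(dummy_predict)
print(f"median: {statistics.median(timings):.4f}s over {len(timings)} runs")
```

With a real model on a GPU, the callable would also need to wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning. Otherwise the timer may stop before the prediction is actually done.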

githubsora commented 4 years ago

I'm also very interested in this paper 'X3D'.