wei-tim / YOWO

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

A stronger YOWO achieved by us. #90

Open yjh0410 opened 1 year ago

yjh0410 commented 1 year ago

Thanks for open-sourcing YOWO, a real-time method for spatio-temporal action detection. I recently followed this repo to reimplement YOWO and achieved better performance, as shown in the tables below. I call this version YOWO-Plus. We also designed an efficient variant, YOWO-Nano, whose 3D backbone is the 3D-ShuffleNet-v2-1.0x proposed by the authors of YOWO. My code is available at https://github.com/yjh0410/PyTorch_YOWO.

Improvement

Experiment

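| Model     | Clip | GFLOPs | Frame mAP | Video mAP | FPS | Weight |
|-----------|------|--------|-----------|-----------|-----|--------|
| YOWO      | 16   | 43.8   | 80.4      | 48.8      | -   | -      |
| YOWO-Plus | 16   | 43.8   | 84.9      | 50.5      | 36  | github |
| YOWO-Nano | 16   | 6.0    | 81.0      | 49.7      | 91  | github |

| Model     | Clip | mAP  | FPS | Weight |
|-----------|------|------|-----|--------|
| YOWO      | 16   | 17.9 | 31  | -      |
| YOWO      | 32   | 19.1 | 23  | -      |
| YOWO-Plus | 16   | 20.6 | 33  | github |
| YOWO-Plus | 32   | 21.6 | 25  | github |
| YOWO-Nano | 16   | 18.4 | 100 | github |
| YOWO-Nano | 32   | 19.5 | 95  | github |
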
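
For readers who want the trade-off in the first table at a glance, here is a minimal sketch that computes each variant's mAP gain and GFLOPs reduction relative to the original YOWO. The numbers are copied from the 16-frame rows of that table; the names `RESULTS` and `compare_to_baseline` are hypothetical helpers for illustration and are not part of either repo.

```python
# Rows copied from the first results table (clip length 16).
# The dictionary layout and function name are illustrative assumptions.
RESULTS = {
    "YOWO": {"gflops": 43.8, "frame_map": 80.4, "video_map": 48.8},
    "YOWO-Plus": {"gflops": 43.8, "frame_map": 84.9, "video_map": 50.5},
    "YOWO-Nano": {"gflops": 6.0, "frame_map": 81.0, "video_map": 49.7},
}


def compare_to_baseline(results, baseline="YOWO"):
    """Return mAP gains (in points) and the GFLOPs ratio versus the baseline."""
    base = results[baseline]
    summary = {}
    for name, row in results.items():
        if name == baseline:
            continue
        summary[name] = {
            # Absolute difference in mAP points, rounded to one decimal.
            "frame_map_gain": round(row["frame_map"] - base["frame_map"], 1),
            "video_map_gain": round(row["video_map"] - base["video_map"], 1),
            # How many times fewer GFLOPs the variant needs than the baseline.
            "gflops_ratio": round(base["gflops"] / row["gflops"], 1),
        }
    return summary


if __name__ == "__main__":
    for name, stats in compare_to_baseline(RESULTS).items():
        print(name, stats)
```

On these numbers, YOWO-Plus gains 4.5 frame-mAP points at the same GFLOPs, while YOWO-Nano gains 0.6 points with roughly 7.3 times fewer GFLOPs.
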
jaca-pereira commented 1 year ago

Hello! Do you plan on adding support for more resource efficient networks in the 3D backbone?

yjh0410 commented 1 year ago

@jaca-pereira I have added 3D-ShuffleNet-v2 to my repo. I will update the performance of YOWO with an efficient 3D backbone in the future.

yjh0410 commented 1 year ago

@jaca-pereira Hi, dear friend! I recently released YOWO-Nano, whose 3D backbone is the 3D-ShuffleNet-v2.

jaca-pereira commented 1 year ago

Hello! Thank you very much, I'll check it out.