wei-tim / YOWO

You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization

about backbone2d #50

Closed xidaoliang closed 4 years ago

xidaoliang commented 4 years ago

Why don't you use YOLOv3 or YOLOv3-tiny as backbone2d? Can the backbone2d network be replaced with one of them?

okankop commented 4 years ago

Yes, more ablation studies could be done with different 2D backbones. From our observations, what is critical for action localization is the accumulation of temporal information. That is why we have generally used much stronger 3D backbones than 2D backbones. However, we have written the code to be modular, so switching backbones is easy. You can try different 2D backbones and let us know the results :)
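For anyone trying this, here is a rough, framework-free sketch of the bookkeeping a backbone swap usually involves when two branches are fused by channel concatenation: the 2D and 3D feature maps need the same spatial size, and the layer that consumes the fused tensor needs its input channel count updated. Everything in it is an assumption for illustration: `BackboneSpec`, `BACKBONES_2D`, `BACKBONE_3D`, `fused_shape`, and all channel and stride numbers are hypothetical placeholders, not names or values taken from this repository or from the real models.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class BackboneSpec:
    """Shape summary of one backbone (values used below are placeholders)."""
    name: str
    out_channels: int  # channels of the final feature map
    stride: int        # total spatial downsampling factor

    def feature_shape(self, height: int, width: int) -> Tuple[int, int, int]:
        """Return (channels, height, width) of the output feature map."""
        return (self.out_channels, height // self.stride, width // self.stride)


# Hypothetical registry: picking a 2D backbone becomes a dictionary lookup.
# The channel counts are made up and do NOT describe the real networks.
BACKBONES_2D: Dict[str, BackboneSpec] = {
    "darknet19": BackboneSpec("darknet19", out_channels=512, stride=32),
    "yolov3_tiny": BackboneSpec("yolov3_tiny", out_channels=256, stride=32),
}

# Hypothetical 3D branch, assumed to collapse the time axis to length 1
# so that its output can be treated as a 2D feature map.
BACKBONE_3D = BackboneSpec("resnext101_3d", out_channels=2048, stride=32)


def fused_shape(spec_2d: BackboneSpec, spec_3d: BackboneSpec,
                height: int, width: int) -> Tuple[int, int, int]:
    """Shape after concatenating both feature maps along the channel axis."""
    c2, h2, w2 = spec_2d.feature_shape(height, width)
    c3, h3, w3 = spec_3d.feature_shape(height, width)
    if (h2, w2) != (h3, w3):
        # Concatenation along channels requires identical spatial sizes.
        raise ValueError(
            f"spatial mismatch: {spec_2d.name} gives {(h2, w2)}, "
            f"{spec_3d.name} gives {(h3, w3)}"
        )
    return (c2 + c3, h2, w2)


if __name__ == "__main__":
    for name, spec in BACKBONES_2D.items():
        print(name, fused_shape(spec, BACKBONE_3D, 224, 224))
```

In a real model the specs would be replaced by the actual network modules, but the same two checks apply: matching spatial size, and a fusion layer sized for the new total channel count.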

xidaoliang commented 4 years ago

Thank you very much for your reply.