open-mmlab / mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
https://mmaction2.readthedocs.io
Apache License 2.0

Will BMN support classification of the segment between the start and end time points? #139

Closed sijun-zhou closed 4 years ago

sijun-zhou commented 4 years ago

Hi,

Thanks for the great repo!

I know that BMN supports predicting the start and end time points of an action in a video. But will it also support classifying the video snippet between the start and end points? If not, how can the classification be done in an end-to-end way? Any suggestions?

Will this feature be added to the repo in the near future?

Thanks in advance!

SuX97 commented 4 years ago

Thank you for your question. For now, we only support proposal generation with BSN and BMN. However, we will add an SSN classification head and mAP evaluation in the next version (hopefully by the end of August), so the proposals generated by BSN and BMN can then be classified.

BTW, if you are eager to get classification results, you can run a recognition model such as UntrimmedNet on the whole video and use its prediction as each proposal's label. It works well!
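The workaround above can be sketched roughly as follows. This is only an illustration, not MMAction2 code: the function name `label_proposals`, the proposal tuple layout, and the choice to multiply the proposal score by the video-level class score are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (start_sec, end_sec, proposal_score)
Proposal = Tuple[float, float, float]


def label_proposals(
    proposals: List[Proposal],
    video_scores: Dict[str, float],
    top_k: int = 1,
) -> List[dict]:
    """Attach the top-k video-level classes to every proposal.

    Assumption: the detection score is the product of the proposal score
    and the video-level class score.
    """
    # Pick the k most confident video-level classes.
    top_classes = sorted(
        video_scores.items(), key=lambda kv: kv[1], reverse=True
    )[:top_k]

    detections = []
    for start, end, prop_score in proposals:
        for label, cls_score in top_classes:
            detections.append(
                {
                    "segment": [start, end],
                    "label": label,
                    "score": prop_score * cls_score,
                }
            )
    # Highest-scoring detections first.
    detections.sort(key=lambda d: d["score"], reverse=True)
    return detections


# Made-up proposals and video-level scores, for illustration only.
proposals = [(12.0, 48.5, 0.9), (60.0, 75.0, 0.4)]
video_scores = {"Long jump": 0.8, "Triple jump": 0.15, "Pole vault": 0.05}
detections = label_proposals(proposals, video_scores, top_k=1)
```

Every proposal inherits the label of the whole video, so this only makes sense when a video mostly contains a single action class.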

kennymckormick commented 4 years ago

You can use TSN to classify ActivityNet segments; however, the performance is not SOTA. If you want good action detection performance on ActivityNet, you can use CUHK17_ANet_pred. Please check PR #192 for more details.

zeyu-liu commented 4 years ago

Which model generated these classification results? Could you share the configs or the classification performance (top-1 accuracy)?

kennymckormick commented 4 years ago

The classification results are generated by UntrimmedNet with multi-modality fusion; the top-1 accuracy on untrimmed ActivityNet is around 90%. You can refer to the report of that challenge for detailed numbers.
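For readers unfamiliar with multi-modality fusion, here is a generic late-fusion sketch. It is not the procedure used to produce CUHK17_ANet_pred, which this thread does not describe: the two modalities (RGB and optical flow), the weighted average, and the name `fuse_scores` are assumptions for illustration.

```python
from typing import Dict


def fuse_scores(
    rgb: Dict[str, float],
    flow: Dict[str, float],
    rgb_weight: float = 1.0,
    flow_weight: float = 1.0,
) -> Dict[str, float]:
    """Weighted average of per-class scores from two modalities.

    Assumption: both inputs are per-class probabilities for one video;
    a class missing from one modality is treated as scoring 0.0 there.
    """
    total = rgb_weight + flow_weight
    classes = rgb.keys() | flow.keys()
    return {
        c: (rgb_weight * rgb.get(c, 0.0) + flow_weight * flow.get(c, 0.0)) / total
        for c in classes
    }


# Made-up scores, for illustration only.
rgb_scores = {"Long jump": 0.6, "Triple jump": 0.4}
flow_scores = {"Long jump": 0.2, "Triple jump": 0.8}
fused = fuse_scores(rgb_scores, flow_scores)
predicted = max(fused, key=fused.get)
```

The fused dictionary can then serve as the video-level scores used to label proposals.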