zhaoyue-zephyrus / TeSTra

Code for ECCV2022 "Real-time Online Video Detection with Temporal Smoothing Transformers"
Apache License 2.0

Reproduce LSTR's performance (THUMOS14 69.5) #7

Open happy95059 opened 1 year ago

happy95059 commented 1 year ago

I used the THUMOS14 RGB and flow features you provided, together with the parameters from the code and config files in the LSTR GitHub repository. However, I only get an mAP of 62 (the paper reports 69.5). Could you please share how you reproduced LSTR's performance? Thank you. I ran the following command and got mAP = 62:

python tools/train_net.py --config_file configs/THUMOS/LSTR/lstr_long_512_work_8_kinetics_1x.yaml --gpu 0

happy95059 commented 1 year ago

Additionally, I would like to ask about the "Object_feature" entry in your cfg, compared with LSTR's cfg. It seems to be causing issues when I run the code. Does this mean you have modified the LSTR code? If so, could you give an overview of the parts you changed? Is this related to my not reaching 69.5 mAP?

Anirudh257 commented 1 year ago

The object feature is not used here. It is only a placeholder.

happy95059 commented 1 year ago

I found my mistake, thank you. With my own target_perframe the mAP is 62; with TeSTra's target_perframe it becomes 69.9. I would like to know how you generated the target_perframe.

Assume video1 has 135 frames and you want to create a [135, 22] target_perframe. Did you use the per-action annotations provided by THUMOS14? For instance, given an annotation "Action 3: video1 0.2s to 1.1s" and fps = 4, would you set the entries at [int(0.2 * 4):int(1.1 * 4), action]? Is this how you constructed the target_perframe?
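
To make the question concrete, here is a minimal sketch of the construction described above. It is not TeSTra's actual preprocessing: the function name build_target_perframe, the (class_idx, start_sec, end_sec) annotation format, and the choice to mark unlabeled frames as background at index 0 are all assumptions made for illustration.

```python
import numpy as np


def build_target_perframe(num_frames, num_classes, annotations,
                          fps=4, background_idx=0):
    """Build a one-hot [num_frames, num_classes] target array.

    annotations: iterable of (class_idx, start_sec, end_sec) tuples.
    This tuple format is hypothetical, not the raw THUMOS14 file layout.
    """
    target = np.zeros((num_frames, num_classes), dtype=np.float32)
    for class_idx, start_sec, end_sec in annotations:
        # Convert seconds to frame indices by truncation, as in the question.
        start = max(int(start_sec * fps), 0)
        end = min(int(end_sec * fps), num_frames)
        target[start:end, class_idx] = 1.0
    # Assumption: frames without any action label count as background.
    no_action = target.sum(axis=1) == 0
    target[no_action, background_idx] = 1.0
    return target


# The example from the question: video1, 135 frames, 22 classes,
# "Action 3" from 0.2s to 1.1s at fps = 4.
target = build_target_perframe(135, 22, [(3, 0.2, 1.1)])
```

With truncation, int(0.2 * 4) is 0 and int(1.1 * 4) is 4, so frames 0 through 3 are labeled as action 3. Rounding instead of truncating would shift the segment boundaries by up to one frame, which is one place where two implementations can disagree.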

Anirudh257 commented 1 year ago

Hi @happy95059, I didn't create my own target_perframe; I used the one provided by TeSTra. But your logic seems fine to me. You can compare your output with the existing THUMOS annotations and see if it makes sense.
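
One way to run that comparison is sketched below, assuming both targets are already loaded as NumPy arrays of the same shape. The helper name frame_agreement and the toy arrays are hypothetical; in practice the reference array would come from the provided target_perframe files.

```python
import numpy as np


def frame_agreement(own, reference):
    """Return (fraction of frames whose label matches, mismatched indices).

    Labels are compared via argmax, so a frame carrying several labels
    is reduced to its first active class (a simplification).
    """
    if own.shape != reference.shape:
        raise ValueError(f"shape mismatch: {own.shape} vs {reference.shape}")
    own_labels = own.argmax(axis=1)
    ref_labels = reference.argmax(axis=1)
    mismatched = np.flatnonzero(own_labels != ref_labels)
    agreement = 1.0 - len(mismatched) / len(own_labels)
    return agreement, mismatched


# Toy data: 8 frames, 22 classes, action 3 on frames 2..4 in the reference.
reference = np.zeros((8, 22), dtype=np.float32)
reference[:, 0] = 1.0
reference[2:5] = 0.0
reference[2:5, 3] = 1.0

# The self-made target extends the action by one frame at the end.
own = reference.copy()
own[5, 0] = 0.0
own[5, 3] = 1.0

agreement, mismatched = frame_agreement(own, reference)
print(agreement, mismatched)
```

If the mismatched indices cluster at segment boundaries, the difference probably comes from how start and end times are converted to frames; mismatches elsewhere point to a different class mapping or annotation source.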