ColumbiaDVMM / CDC

CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos

Why is the reproduced per-frame mAP lower? #8

Closed yangchihyuan closed 7 years ago

yangchihyuan commented 7 years ago

The mAP produced by compute_framelevel_mAP.m is 0.1409, lower than the 44.4 reported in your paper. If I change the model in xfeat.sh from thumos_CDC/convdeconv-TH14_iter_24390 to sports1m_C3D/conv3d_deepnetA_sport1m_iter_1900000, the mAP drops to 0.0171. I am confused about why the reproduced mAP is significantly lower than the reported value. Could you suggest what might be going wrong?

zhengshou commented 7 years ago

Some people have already reproduced the results on both the per-frame labeling and temporal localization tasks. You may want to check each step in detail. BTW, it doesn't make sense to me why you would replace the trained model with the pre-trained sports1m model...
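
For anyone checking the evaluation step by step, here is a minimal Python sketch of how a per-frame mAP is commonly computed: rank all frames by score for each class, compute the average precision, then average over classes. This is not the repository's compute_framelevel_mAP.m; the function names `average_precision` and `framelevel_map` and the toy data are hypothetical, and the real script may treat ambiguous frames or background differently. The sketch prints the result both as a fraction and as a percentage, since the 0.1409 above looks like a fraction while 44.4 looks like a percentage (an assumption, not something confirmed in this thread).

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one class: rank frames by descending score and
    average the precision measured at each positive frame."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    # A class with no positive frames contributes 0 here (a simplifying assumption).
    return precision_sum / hits if hits else 0.0


def framelevel_map(score_matrix: List[List[float]],
                   label_matrix: List[List[int]]) -> float:
    """Mean AP over classes; both matrices are indexed as [frame][class]."""
    num_classes = len(score_matrix[0])
    aps = []
    for c in range(num_classes):
        class_scores = [row[c] for row in score_matrix]
        class_labels = [row[c] for row in label_matrix]
        aps.append(average_precision(class_scores, class_labels))
    return sum(aps) / len(aps)


# Hypothetical toy data: 4 frames, 2 classes.
toy_scores = [[0.9, 0.1], [0.8, 0.7], [0.3, 0.6], [0.2, 0.4]]
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 1]]

m = framelevel_map(toy_scores, toy_labels)
print(f"mAP as fraction: {m:.4f}, as percent: {100 * m:.2f}")
```

Running the same kind of check on a small hand-labeled subset can help confirm that score columns and class labels are aligned before comparing against the paper's number.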