MichiganCOG / A2CL-PT

Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization (ECCV 2020)
MIT License

Why is num_class for THUMOS 101? #7

Closed by liming-ai 3 years ago

liming-ai commented 3 years ago

Hi @kylemin, when I use your code to train on THUMOS14 and print the shape of wtcam, I get (batch_size, num_segments, 101). I think the last dimension should be 20 instead of 101, since people use the validation set to train the network. Could you please check this?

kylemin commented 3 years ago

Hi, thank you for your interest.

We use all 1010 validation videos, covering 101 classes, during training. However, we evaluate only on the subset of test videos that belong to 20 classes. This training scheme has been adopted by some previous approaches (see W-TALC and 3C-Net). As shown by W-TALC (see Table 1 of the paper), training on the reduced set (200 videos of 20 classes instead of 1010 videos of 101 classes) actually performs slightly better despite the smaller number of videos. We observed similar results, but as far as I remember, it required a completely different set of parameters. You can try training with the reduced set.
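To make the two options concrete, here is a minimal sketch that is not taken from the repository. It assumes multi-hot labels of shape (num_videos, 101) and a wtcam array of shape (batch_size, num_segments, 101). The function names and the index list `eval_class_idx` are hypothetical, and the indices used below are placeholders rather than the real THUMOS14 class mapping.

```python
import numpy as np


def select_eval_classes(wtcam, eval_class_idx):
    """Keep only the class channels that are evaluated (last axis)."""
    wtcam = np.asarray(wtcam)
    return wtcam[..., eval_class_idx]


def filter_reduced_set(labels, eval_class_idx):
    """Pick videos that contain at least one evaluated class.

    labels: multi-hot array of shape (num_videos, num_classes).
    Returns the indices of the kept videos and their reduced labels.
    """
    labels = np.asarray(labels)
    reduced = labels[:, eval_class_idx]   # (num_videos, 20)
    keep = reduced.sum(axis=1) > 0        # True if any evaluated class is present
    return np.flatnonzero(keep), reduced[keep]


# Placeholder indices (assumption): 20 arbitrary positions out of 101,
# NOT the actual THUMOS14 20-class subset.
eval_class_idx = np.arange(0, 100, 5)

wtcam = np.random.rand(2, 750, 101)
print(select_eval_classes(wtcam, eval_class_idx).shape)  # (2, 750, 20)
```

With the full-set scheme, only `select_eval_classes` would be needed at evaluation time. With the reduced-set scheme, `filter_reduced_set` would select the training videos, and the model's output layer would have 20 units instead of 101.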

liming-ai commented 3 years ago

Got it, thanks for your contribution.