MichiganCOG / A2CL-PT

Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization (ECCV 2020)
MIT License

max snippet length during training #5

memoryjing closed this issue 2 years ago

memoryjing commented 3 years ago

Dear Kyle, Thanks for your excellent work. It has helped me a lot. I have a question about maxlen in your code. Do you set maxlen to 200 for both datasets (THUMOS14 and ActivityNet)? I found that video length in THUMOS14 varies greatly; some videos are even longer than 26 minutes (2437 snippets). I am not sure how you set maxlen for THUMOS14.

kylemin commented 3 years ago

Hi Jingjing, Thank you for your interest. Yes, I set maxlen to 200 for both datasets. It was an empirical decision. I ran multiple experiments with maxlen={1000,750,500,200,100,75...} on THUMOS14 and found that 200 gave better performance. I think a reasonably small maxlen helps reduce overfitting during training.

memoryjing commented 3 years ago

OK, got it. Thank you very much for your quick reply.

memoryjing commented 3 years ago

Dear Kyle, How do you get T=200 segments from the whole video? And how do you perform localization at test time? You say in the paper that you follow W-TALC. I found that W-TALC selects T contiguous snippets for training and thresholds the final T-CAM for localization. Because the snippet-level temporal resolution is lower than the frame rate of the videos, they upsample the T-CAM to match the original frame rate. I am not sure how W-TALC and your A2CL-PT select T snippets when testing: if we also used T contiguous snippets for testing, directly upsampling the T-CAM to the original frame rate might not be correct.
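For readers following the question, here is a minimal sketch of what "thresholding a T-CAM for localization" can look like. It is not taken from W-TALC or A2CL-PT. The function names, the threshold value, and the 16-frames-per-snippet and 25 fps figures are all assumptions made for illustration.

```python
def tcam_to_index_segments(scores, threshold):
    """Group consecutive above-threshold snippets into [start, end) index pairs."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i                      # a new segment begins here
        elif score <= threshold and start is not None:
            segments.append((start, i))    # segment ended at the previous snippet
            start = None
    if start is not None:                  # segment runs to the end of the video
        segments.append((start, len(scores)))
    return segments


def index_segments_to_seconds(segments, frames_per_snippet=16, fps=25.0):
    """Convert snippet-index segments to seconds (assumed 16 frames per snippet at 25 fps)."""
    seconds_per_snippet = frames_per_snippet / fps
    return [(a * seconds_per_snippet, b * seconds_per_snippet) for a, b in segments]


# Hypothetical class activation scores for one class, one value per snippet.
scores = [0.1, 0.9, 0.8, 0.2, 0.7]
index_segments = tcam_to_index_segments(scores, threshold=0.5)
second_segments = index_segments_to_seconds(index_segments)
```

In this sketch the mapping from snippet indices to time needs only the snippet duration, so the score sequence itself is never resampled.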

kylemin commented 3 years ago

Hi again,

I think you misunderstood the inference procedure of our method and W-TALC.

1) During the training process, input snippets are randomly sampled. Please refer to this line.
2) Our method and W-TALC do not upsample T-CAM. Please refer to this line for details of the testing scheme.
3) We do not use sampled snippets during the inference procedure. Please refer to this line.
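The three points above can be illustrated with a small sketch. It is only one plausible reading of "randomly sampled", not the repository's actual code. The function names are hypothetical, and drawing an order-preserving random subset is an assumption; the linked code may instead take a random contiguous window or pad short videos.

```python
import random


def sample_training_snippets(features, maxlen=200, rng=random):
    """Return at most maxlen snippets for training.

    Assumption: a random subset is drawn and kept in temporal order.
    Videos with maxlen snippets or fewer are returned unchanged (no padding).
    """
    n = len(features)
    if n <= maxlen:
        return list(features)
    indices = sorted(rng.sample(range(n), maxlen))
    return [features[i] for i in indices]


def inference_snippets(features):
    """At test time every snippet is used; nothing is sampled."""
    return list(features)


# Hypothetical video with 2437 snippets, each represented by its index.
video = list(range(2437))
train_input = sample_training_snippets(video, maxlen=200, rng=random.Random(0))
test_input = inference_snippets(video)
```

Because inference keeps every snippet, the T-CAM has one score per snippet of the full video, and no sampled positions need to be mapped back to the original timeline.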