wangxiang1230 / SSTAP

Code for our CVPR 2021 Paper "Self-Supervised Learning for Semi-Supervised Temporal Action Proposal".

Can you explain what these variables mean: shift(x, n_segment, fold_div=8, inplace=False, channels_range=[1,2]). #15

Open EricPaul03 opened 1 year ago

EricPaul03 commented 1 year ago

shift(x, n_segment, fold_div=8, inplace=False, channels_range=[1,2]). I want to use features extracted from I3D, but I do not know what I should change. Can you tell me the meaning of these parameters? If any other parameters need to be adjusted, I would greatly appreciate it if you could let me know.

wangxiang1230 commented 1 year ago

Hi, if you use I3D features (1024 dims), you need to replace lines 50-58 in models.py with:

    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]  # shift left
    out[:, 1:, fold: 2 * fold] = x[:, :-1, fold: 2 * fold]  # shift right
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]  # not shift

In our implementation, 200 is the dimension of the RGB and flow features.
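The replacement lines above can be tried on a small tensor. The following is only a minimal sketch, not the repository's `shift` function: the helper name `temporal_shift_sketch`, the `(batch, time, channels)` input layout, and the rule `fold = channels // fold_div` are assumptions made for illustration (the thread does not state how `fold` is computed).

```python
import torch


def temporal_shift_sketch(x: torch.Tensor, fold_div: int = 8) -> torch.Tensor:
    """Hypothetical helper. x is assumed to have shape (batch, time, channels)."""
    channels = x.size(2)
    # Assumption: each direction moves channels // fold_div channels.
    fold = channels // fold_div
    out = torch.zeros_like(x)
    # First `fold` channels take their values from the next time step.
    out[:, :-1, :fold] = x[:, 1:, :fold]  # shift left
    # Next `fold` channels take their values from the previous time step.
    out[:, 1:, fold: 2 * fold] = x[:, :-1, fold: 2 * fold]  # shift right
    # Remaining channels are copied unchanged.
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]  # not shift
    return out


# Toy input: 2 videos, 4 time steps, 16 channels, so fold = 2.
x = torch.arange(2 * 4 * 16, dtype=torch.float32).reshape(2, 4, 16)
shifted = temporal_shift_sketch(x, fold_div=8)
```

In this sketch the time steps that have no neighbour to copy from (the last step of the left-shifted channels, the first step of the right-shifted ones) stay zero, because `out` starts from `torch.zeros_like(x)`.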

EricPaul03 commented 1 year ago

I'm sorry, but that is not what I was asking. First, you did not use the TemporalShift function but TemporalShift_random instead, right? Second, my I3D features are 2048-dimensional (1024 RGB plus 1024 flow). In TemporalShift_random, the variables fold_div and channels_range seem to be used for data augmentation. Can you tell me what these variables mean so that I can adjust them?

wangxiang1230 commented 1 year ago

Hi, sorry for the unclear description. Line 74 reads self.channels_range = list(range(400)), where 400 is the feature dimension, so in your setting you only need to replace 400 with 2048. This function randomly selects some channels to shift forward and others to shift backward along the temporal dimension, as a form of data augmentation.
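The random variant described above might look roughly like the sketch below for 2048-dimensional I3D features. This is an illustration under stated assumptions, not the code in models.py: the function name `random_temporal_shift_sketch`, the `seed` argument, the `(batch, time, channels)` layout, the rule `fold = channels // fold_div`, and the choice to zero the vacated time steps are all hypothetical.

```python
import random

import torch


def random_temporal_shift_sketch(x, fold_div=8, channels_range=None, seed=None):
    """Hypothetical helper. x is assumed to have shape (batch, time, channels)."""
    channels = x.size(2)
    if channels_range is None:
        # Mirrors list(range(400)) from the thread, with the real feature size.
        channels_range = list(range(channels))
    fold = channels // fold_div  # assumed size of each shifted group
    rng = random.Random(seed)
    # Draw 2 * fold distinct channels, then split them into two groups.
    picked = rng.sample(channels_range, 2 * fold)
    shift_left = sorted(picked[:fold])
    shift_right = sorted(picked[fold:])
    out = x.clone()  # channels that were not picked stay unchanged
    out[:, :-1, shift_left] = x[:, 1:, shift_left]    # values from the next step
    out[:, -1, shift_left] = 0                        # vacated last step
    out[:, 1:, shift_right] = x[:, :-1, shift_right]  # values from the previous step
    out[:, 0, shift_right] = 0                        # vacated first step
    return out, shift_left, shift_right


# Toy input: 2 videos, 5 time steps, 2048 channels (1024 RGB + 1024 flow).
features = torch.rand(2, 5, 2048) + 1.0
augmented, left, right = random_temporal_shift_sketch(features, fold_div=8, seed=0)
```

With `fold_div=8` and 2048 channels, this sketch moves 256 channels in each direction and leaves the other 1536 untouched, so a larger `fold_div` means a milder augmentation.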

EricPaul03 commented 1 year ago

Thank you very much. Your code is excellent work, and I hope I can consult you again if I run into anything else I do not understand.