farewellthree / STAN

Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"
Apache License 2.0

Question regarding use of nn.Dropout(0) in model code #5

Closed rohit-gupta closed 1 year ago

rohit-gupta commented 1 year ago

In the STAN head code, a dropout layer is applied to the residual branch coming in from CLIP. Is there any particular reason for this?

self.proj_drop = nn.Dropout(0)  # dropout with p=0, i.e. effectively an identity
res_temporal = self.dropout_layer(self.proj_drop(hidden_states.contiguous()))
res_temporal = self.temporal_fc(res_temporal)
# res_temporal: [batch_size, num_patches * num_frames, embed_dims]
hidden_states = res_temporal.reshape(b, p * self.t, m)
hidden_states = residual + hidden_states  # add back the residual from CLIP
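For reference (my own quick check, not from the repository): a dropout layer constructed with p=0 never zeroes elements and applies no rescaling (the scale factor is 1/(1-p) = 1), so it behaves as an identity in both train and eval modes:

import torch
import torch.nn as nn

drop = nn.Dropout(0)
x = torch.randn(2, 4)

drop.train()
assert torch.equal(drop(x), x)  # identity during training
drop.eval()
assert torch.equal(drop(x), x)  # identity during evaluation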
farewellthree commented 1 year ago

We follow the implementation of TimeSformer and ViViT here. I remember that deactivating this layer caused a small performance drop in our ablations. I guess this is because the newly added module overfits much more easily than the pretrained backbone, hence the need for the extra dropout.
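Not the repository's exact code, but here is a minimal sketch of the pattern described above (the class name, shapes, and the zero initialization are my assumptions, following the usual TimeSformer convention): the newly added temporal branch is regularized with dropout and projected before being added back to the CLIP residual.

import torch
import torch.nn as nn

class TemporalResidualSketch(nn.Module):
    # Minimal sketch (not the repository's code) of the pattern discussed
    # here: dropout plus a linear projection on the newly added temporal
    # branch before it is summed with the CLIP residual.
    def __init__(self, embed_dims, drop_p=0.0):
        super().__init__()
        self.proj_drop = nn.Dropout(drop_p)  # the regularization knob under discussion
        self.temporal_fc = nn.Linear(embed_dims, embed_dims)
        # Zero init (an assumption, following the TimeSformer convention)
        # so the new branch contributes nothing at initialization and the
        # pretrained CLIP features pass through unchanged.
        nn.init.zeros_(self.temporal_fc.weight)
        nn.init.zeros_(self.temporal_fc.bias)

    def forward(self, temporal_out, residual):
        # temporal_out: output of the temporal attention module
        # residual: features coming in from the CLIP backbone
        return residual + self.temporal_fc(self.proj_drop(temporal_out))

# usage: [batch, num_patches * num_frames, embed_dims]
block = TemporalResidualSketch(embed_dims=512, drop_p=0.1)
x = torch.randn(2, 196 * 8, 512)
out = block(x, x)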