Open youthHan opened 2 years ago
Hi, I am also interested in the factorized model in Table 4. Specifically, is the temporal attention in this model just a regular self-attention with random initialization, while only the spatial attention is shifted-window attention initialized with pre-trained weights? Thanks!
Thank you for your work and the code.
In addition to your released model and weights, I'm wondering if you could also release the model and pretrained weights for the factorized spatiotemporal attention variant (Video-Swin-T), as discussed in Table 4 of your paper.