happyharrycn / actionformer_release

Code release for ActionFormer (ECCV 2022)
MIT License

local window size #83

Closed shiyi-z closed 1 year ago

shiyi-z commented 1 year ago

Thanks very much for your great work! I have a question about the local attention window size. I saw that the window size is set to [19, 19, 19, 19, 19, 19], but there are seven transformer blocks. Does the list [19, 19, 19, 19, 19, 19] correspond to the last six transformer blocks of the backbone? And where can I modify the window size of the first transformer block? Looking forward to your reply!

tzzcl commented 1 year ago
  1. There are 7 Transformer blocks in total: we use 2 Transformer blocks for the stem network, and the other 5 Transformer blocks for the pyramid network with pooling. The first entry of the window size list is shared by the 2 stem Transformer blocks, and the remaining entries are assigned to the Transformer blocks of the pyramid network (see the sketch below).
  2. If you only want to modify the window size of the first Transformer block, just change mha_win_size[0]; please see the code for more details.
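For clarity, here is a minimal, illustrative Python sketch of the mapping described above. It is not the actual ActionFormer code; the variable names and the 2-stem / 5-pyramid split simply follow the explanation in this thread:

```python
# Illustrative sketch only -- not the actual ActionFormer source code.
# Assumes the 6-entry window-size list from the config: entry 0 is shared
# by the 2 stem Transformer blocks, entries 1..5 cover the 5 pyramid blocks.
mha_win_size = [19, 19, 19, 19, 19, 19]

n_stem_blocks = 2     # stem Transformer blocks (no pooling)
n_pyramid_blocks = 5  # pyramid Transformer blocks (with pooling)

# Both stem blocks share the first entry of the list.
stem_windows = [mha_win_size[0]] * n_stem_blocks

# Each pyramid level gets its own entry.
pyramid_windows = mha_win_size[1:1 + n_pyramid_blocks]

print(stem_windows)     # [19, 19]
print(pyramid_windows)  # [19, 19, 19, 19, 19]

# To change the window size used by the stem (first) Transformer blocks,
# edit the first entry, e.g.:
mha_win_size[0] = 9
```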
happyharrycn commented 1 year ago

Marked as closed. Let us know if any further questions arise.