Thanks very much for your great work! I have a question about the local attention window size.
I saw that the window size is set to [19, 19, 19, 19, 19, 19], but there are seven Transformer blocks. Does the list [19, 19, 19, 19, 19, 19] correspond to the last six Transformer blocks of the backbone? And where can I modify the window size of the first Transformer block?
Looking forward to your reply!
There are 7 Transformer blocks in total: we use 2 of them for the stem network, and the other 5 for the pyramid network with pooling. The local attention window size is shared by the first 2 stem Transformer blocks, and the remaining entries in the list set the window sizes for the pyramid Transformer blocks.
If you only want to modify the window size of the first Transformer block, just change mha_win_size[0]; please see the code for more details.
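To make the mapping concrete, here is a minimal sketch of how a 6-entry window list can cover 7 Transformer blocks, based on the explanation above. The function name `assign_window_sizes` and the exact stem/pyramid split are assumptions for illustration, not the repo's actual code; only the `mha_win_size` name comes from this thread.

```python
def assign_window_sizes(mha_win_size, n_stem=2, n_pyramid=5):
    """Hypothetical mapping of the 6-entry mha_win_size list onto
    7 Transformer blocks: both stem blocks share mha_win_size[0],
    and each pyramid level i uses mha_win_size[1 + i]."""
    assert len(mha_win_size) == 1 + n_pyramid
    stem = [mha_win_size[0]] * n_stem       # 2 stem blocks, same window
    pyramid = list(mha_win_size[1:])        # 5 pyramid blocks, one entry each
    return stem + pyramid

# Default config from the question: every window set to 19.
print(assign_window_sizes([19, 19, 19, 19, 19, 19]))
# Changing only mha_win_size[0] affects both stem blocks.
print(assign_window_sizes([9, 19, 19, 19, 19, 19]))
```

Under this assumed mapping, editing `mha_win_size[0]` changes the window of both stem blocks at once, while the later entries control the pyramid levels independently.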