Closed — haoxurt closed this issue 2 years ago
Thanks for your attention to our work!
TinyViT supports arbitrary input sizes: the feature map is padded whenever its height or width is not a multiple of the attention window size, so you do not need to change the original weights.
Padding the feature map:

```python
# https://github.com/microsoft/Cream/blob/main/TinyViT/models/tiny_vit.py#L346
x = x.view(B, H, W, C)
pad_b = (self.window_size - H % self.window_size) % self.window_size
pad_r = (self.window_size - W % self.window_size) % self.window_size
padding = pad_b > 0 or pad_r > 0
if padding:
    x = F.pad(x, (0, 0, 0, pad_r, 0, pad_b))
```
However, the padding operation may hurt performance on dense prediction. For better results at 512x512 resolution, you can change the window sizes so that no padding is needed.
For example, change the window sizes to [16, 16, 32, 16] as in this config. The weight attention_biases will be resized automatically when calling the function utils.load_pretrained.
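To see why sizes like these avoid padding, the padding formula above can be evaluated for a 512x512 input. This is only a sketch: the feature-map sizes below are assumed values for the attention stages (typical /8, /16, /32 downsampling, with the first conv stage carrying no attention), and the window lists are illustrative; check the actual config.

```python
# Assumed feature-map side lengths for the attention stages of a
# 512x512 input (hypothetical; depends on the real config).
feature_sizes = [64, 32, 16]

def pad_amount(size, window):
    # Same formula as in tiny_vit.py: pad up to the next multiple
    # of the window size, or 0 if already divisible.
    return (window - size % window) % window

# Default 224-resolution windows vs. enlarged 512-resolution windows
# (attention stages only).
for windows in ([7, 14, 7], [16, 32, 16]):
    pads = [pad_amount(s, w) for s, w in zip(feature_sizes, windows)]
    print(windows, '->', pads)
# [7, 14, 7] -> [6, 10, 5]   (padding needed at every stage)
# [16, 32, 16] -> [0, 0, 0]  (no padding: windows divide the maps)
```

With the enlarged windows every assumed feature map is an exact multiple of its window size, so the padding branch is never taken.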
The weight attention_biases is resized here:
```python
# https://github.com/microsoft/Cream/blob/main/TinyViT/utils.py#L136
relative_position_bias_table_pretrained_resized = torch.nn.functional.interpolate(
    relative_position_bias_table_pretrained.view(1, nH1, S1, S1),
    size=(S2, S2), mode='bicubic')
```
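As a minimal, self-contained sketch of that resizing step (assuming, LeViT-style, one bias per head per window position, i.e. a table of length window_size² per head; the head count, window sizes, and random tensor below are made-up stand-ins for the values load_pretrained derives from the checkpoint):

```python
import torch

nH1 = 4          # number of attention heads (hypothetical)
S1, S2 = 7, 16   # old and new window sizes (224-res -> 512-res example)

# Stand-in pretrained bias table: one bias per head per window position.
relative_position_bias_table_pretrained = torch.randn(nH1, S1 * S1)

# Reshape to a 2-D grid per head and resize with bicubic interpolation,
# mirroring the utils.py snippet above.
resized = torch.nn.functional.interpolate(
    relative_position_bias_table_pretrained.view(1, nH1, S1, S1),
    size=(S2, S2), mode='bicubic')

# Flatten back to (heads, new window positions) for loading into the model.
resized = resized.view(nH1, S2 * S2)
print(resized.shape)  # torch.Size([4, 256])
```

The bias grid is treated as an image per head, so enlarging the window only smoothly interpolates the learned biases rather than invalidating them.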
Thanks for your quick reply. According to your reply, I only need to change the window sizes for better performance and don't need to change the original weights; the weight attention_biases will be resized automatically in the function utils.load_pretrained. Is that right?
Yes : )
Thanks very much!
Hi, could you please share the model for segmentation? It would be great if you could help me reproduce the network!
Hi @HaoWuSR, sorry, we did not try the model on the segmentation task.
Hi, thanks for sharing your excellent work. I want to use TinyViT-5M-224 as the backbone to train a segmentation task with 512x512 inputs. Do I need to change the original weights because of the different input size? How can I do it?