ChristophReich1996 / Swin-Transformer-V2

PyTorch reimplementation of the paper "Swin Transformer V2: Scaling Up Capacity and Resolution" [CVPR 2022].
https://arxiv.org/abs/2111.09883
MIT License

How do I get it to work at 512*640 resolution? #8

WY-2022 closed this issue 2 years ago

WY-2022 commented 2 years ago

With model.update_resolution(new_window_size=8, new_input_resolution=(512, 640)) I get: RuntimeError: shape '[0, 2, 2, 768, 8, 8]' is invalid for input of size 196608

ChristophReich1996 commented 2 years ago

Hi @WY-2022, could you please share the code for reproducing this issue. Thanks!

WY-2022 commented 2 years ago

Sorry to bother you. Here it is. I'm not a strong coder, so I just tried to follow the 'Usage' section of the README.md.

from swin_transformer_v2 import SwinTransformerV2
from swin_transformer_v2 import swin_transformer_v2_t, swin_transformer_v2_s, swin_transformer_v2_b, \
    swin_transformer_v2_l, swin_transformer_v2_h, swin_transformer_v2_g
import torch

model: SwinTransformerV2 = swin_transformer_v2_t(in_channels=3,
                                                 window_size=8,
                                                 input_resolution=(512, 640),
                                                 sequential_self_attention=False,
                                                 use_checkpoint=False)
model.update_resolution(new_window_size=8, new_input_resolution=(512, 640))

x = torch.randn(1, 3, 512, 640)
x = model(x)
print(x)

This then fails with: RuntimeError: shape '[0, 2, 2, 768, 8, 8]' is invalid for input of size 196608

ChristophReich1996 commented 2 years ago

Hi @WY-2022, thanks for the code. The problem is that the feature map in the last stage of the network has a remaining spatial resolution of 16 × 20. A 16 × 20 map cannot be unfolded into 8 × 8 windows (with a stride of 8), because 20 is not divisible by 8. One solution is to use an input resolution of 512 × 512. Another is to use a different window size in the last stage (e.g. 8 × 10), which would require some modifications to the code, since currently only square windows are supported. Happy to help :)
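For anyone who wants to check a resolution before building the model, the divisibility argument above can be sketched as a small standalone helper. This is only an illustration and not part of the repository: the function names are hypothetical, and the defaults (a patch size of 4, four stages, 2x downsampling between stages) are assumptions inferred from the 16 × 20 figure for a 512 × 640 input.

```python
from typing import List, Tuple


def stage_resolutions(input_resolution: Tuple[int, int],
                      patch_size: int = 4,
                      num_stages: int = 4) -> List[Tuple[int, int]]:
    """Feature-map size (height, width) at each stage.

    Assumes a patch embedding of `patch_size` followed by 2x downsampling
    between consecutive stages (hypothetical defaults, see the note above).
    """
    height, width = input_resolution
    return [(height // (patch_size * 2 ** stage), width // (patch_size * 2 ** stage))
            for stage in range(num_stages)]


def incompatible_stages(input_resolution: Tuple[int, int],
                        window_size: int,
                        patch_size: int = 4,
                        num_stages: int = 4) -> List[Tuple[int, Tuple[int, int]]]:
    """Return (stage_index, (height, width)) for every stage whose feature map
    cannot be tiled exactly by square windows of `window_size`."""
    return [(stage, (h, w))
            for stage, (h, w) in enumerate(stage_resolutions(input_resolution, patch_size, num_stages))
            if h % window_size != 0 or w % window_size != 0]


# 512 x 640: only the last stage (index 3, 16 x 20) fails, since 20 % 8 != 0.
print(incompatible_stages((512, 640), window_size=8))  # [(3, (16, 20))]
# 512 x 512: every stage is divisible by 8, so nothing is reported.
print(incompatible_stages((512, 512), window_size=8))  # []
```

An empty list means every stage can be partitioned into whole windows under these assumptions. It does not guarantee that the model accepts the resolution for other reasons.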

WY-2022 commented 2 years ago

@ChristophReich1996 Thank you! This is very detailed and useful.