Closed: yg2024 closed this issue 2 years ago
Hi @yuangui0316, thanks for your interest. In the current version, the hidden dimension of the meta-network that computes the positional encodings is 256, so loading the checkpoints should not lead to a shape mismatch. https://github.com/ChristophReich1996/Swin-Transformer-V2/blob/3c6a5e58c59afdd5b4f26c8af085a5a69120957e/swin_transformer_v2/model_parts.py#L109
I also did not run into any issues loading the checkpoints myself. Could you please provide more details so I can reproduce this error?
However, please make sure to use the correct `input_resolution` and `window_size` when loading the checkpoints. For the CIFAR10 dataset, the input resolution is 32 and the window size is 8. For the Places365 dataset, the input resolution is 256 and the window size is 8.
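As a quick sanity check before loading, you can compute the number of rows `relative_coordinates_log` should have for a given window size. This is a minimal sketch, assuming (as the linked `model_parts.py` suggests) that the tensor holds one log-coordinate pair per pair of tokens inside a window, i.e. `window_size**2 * window_size**2` rows; the helper name is illustrative, not part of the repo.

```python
def expected_relative_coordinates_rows(window_size: int) -> int:
    """Expected rows of relative_coordinates_log, assuming one entry
    per token pair inside a (window_size x window_size) window."""
    tokens_per_window = window_size * window_size
    return tokens_per_window * tokens_per_window

# Both configurations above use window_size 8, so the model must too:
print(expected_relative_coordinates_rows(8))  # 4096, i.e. torch.Size([4096, 2])
```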
```
window_attention.relative_coordinates_log: copying a param with shape torch.Size([256, 2]) from checkpoint, the shape in current model is torch.Size([4096, 2]).
```
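For what it's worth, if `relative_coordinates_log` has `window_size**2 * window_size**2` rows (an assumption based on the linked code, not a confirmed invariant), the two shapes in this error can be inverted to recover the window sizes involved, which would point to the checkpoint and the model having been built with different window sizes. Hypothetical helper:

```python
def window_size_from_rows(rows: int) -> int:
    """Invert rows == window_size**4 to recover the window size
    (assumes one coordinate pair per token pair in a window)."""
    ws = round(rows ** 0.25)
    assert ws ** 4 == rows, f"{rows} is not a fourth power"
    return ws

print(window_size_from_rows(256))   # 4 (shape stored in the checkpoint)
print(window_size_from_rows(4096))  # 8 (shape expected by the current model)
```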