Open HaithemH opened 1 year ago
It's simple. To change the input size, you just need to modify the patch embedding, e.g. `self.patch_embeddings = nn.Conv2d(in_channels=3, out_channels=9408, kernel_size=(56, 56), stride=(56, 56))` (note that 9408 = 3 × 56 × 56, i.e. one flattened 56×56 RGB patch per output channel), just like in ViT. The forward function should be changed accordingly, e.g. `x = x.flatten(2).transpose(-1, -2).view(batch_size, 16, 3, 56, 56)`, where 16 is the number of 56×56 patches in a 224×224 image; for a 112×112 input that becomes 4.
Or you can just resize the image to 224x224 in pre-process phase.
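Here is a minimal sketch of the first option, assuming a GWA-style patch embedding with 3-channel input and 56×56 patches (the class and attribute names are illustrative, not taken from the repo):

```python
import torch
import torch.nn as nn


class PatchEmbed(nn.Module):
    """Hypothetical sketch: non-overlapping patch embedding, ViT-style."""

    def __init__(self, img_size=112, patch_size=56, in_channels=3):
        super().__init__()
        self.patch_size = patch_size
        self.in_channels = in_channels
        # Number of patches per image: (112 // 56) ** 2 = 4.
        self.num_patches = (img_size // patch_size) ** 2
        # Each 56x56x3 patch is flattened into 3 * 56 * 56 = 9408 channels.
        self.patch_embeddings = nn.Conv2d(
            in_channels=in_channels,
            out_channels=in_channels * patch_size * patch_size,
            kernel_size=(patch_size, patch_size),
            stride=(patch_size, patch_size),
        )

    def forward(self, x):
        batch_size = x.shape[0]
        x = self.patch_embeddings(x)        # (B, 9408, 2, 2) for 112x112 input
        x = x.flatten(2).transpose(-1, -2)  # (B, num_patches, 9408)
        # Unflatten each patch back to (C, patch, patch).
        return x.view(batch_size, self.num_patches,
                      self.in_channels, self.patch_size, self.patch_size)


embed = PatchEmbed(img_size=112)
out = embed(torch.randn(2, 3, 112, 112))
print(out.shape)  # torch.Size([2, 4, 3, 56, 56])
```

The same module handles 224×224 by passing `img_size=224`, which yields `num_patches = 16` as in the original code.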
Hope this answers your question.
Have a nice day~
Hi, thank you for sharing the code! I would like to know how to change the GWA class to handle different input sizes, for instance (112, 112)?