Closed: wyb1022 closed this issue 1 year ago
The patch_size parameter here is actually the number of patches. It looks like there was a mistake in the naming, thanks for noticing. Since we aim to obtain a fixed number of patches from the different encoder stages, it was easier to pass the spatial size of the i-th encoder's features together with the number of patches and compute the actual patch size from them (patch size = size // patch_size in our code) for each stage. Also note that our model does not use convolution for the patch embedding; it uses average pooling instead, which can be found in the PoolEmbedding class under the dca_utils script.
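For readers following along, here is a minimal sketch of a pooling-based patch embedding under the naming convention described above (patch_size = number of patches per side, kernel = size // patch_size). The class and variable names below are illustrative only, not the actual PoolEmbedding implementation from dca_utils:

```python
import torch
import torch.nn as nn

class PoolPatchEmbedding(nn.Module):
    """Illustrative pooling-based patch embedding (not the repo's PoolEmbedding).

    Following the repository's naming, `patch_size` is the desired number of
    patches along each spatial dimension; the pixel size of each patch is
    size // patch_size.
    """
    def __init__(self, in_features, out_features, size, patch_size):
        super().__init__()
        pool_kernel = size // patch_size  # actual patch size in pixels
        # Average pooling (instead of a strided convolution) reduces each
        # (pool_kernel x pool_kernel) window to a single token.
        self.pool = nn.AvgPool2d(kernel_size=pool_kernel, stride=pool_kernel)
        # Pointwise projection to the embedding dimension.
        self.projection = nn.Conv2d(in_features, out_features, kernel_size=1)

    def forward(self, x):
        x = self.projection(self.pool(x))    # (B, out_features, patch_size, patch_size)
        return x.flatten(2).transpose(1, 2)  # (B, patch_size**2, out_features)

# Example: a 64x64 stage feature map reduced to a fixed 16x16 = 256 patches.
feats = torch.randn(2, 128, 64, 64)
embed = PoolPatchEmbedding(in_features=128, out_features=64, size=64, patch_size=16)
print(embed(feats).shape)  # torch.Size([2, 256, 64])
```

Because the kernel and stride are derived from each stage's own feature size, every encoder stage yields the same number of tokens, which is the point of passing the stage size plus the number of patches.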
Thanks so much!
self.projection = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=size // patch_size, stride=size // patch_size, padding=(0, 0))
Hi, is this really patch_size here? I don't see patch_size being passed as a parameter to PatchEmbedding.