In vit_pytorch/vit_3D.py, line 90, the formula for the number of patches gives the wrong result for a data sample of size 32x224x224. The rearrange call on line 121 produces a tensor of size [4, 3136, 512], so the number of patches should be 3136.
The formula on line 90 instead gives:
num_patches = (image_size // patch_size) ** 2 * 2 = (224//8)**2 * 2 = 1568
You can change the formula to:
num_patches = (image_size // patch_size) ** 2 * 4 = (224//8)**2 * 4 = 3136
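
A hard-coded multiplier only works for one frame count. As a rough sketch, the patch count could instead be derived from the number of frames. The helper name count_patches and the frame_patch_size parameter are hypothetical and not taken from vit_3D.py, and the value frame_patch_size=8 is an assumption chosen because 32 // 8 = 4 matches the factor of 4 above:

```python
def count_patches(frames, image_size, frame_patch_size, patch_size):
    """Count 3D patches for a frames x image_size x image_size volume.

    Hypothetical helper: the names and the frame_patch_size parameter
    are assumptions for illustration, not the code in vit_3D.py.
    """
    # Each dimension must split evenly into patches.
    assert frames % frame_patch_size == 0, "frames must be divisible by frame_patch_size"
    assert image_size % patch_size == 0, "image_size must be divisible by patch_size"
    # Patches along the frame axis times patches in each square frame.
    return (frames // frame_patch_size) * (image_size // patch_size) ** 2


# Assumed frame_patch_size=8, so 32 frames give 4 patches along the frame axis.
print(count_patches(frames=32, image_size=224, frame_patch_size=8, patch_size=8))  # 3136
```

With these assumed values the result matches the second dimension of the [4, 3136, 512] tensor.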