Closed: marcozullich closed this issue 3 years ago
The following code works in the develop branch:
```python
import numpy as np
import torch
from PIL import Image

from glasses.models.AutoModel import AutoModel
from glasses.models.AutoTransform import AutoTransform

name = 'resnet50d'
tr = AutoTransform.from_name(name)
model = AutoModel.from_name(name)

img = Image.fromarray(np.zeros((300, 300, 3), dtype=np.uint8))
x = tr(img)
with torch.no_grad():
    model(x.unsqueeze(0))
```

`x` has the correct size of 224x224.
If you pass a tensor with a weird size, the spatial dimensions may become odd before the last layer, and this breaks the model since the shortcut uses `kernel_size=1` while the main block uses `kernel_size=3`. Please resize the tensor to the closest multiple of 32; to my understanding this will also improve performance. In your case, from 224x224 to 256x256.
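As a sketch, rounding a size to the closest multiple of 32 can be done with a small helper (the function name is mine, not part of glasses):

```python
def closest_multiple(size: int, multiple: int = 32) -> int:
    """Round `size` to the closest multiple of `multiple` (ties round up),
    never going below one full multiple."""
    return max(multiple, ((size + multiple // 2) // multiple) * multiple)

# e.g. closest_multiple(300) -> 288, closest_multiple(224) -> 224
```

The rounded value can then be passed to any resize transform before feeding the image to the model.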
After a little bit of investigation, the problem is in `ResNetShorcutD`, which uses `AvgPool2d` instead of a stride-2 conv to downsample. In the last block the input size is `(1, 1024, 15, 15)`; the shortcut halves it and projects to the correct dimension, so the shape is `(1, 2048, 7, 7)`. The inner weights, using padding, instead create a tensor of shape `(1, 2048, 8, 8)`, and this is why it fails.
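This can be verified with the standard output-size formulas for pooling and convolution (a plain-Python sketch; the helper names are mine):

```python
import math

def avgpool2d_out(n: int, kernel: int = 2, stride: int = 2) -> int:
    # nn.AvgPool2d with the default ceil_mode=False: floor division
    return math.floor((n - kernel) / stride) + 1

def conv2d_out(n: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    # standard Conv2d output-size formula
    return (n + 2 * padding - kernel) // stride + 1

# Shortcut branch on a 15x15 map: AvgPool2d(2, 2) -> 7 (the 1x1 conv keeps 7).
# Main branch: 3x3 conv, stride 2, padding 1 -> 8. The residual sum fails: 7 != 8.
```

For even sizes such as 14 both formulas agree, which is why 224x224 inputs work.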
The fix is super easy: just add `ceil_mode=True` to the `nn.AvgPool2d`.
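Numerically, `ceil_mode=True` swaps floor for ceil in the pooling output-size formula, which restores the match on odd inputs; a quick check of the formula (plain Python, my own sketch):

```python
import math

n, kernel, stride = 15, 2, 2
floor_out = math.floor((n - kernel) / stride) + 1  # default AvgPool2d: 7
ceil_out = math.ceil((n - kernel) / stride) + 1    # ceil_mode=True: 8, matching the main branch
```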
Thank you!
Best regards,
Francesco
There is a size mismatch error when calling the forward method of the ResNet*d models (e.g. `resnet50d`).
Example:
Resulting exception:

```
RuntimeError: The size of tensor a (8) must match the size of tensor b (7) at non-singleton dimension 3
```
I didn't go into details, but it seems to happen during the summation of skip connection + conv output.
This error seems to occur only with some image sizes; for instance, when the image has size 224 x 224 no exception is thrown.
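Building on the diagnosis above, a rough check of which input sizes trigger the error can be sketched in plain Python (the stage layout and helper names are my assumptions about a standard ResNet, not glasses code):

```python
def stage_input_sizes(size: int) -> list:
    """Spatial sizes entering the three downsampling stages of a standard ResNet.
    Assumes the stem (7x7 conv s2 p3 + 3x3 maxpool s2 p1) maps n -> ceil(n / 4)
    and each stride-2 stage maps n -> ceil(n / 2)."""
    size = -(-size // 4)      # ceil division: stem reduces by /4
    sizes = []
    for _ in range(3):        # stages 2-4 each halve the spatial size once
        sizes.append(size)
        size = -(-size // 2)
    return sizes

def triggers_mismatch(size: int) -> bool:
    # The ShortcutD AvgPool2d(2, 2) yields floor(n / 2) while the main
    # 3x3 / stride-2 / padding-1 conv yields ceil(n / 2); these differ
    # exactly when the stage input n is odd.
    return any(n % 2 == 1 for n in stage_input_sizes(size))

# triggers_mismatch(224) -> False (works); triggers_mismatch(300) -> True (fails)
```

This matches the observation that 224x224 never throws: all its intermediate sizes (56, 28, 14) are even, while a size like 300 produces an odd 75x75 map at the second stage.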