Project-MONAI / MONAI

AI Toolkit for Healthcare Imaging
https://monai.io/
Apache License 2.0
5.75k stars 1.06k forks source link

Wrong default value for conv1_t_stride in ResNet __init__ #7572

Open k-sukharev opened 6 months ago

k-sukharev commented 6 months ago

During the implementation for ResNetEncoder for FlexibleUNet, I encountered a bug related to the default value for conv1_t_stride in ResNet.__init__. Upon investigation, it became evident that the default value should be 2 instead of 1 .

Also, stride for the first convolution is 2 in the MedicalNet repository. So I suppose all 3D pretrained ResNet models for classification at this moment work not as intended.

k-sukharev commented 6 months ago

Is changing the default value for conv1_t_stride to 2 a breaking change? If not, I can make a pull request.

KumoLiu commented 5 months ago

Hi @k-sukharev, thanks for reporting this one. I believe we refer to the implementation here: https://github.com/kenshohara/3D-ResNets-PyTorch/blob/master/models/resnet.py#L110.

In the original ResNet paper, "Deep Residual Learning for Image Recognition", authors propose different model designs for images of different sizes. For smaller input images, like the 32x32 pixels images in the CIFAR10 dataset, the paper suggests setting the stride of the first convolutional layer (conv1) to 1. However, for larger input images, like the 224x224 pixels images in the ImageNet dataset, the stride of conv1 is set to 2 in the original design. This helps to reduce computational consumption, and ensures sufficient receptive field for larger images. For smaller inputs, we might want to set the stride to 1 to preserve more spatial information, while for larger inputs, a stride of 2 reduces computational consumption and increases efficiency. In addition, changing the default stride can indeed have an impact.

cc @Douwe-Spaanderman, as he is the original contributor to this one. cc @ericspod @atbenmurray @Nic-Ma for vis.

k-sukharev commented 5 months ago

@KumoLiu, thanks for the detailed answer.

I agree that conv1_t_stride can be 1 and that's ok.

But this does not change the fact that MedicalNet was trained with conv1_t_stride equal to 2, and monai uses these pre-trained weights for models with conv1_t_stride equal to 1 by default.