tensorflow / models

Models and examples built with TensorFlow

Output stride for Inception Resnet V2 #2642

Closed eldar closed 7 years ago

eldar commented 7 years ago

Hi,

I'm trying to finetune Inception-ResNet-V2 for the task of dense per-pixel prediction. I've successfully used ResNet V1 from Slim in the past; however, I cannot simply replace it with Inception-ResNet. In my code (which resembles DeepLab V2) I set output_stride=16 to enable atrous convolutions (this already seems to be the default in Inception-ResNet).

For an input image of size 336x336, the ResNet-101 model produces feature maps of 21x21 spatial dimensions, while Inception-ResNet-V2 produces 9x9 maps. I also had to manually remove the global pooling (by hacking into the code). In the first case one can verify that the output stride is indeed 336/21 = 16. However, 336/9 = 37.33(3), which is fractional and greater than 32.
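The arithmetic above can be reproduced with a small stand-alone sketch. The layer list below is an assumption based on reading slim's inception_resnet_v2.py (stem plus the Mixed_6a/Mixed_7a reductions, which use VALID padding by default); it is not a run of the actual network, just the spatial-size bookkeeping:

```python
import math

def conv_out(size, kernel, stride, padding):
    """Spatial output size of a single conv/pool layer."""
    if padding == 'SAME':
        return math.ceil(size / stride)
    # 'VALID' padding
    return (size - kernel) // stride + 1

# Assumed (kernel, stride, padding) sequence of the size-changing layers
# in Inception-ResNet-V2 with its default VALID padding:
valid_layers = [
    (3, 2, 'VALID'),  # Conv2d_1a_3x3
    (3, 1, 'VALID'),  # Conv2d_2a_3x3
    (3, 1, 'SAME'),   # Conv2d_2b_3x3
    (3, 2, 'VALID'),  # MaxPool_3a_3x3
    (1, 1, 'VALID'),  # Conv2d_3b_1x1
    (3, 1, 'VALID'),  # Conv2d_4a_3x3
    (3, 2, 'VALID'),  # MaxPool_5a_3x3
    (3, 2, 'VALID'),  # Mixed_6a reduction
    (3, 2, 'VALID'),  # Mixed_7a reduction
]

size = 336
for k, s, p in valid_layers:
    size = conv_out(size, k, s, p)
print(size)  # -> 9, i.e. effective stride 336/9 ~= 37.3

# With SAME padding everywhere and downsampling stopped at stride 16
# (four stride-2 stages, the rest atrous), the map would be 336/16 = 21:
same_size = 336
for _ in range(4):
    same_size = conv_out(same_size, 3, 2, 'SAME')
print(same_size)  # -> 21
```

This shows both effects at once: the extra stride-2 stage pushes the nominal stride past 16, and the VALID padding trims a few more pixels, which is why the result is fractional rather than exactly 336/32.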

I see two issues here. First, output_stride=16 doesn't take effect and the model performs additional downsampling. Second, due to padding, the resolution is reduced even further. So my question is: would it be possible to get an exact stride of 16 with this network?

Cheers, Eldar.


bignamehyp commented 7 years ago

This question is better asked on StackOverflow, since it is not a bug report or feature request, and there is a larger community reading questions there. Thanks!