keras-team / keras-applications

Reference implementations of popular deep learning models.

Output shape discrepancy between keras.applications.ResNet50 and original paper #30

Closed: daavoo closed this issue 6 years ago

daavoo commented 6 years ago

In the tables of the original paper (and in other implementations: official TensorFlow, torchvision, keras-resnet, tensorpack), the output shape after the initial 3x3 max pooling is 56x56, but in this ResNet50 implementation the output shape is 55x55.
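The arithmetic behind the two shapes can be sketched as follows. This is only an illustration under assumptions not stated in this thread: a 224x224 input, a 7x7 stride-2 first convolution padded by 3 pixels per side, and a 3x3 stride-2 max pooling that is either unpadded or padded by 1 pixel per side. The helper `window_output_size` is a hypothetical name, not part of Keras.

```python
def window_output_size(size, kernel, stride, pad=0):
    """Output length of a conv/pool window with explicit symmetric zero padding.

    Hypothetical helper for illustration; uses the usual floor-based formula.
    """
    return (size + 2 * pad - kernel) // stride + 1


# Assumed setup: 224x224 input, 7x7 stride-2 conv with 3 pixels of padding.
conv1 = window_output_size(224, kernel=7, stride=2, pad=3)

# 3x3 stride-2 max pooling without padding versus with 1 pixel per side.
pool_unpadded = window_output_size(conv1, kernel=3, stride=2, pad=0)
pool_padded = window_output_size(conv1, kernel=3, stride=2, pad=1)

print(conv1, pool_unpadded, pool_padded)  # 112 55 56
```

Under these assumptions, one missing pixel of padding before the pooling layer is enough to explain a 55x55 map instead of 56x56.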

taehoonlee commented 6 years ago

Yes, you are right. That is because the three backends have different behaviors for Conv2D(..., strides=2, padding='same', ...). I will investigate further and let you know. Thank you for the report @daavoo.

taehoonlee commented 6 years ago

@daavoo, please see the commit. The original architecture is now reproduced, and the accuracies have improved slightly thanks to your help :) For more information, see the report on the performance differences between padding schemes.

daavoo commented 6 years ago

@taehoonlee Awesome report. Your pad_info utility looks like an easy way of dealing with this. I wonder if there are more subtle discrepancies between frameworks.

CMCDragonkai commented 5 years ago

@daavoo What is the pad_info you are referring to?

daavoo commented 5 years ago

@CMCDragonkai It's from @taehoonlee's tensornets. This function: https://github.com/taehoonlee/tensornets/blob/master/tensornets/utils.py#L132 Example usage: https://github.com/taehoonlee/tensornets/blob/master/tensornets/resnets.py#L69

taehoonlee commented 5 years ago

@CMCDragonkai, please refer to the report about the different behaviors of Conv2D(..., strides=2, padding='same'). pad_info is designed to handle TensorFlow's asymmetric padding.
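For readers unfamiliar with the asymmetry mentioned here, the following is a rough sketch of how TensorFlow's documented 'same' rule splits padding along one dimension. It is not the code of pad_info; `same_padding` is a hypothetical helper, and the 224 and 112 input sizes are assumed for illustration.

```python
import math


def same_padding(size, kernel, stride):
    """Return (output_size, pad_before, pad_after) for TensorFlow-style 'same' padding.

    Hypothetical helper: when the total padding is odd, the extra pixel
    goes on the end (bottom/right) side.
    """
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2
    after = total - before
    return out, before, after


# Assumed sizes: a 7x7 stride-2 conv on 224 pixels, a 3x3 stride-2 pool on 112.
print(same_padding(224, kernel=7, stride=2))  # (112, 2, 3)
print(same_padding(112, kernel=3, stride=2))  # (56, 0, 1)
```

In both cases the output size matches the paper's tables, but the padding is uneven (2 before and 3 after, or 0 before and 1 after), whereas a symmetric scheme would pad 3 and 1 pixels on every side. The windows therefore cover slightly different pixels, which is one way pretrained weights can behave differently across backends.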