After the maxpool layer, we get 64 x 64 x 64.
Before the shortcut layer, we get 64 x 64 x 256.
I don't understand how 64 x 64 x 64 and 64 x 64 x 256 are combined in the shortcut layer.

In resnet50.cfg:

```
[net]
batch=1
subdivisions=1
height=256
width=256
max_crop=448
channels=3
momentum=0.9
decay=0.0005
....................
[convolutional]
batch_normalize=1
filters=64
size=7
stride=2
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky
```
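To make the mismatch concrete, here is a minimal NumPy sketch, assuming the shortcut is a plain elementwise addition of the two feature maps (toy zero tensors stand in for real activations):

```python
import numpy as np

# Toy tensors in H x W x C layout, matching the sizes in the question.
maxpool_out = np.zeros((64, 64, 64))    # output of the maxpool layer
residual_out = np.zeros((64, 64, 256))  # output of the last 1x1 conv (256 filters)

# An elementwise add needs identical shapes; 64 channels vs 256 channels fails.
try:
    shortcut = maxpool_out + residual_out
except ValueError:
    print("shapes 64x64x64 and 64x64x256 cannot be added elementwise")
```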
I checked Kaiming He's original ResNet.
There is a 1 x 1 x 256 filter to convert 64 x 64 x 64 to 64 x 64 x 256. So I think resnet50.cfg should add a conv layer, such as:

```
[net]
batch=1
subdivisions=1
height=256
width=256
max_crop=448
channels=3
momentum=0.9
decay=0.0005
....................
[convolutional]
batch_normalize=1
filters=64
size=7
stride=2
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=0
activation=linear

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=linear

[shortcut]
from=-4
activation=leaky
```
I'd like to know whether my suggestion is right. I sincerely hope for your reply.
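For illustration, the effect of such a projection shortcut can be sketched in NumPy. A 1x1 convolution is just a per-pixel linear map over channels, so it can be written as a matrix multiply; the weight matrix `w` below is a hypothetical stand-in for the trained 1x1x256 filter:

```python
import numpy as np

rng = np.random.default_rng(0)

H, W, C_in, C_out = 64, 64, 64, 256
x = rng.standard_normal((H, W, C_in))          # maxpool output, 64 x 64 x 64
residual = rng.standard_normal((H, W, C_out))  # bottleneck output, 64 x 64 x 256

# 1x1 conv as a channel-wise matrix multiply: (H*W, C_in) @ (C_in, C_out).
w = rng.standard_normal((C_in, C_out))         # hypothetical 1x1x256 weights
projected = (x.reshape(-1, C_in) @ w).reshape(H, W, C_out)

out = projected + residual                     # shapes now match: 64 x 64 x 256
print(out.shape)
```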