Closed (ahundt closed 7 years ago)
That's what it already does. The order is BN-ReLU-Conv1x1-BN-ReLU-Conv3x3.
If it weren't correct, the Caffe ImageNet weights would not load, nor would the model make correct predictions.
Your code does the same thing by wrapping the initial BN and ReLU inside the if block; it makes no actual difference. The code in the repo assumes there must be at least one BN-ReLU. Then it decides whether or not to add a bottleneck conv. If it does, it needs to add another BN-ReLU for the final conv.
Just different ways of representing stuff.
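To make the equivalence concrete, here is a minimal sketch of the two orderings. It is not the repo's actual code: the function names are hypothetical, and plain strings stand in for the Keras layers so that only the layer order is compared.

```python
def conv_block_repo_style(bottleneck):
    """Layer order when the first BN-ReLU sits outside the `if` (hypothetical name)."""
    layers = ["BN", "ReLU"]          # there is always at least one BN-ReLU
    if bottleneck:
        layers += ["Conv1x1"]        # optional bottleneck conv
        layers += ["BN", "ReLU"]     # extra BN-ReLU feeding the final conv
    layers += ["Conv3x3"]
    return layers


def conv_block_proposed_style(bottleneck):
    """Layer order when the bottleneck's BN-ReLU sits inside the `if` (hypothetical name)."""
    layers = []
    if bottleneck:
        layers += ["BN", "ReLU", "Conv1x1"]
    layers += ["BN", "ReLU", "Conv3x3"]
    return layers


if __name__ == "__main__":
    for flag in (False, True):
        print(flag, "-".join(conv_block_repo_style(flag)))
```

Both functions return the same sequence for either value of `bottleneck`, which is why the two versions load the same weights.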
Ah, you're right. :-) I guess the change might make the code look less confusing, though?
It seems there may be another change needed for the bottleneck case based on the paper:
It looks like the network here and the one in keras-contrib don't follow that order. I think it should be: