Sorry for the late reply. Can you help me point out where the missing part is in the code? Thanks!
In the model file, you use two functions, _make_layer and _make_single_layer.
In both functions, the downsample branch is built as:
downsample = nn.Sequential(
    nn.Conv2d(inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(planes * block.expansion, momentum=bn_mom),
)
Instead, it should look like this:
downsample = nn.Sequential(
    nn.Conv2d(inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(planes * block.expansion, momentum=bn_mom),
    nn.ReLU(inplace=True),
)
An activation function is missing after the batch norm layer.
If you want further explanation, I can share previous implementations or create a pull request with the fixed code.
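For context, here is a minimal sketch of how the proposed change could sit inside a _make_layer-style helper. The signature, the bn_mom value, and the surrounding block-stacking logic are assumptions for illustration, not the repo's exact code; only the added nn.ReLU line is the change being suggested.

import torch.nn as nn

bn_mom = 0.1  # assumed value; use the repo's own bn_mom

def _make_layer(block, inplanes, planes, blocks, stride=1):
    # Illustrative sketch of a residual-layer builder with the proposed fix.
    downsample = None
    if stride != 1 or inplanes != planes * block.expansion:
        downsample = nn.Sequential(
            nn.Conv2d(inplanes, planes * block.expansion,
                      kernel_size=1, stride=stride, bias=False),
            nn.BatchNorm2d(planes * block.expansion, momentum=bn_mom),
            nn.ReLU(inplace=True),  # the activation proposed in this issue
        )
    layers = [block(inplanes, planes, stride, downsample)]
    inplanes = planes * block.expansion
    for _ in range(1, blocks):
        layers.append(block(inplanes, planes))
    return nn.Sequential(*layers)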
Hi @hamidriasat, what is the advantage of using an activation function after the batch norm layer? Does it give any improvement?
Hi @stemper0123, an activation function after batch normalization preserves non-linearity and helps convergence during gradient propagation, which results in better model performance and more effective learning.
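To make the non-linearity point concrete, here is a small toy example (not from the PIDNet repo): in eval mode, Conv + BatchNorm is an affine map, so stacking two of them without an activation is still equivalent to a single affine map, while a ReLU in between keeps the stack non-linear.

import torch
import torch.nn as nn

# Two Conv+BN pairs with no activation: still a single affine map in eval mode.
linear_stack = nn.Sequential(
    nn.Conv2d(8, 8, kernel_size=1, bias=False), nn.BatchNorm2d(8),
    nn.Conv2d(8, 8, kernel_size=1, bias=False), nn.BatchNorm2d(8),
).eval()

# Same pairs with a ReLU in between: genuinely non-linear.
nonlinear_stack = nn.Sequential(
    nn.Conv2d(8, 8, kernel_size=1, bias=False), nn.BatchNorm2d(8),
    nn.ReLU(inplace=True),
    nn.Conv2d(8, 8, kernel_size=1, bias=False), nn.BatchNorm2d(8),
).eval()

x = torch.randn(1, 8, 4, 4)
# linear_stack(x) could be reproduced by one Conv2d plus a bias; nonlinear_stack(x) cannot.
print(linear_stack(x).shape, nonlinear_stack(x).shape)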
Hi back @hamidriasat. Since the contributor is not responding, maybe you can answer my question :) https://github.com/XuJiacong/PIDNet/issues/59
I believe a ReLU activation function is missing in _make_layer and _make_single_layer.
In both the _make_layer and _make_single_layer methods, there should be an activation function after the batch norm of the downsample branch; previous implementations have used it.
Can you reconfirm this?
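One hedged way to reconfirm is to inspect a built model and report whether any downsample branch contains a ReLU. This is a generic helper sketch, not repo code; it only assumes the downsample branches are nn.Sequential modules whose names end in "downsample", which is the usual ResNet-style convention.

import torch.nn as nn

def downsample_relu_report(model: nn.Module):
    # Return (module_name, has_relu) for every downsample Sequential in the model.
    report = []
    for name, module in model.named_modules():
        if name.endswith("downsample") and isinstance(module, nn.Sequential):
            has_relu = any(isinstance(m, nn.ReLU) for m in module)
            report.append((name, has_relu))
    return report

# Hypothetical usage, assuming the repo's model can be constructed:
#   model = PIDNet(...)  # illustrative, not the exact constructor
#   print(downsample_relu_report(model))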