XuJiacong / PIDNet

This is the official repository for our recent work: PIDNet
MIT License

An Activation function is missing in code #20

Closed hamidriasat closed 1 year ago

hamidriasat commented 2 years ago

I believe a ReLU activation function is missing in _make_layer and _make_single_layer.

In both the _make_layer and _make_single_layer methods, there should be an activation function after the batch norm of the downsample path. Previous works have used one there.

Can you reconfirm this?

XuJiacong commented 2 years ago

Sorry for the late reply. Can you help me point out where in the code the activation is missing? Thanks!

hamidriasat commented 2 years ago

In the model file, you are using two functions: _make_layer and _make_single_layer.

In both functions you are doing:

downsample = nn.Sequential(
                nn.Conv2d(inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion, momentum=bn_mom),
            )

Instead it should be like this:

downsample = nn.Sequential(
                nn.Conv2d(inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion, momentum=bn_mom),
                nn.ReLU(inplace=True),
            )

An activation function was missing after the batch norm layer.

If you want further explanation, I can share previous implementations or create a pull request with the fixed code.
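For what it's worth, here is a minimal standalone sketch contrasting the two variants (the shapes and the `bn_mom` value are made up for illustration, not taken from the repo): with the trailing ReLU, the downsample shortcut can only feed non-negative values into the residual sum.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Placeholder values for illustration only, not the repo's configuration
inplanes, planes, expansion, stride, bn_mom = 32, 16, 2, 2, 0.1

# Variant currently in the repo: Conv -> BatchNorm
without_relu = nn.Sequential(
    nn.Conv2d(inplanes, planes * expansion, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(planes * expansion, momentum=bn_mom),
)

# Proposed variant: Conv -> BatchNorm -> ReLU
with_relu = nn.Sequential(
    nn.Conv2d(inplanes, planes * expansion, kernel_size=1, stride=stride, bias=False),
    nn.BatchNorm2d(planes * expansion, momentum=bn_mom),
    nn.ReLU(inplace=True),
)

x = torch.randn(4, inplanes, 8, 8)
with torch.no_grad():
    y_plain = without_relu(x)  # batch norm output is zero-centered, so negatives remain
    y_relu = with_relu(x)      # ReLU clamps the shortcut output to be non-negative

print(bool((y_plain < 0).any()), bool((y_relu < 0).any()))  # -> True False
```

This is just to make the behavioral difference concrete; whether the extra ReLU helps accuracy is a separate (empirical) question.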

pstemporowski commented 1 year ago

Hi @hamidriasat, what is the advantage of using an activation function after the batch norm layer? Does it improve anything?

hamidriasat commented 1 year ago

Hi @stemper0123, an activation function after batch normalization preserves non-linearity and helps gradients propagate during training, which can improve convergence and overall model performance.

pstemporowski commented 1 year ago

Hi back @hamidriasat. Since the contributor is not responding, maybe you can answer my question :) https://github.com/XuJiacong/PIDNet/issues/59