Jeff-sjtu / res-loglikelihood-regression

Code for "Human Pose Regression with Residual Log-likelihood Estimation", ICCV 2021 Oral

Question about the norm in linear layer #13

Closed sicxu closed 3 years ago

sicxu commented 3 years ago

Hi, I am curious about the normalization applied to fc_coord. What is the meaning of this line? Why is the output divided by the norm of the input? https://github.com/Jeff-sjtu/res-loglikelihood-regression/blob/203dc3195ee5a11ed6f47c066ffdb83247511359/rlepose/models/regression_nf.py#L33

sicxu commented 3 years ago

BTW, the function names are quite confusing, e.g., heatmap_to_coord has nothing to do with heatmaps.

Jeff-sjtu commented 3 years ago

Hi @sicxu,

  1. The normalization is used for stable training. A standard FC layer computes y = w.dot(x) = |w| * |x| * cos(theta), so the output scales with the input norm. With the normalization, the output depends only on the weight norm and the angle: y = |w| * cos(theta) (see the sketch after this list).

  2. Sorry for the confusion: the codebase is developed from our previous project AlphaPose, so some of the function names haven't been updated.
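
For clarity, here is a minimal sketch of such a normalized linear layer. It is an illustrative re-implementation of the idea, not a verbatim copy of the module in regression_nf.py; the class name and channel arguments are placeholders.

    import torch
    import torch.nn as nn

    class NormedLinear(nn.Module):
        """Sketch: FC layer whose output is divided by the input norm."""

        def __init__(self, in_channels, out_channels, bias=True, norm=True):
            super().__init__()
            self.bias = bias
            self.norm = norm
            self.linear = nn.Linear(in_channels, out_channels, bias=bias)

        def forward(self, x):
            # plain matmul: y_i = w_i . x = |w_i| * |x| * cos(theta_i)
            y = x.matmul(self.linear.weight.t())

            if self.norm:
                # divide by |x| so each output depends only on |w_i| and the angle
                x_norm = torch.norm(x, dim=1, keepdim=True)
                y = y / x_norm

            if self.bias:
                y = y + self.linear.bias
            return y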

sicxu commented 3 years ago

Thanks for your reply.

zimenglan-sysu-512 commented 2 years ago

hi @Jeff-sjtu is it equivalent to

    import torch.nn.functional as F

    def forward(self, x):
        if self.norm:
            # L2-normalize the input along the feature dimension,
            # which matches dividing the output by the input norm
            x_norm = F.normalize(x, dim=-1)
            y = x_norm.matmul(self.linear.weight.t())
        else:
            y = x.matmul(self.linear.weight.t())

        if self.bias:
            y = y + self.linear.bias
        return y

Jeff-sjtu commented 2 years ago

Hi @zimenglan-sysu-512, this seems to be equivalent.
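
A quick numerical check of that equivalence (shapes and names below are purely illustrative):

    import torch
    import torch.nn.functional as F

    torch.manual_seed(0)
    x = torch.randn(4, 256)    # (batch, features) input, illustrative sizes
    w = torch.randn(34, 256)   # (out_features, in_features) weight

    # option A: divide the FC output by the input norm (as in the linked line)
    y_a = x.matmul(w.t()) / torch.norm(x, dim=1, keepdim=True)

    # option B: L2-normalize the input first, then apply the weights
    y_b = F.normalize(x, dim=-1).matmul(w.t())

    print(torch.allclose(y_a, y_b, atol=1e-6))  # should print True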

zimenglan-sysu-512 commented 2 years ago

hi @Jeff-sjtu is it possible to normalize weight instead of input x?

Jeff-sjtu commented 2 years ago

> hi @Jeff-sjtu is it possible to normalize weight instead of input x?

I think it won't have a big impact on performance. BTW, you can directly use the conventional fully-connected layer.
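
For illustration only, normalizing the weight rows instead of the input could look like the sketch below; this variant is not part of the repo, and its effect on training would need to be verified.

    import torch.nn as nn
    import torch.nn.functional as F

    class WeightNormedLinear(nn.Module):
        """Hypothetical variant: L2-normalize each weight row instead of the input."""

        def __init__(self, in_channels, out_channels, bias=True):
            super().__init__()
            self.linear = nn.Linear(in_channels, out_channels, bias=bias)

        def forward(self, x):
            # each output becomes |x| * cos(theta_i) instead of |w_i| * cos(theta_i)
            w = F.normalize(self.linear.weight, dim=-1)
            y = x.matmul(w.t())
            if self.linear.bias is not None:
                y = y + self.linear.bias
            return y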

zimenglan-sysu-512 commented 2 years ago

thanks, it does use an FC layer after normalizing the input x, like this:

    import torch.nn.functional as F

    def forward(self, x):
        if self.norm:
            # feed unit-norm features into the standard FC layer
            x_norm = F.normalize(x, dim=-1)
            y = self.linear(x_norm)
        else:
            y = self.linear(x)
        return y

but it still needs the norm op, which is not friendly for quantizing the model at deployment time. If the norm op could be removed, that would be great for deployment.
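
As a concrete example of the suggestion above, a deployment-friendly head could simply use a standard fully-connected layer so no norm op appears in the graph. Note this means training with the plain layer from the start, since weights trained with the normalization would not transfer directly; the sizes below are illustrative.

    import torch.nn as nn

    feature_channels, num_joints = 2048, 17   # illustrative sizes
    # conventional FC layer: no norm op in the graph, so quantization/export is simpler
    fc_coord = nn.Linear(feature_channels, num_joints * 2)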