kevinzakka / pytorch-goodies

PyTorch Boilerplate For Research

Implementing Max Norm Correctly #1

Open bpiv400 opened 6 years ago

bpiv400 commented 6 years ago

Hi! Thanks for providing this resource! I think I found a slight error in one of your "goodies."

In your implementation of the max norm constraint, you take the norm across dimension 0 of your tensor, i.e. taking the norm of each column.

In the original paper that introduces the max norm constraint, the authors describe max norm as "constraining the norm of the incoming weight vector at each hidden unit to be upper bounded by a fixed constant c" (Srivastava et al., 2014).

Therefore, if layer $L$ has $n$ hidden units, each with $k$ inputs, we want to take $n$ norms of $k$-dimensional weight vectors. The weight parameter for a linear layer is stored as a two-dimensional tensor of shape (out_features x in_features). In terms of the above variables, this is an $n \times k$ tensor; therefore, we want to take the norm of each row. To do this in PyTorch, we need to take the norm across dimension 1.
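For concreteness, here is a minimal sketch of the constraint with the dimension fixed as described above (the function name, `max_val` default, and `eps` guard are my own choices, not from the repo):

```python
import torch


def max_norm_(param: torch.Tensor, max_val: float = 3.0, eps: float = 1e-8) -> None:
    """Rescale each row of `param` in place so its L2 norm is at most `max_val`.

    For a linear layer's weight of shape (out_features, in_features), each row
    is the incoming weight vector of one hidden unit, so we norm over dim=1.
    """
    with torch.no_grad():
        # One norm per hidden unit (per row), kept as a column for broadcasting.
        norms = param.norm(p=2, dim=1, keepdim=True)
        # Rows already within the bound are left unchanged (scale factor 1).
        desired = norms.clamp(max=max_val)
        param.mul_(desired / (norms + eps))
```

Calling this on `layer.weight` after each optimizer step (under `torch.no_grad()`) enforces the per-unit constraint from the dropout paper, whereas norming over dim 0 would instead bound one norm per input feature.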

kevinzakka commented 6 years ago

Hey @bpiv400, will answer back in the next 2-3 days, a bit busy until the weekend.

Thanks for raising the issue.