joe-siyuan-qiao / WeightStandardization

Standardizing weights to accelerate micro-batch training
545 stars 43 forks

It seems that the same method has been proposed before #18

Open netw0rkf10w opened 4 years ago

netw0rkf10w commented 4 years ago

Hi,

I just came across this ICCV'17 paper: "Centered Weight Normalization in Accelerating Training of Deep Neural Networks", which appears to propose a very similar, if not the same, method.

Would you have any comments on this? My apologies if I missed something...

Best regards.

joe-siyuan-qiao commented 4 years ago

Hi,

Yes, they are similar. A recent paper (https://arxiv.org/pdf/1911.05920.pdf) discusses the differences between WN, CWN, and WS (and proposes improved methods). Please take a look to see if it helps.

Thanks.
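
For readers following the thread, here is a minimal sketch of how the three reparameterizations are commonly described: WN rescales a weight vector to a learnable length, CWN centers it and rescales it to unit L2 norm, and WS centers it and divides by its standard deviation. This is an illustration under stated assumptions, not the code from any of the papers: it uses plain Python lists for one output channel's flattened weights, the population (biased) standard deviation, and a hypothetical `eps` placement, and the function names are made up for this example.

```python
import math

def weight_norm(v, g=1.0):
    """WN (sketch): rescale v to length g, with no centering."""
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]

def centered_weight_norm(v):
    """CWN (sketch): subtract the mean, then rescale to unit L2 norm."""
    mean = sum(v) / len(v)
    centered = [x - mean for x in v]
    norm = math.sqrt(sum(x * x for x in centered))
    return [x / norm for x in centered]

def weight_standardization(v, eps=1e-5):
    """WS (sketch): subtract the mean, then divide by the standard deviation.

    Assumption: population std and eps added to the std; real
    implementations may differ in both details.
    """
    n = len(v)
    mean = sum(v) / n
    centered = [x - mean for x in v]
    std = math.sqrt(sum(x * x for x in centered) / n)
    return [x / (std + eps) for x in centered]

# Made-up weights for a single output channel (fan-in of 4).
v = [0.5, -1.0, 2.0, 0.1]
wn = weight_norm(v, g=1.0)
cwn = centered_weight_norm(v)
ws = weight_standardization(v)

# Under these assumptions, WS and CWN point in the same direction and
# differ only by a constant factor close to sqrt(fan-in).
ratio = [a / b for a, b in zip(ws, cwn)]
```

Under these assumptions, CWN and WS produce the same direction and differ only in scale (roughly the square root of the fan-in), which is one way to see why the two methods look so similar. The linked paper discusses the differences in more depth.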

netw0rkf10w commented 4 years ago

Hi @joe-siyuan-qiao, thanks for the reference (added to my reading list)! In this case, I think it would be necessary to update/improve your WS paper with a similar (but maybe shorter) discussion, which would help future confused readers like me. Thanks.