THUMNLab / CurML

Apache License 2.0

Autograd of superloss #1

Open yuanpinz opened 1 year ago

yuanpinz commented 1 year ago

I've noticed that the SuperLoss implementation is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss). Both use scipy to compute lambertw. However, as stated in AlanChou's implementation, quoted:

The lambertw function should be implemented with PyTorch instead of using the scipy library as mentioned in https://github.com/AlanChou/Truncated-Loss/issues/3#issuecomment-753650227.

There is a mistake because the additive regularization term does not receive any gradients from autograd.

Does this implementation solve the above problem?
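For context, here is a minimal sketch of the problem being described (the values of `tau` and `lam` and the exact SuperLoss expression are illustrative placeholders, not taken from the CurML code): routing the per-sample loss through `scipy.special.lambertw` forces a `.detach()`/NumPy round trip, so the resulting confidence `sigma` is a constant as far as autograd is concerned and the additive regularization term gets no gradient.

```python
import numpy as np
import torch
from scipy.special import lambertw

tau, lam = 0.0, 1.0                              # illustrative SuperLoss parameters
loss = torch.tensor(1.3, requires_grad=True)     # per-sample task loss

# scipy only operates on NumPy arrays, so the loss must leave the autograd graph here.
beta = (loss.detach().cpu().numpy() - tau) / lam
z = np.maximum(0.5 * beta, -1.0 / np.e)          # clamp to lambertw's real domain [-1/e, inf)
sigma = torch.tensor(np.exp(-lambertw(z).real), dtype=loss.dtype)  # constant w.r.t. autograd

super_loss = (loss - tau) * sigma + lam * torch.log(sigma) ** 2
super_loss.backward()
print(loss.grad)  # gradient comes only from (loss - tau) * sigma;
                  # the lam * log(sigma)**2 term contributes nothing
```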

RishabhMaheshwary commented 1 year ago

There are some implementations of lambertw using PyTorch, as mentioned here: https://github.com/pytorch/pytorch/issues/49851#issuecomment-753483671.
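
Besides the pure-PyTorch iterative implementations discussed in that thread, one possible way to get gradients is to keep scipy for the forward evaluation and supply the analytic derivative dW/dz = exp(-W(z)) / (1 + W(z)) in a custom `torch.autograd.Function`. This is only a sketch of that idea, not code from CurML or from the linked comment:

```python
import torch
from scipy.special import lambertw as scipy_lambertw

class LambertW(torch.autograd.Function):
    """Principal branch of the Lambert W function with an analytic backward pass."""

    @staticmethod
    def forward(ctx, z):
        # Let scipy handle the numerics; take the real part of the principal branch.
        w = torch.from_numpy(
            scipy_lambertw(z.detach().cpu().numpy()).real
        ).to(dtype=z.dtype, device=z.device)
        ctx.save_for_backward(w)
        return w

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # dW/dz = exp(-W) / (1 + W); diverges at the branch point z = -1/e (W = -1).
        return grad_output * torch.exp(-w) / (1.0 + w)

def lambertw(z: torch.Tensor) -> torch.Tensor:
    return LambertW.apply(z)
```

With a differentiable lambertw like this, `sigma = torch.exp(-lambertw(z))` stays on the autograd graph, so the additive regularization term of SuperLoss would receive gradients as well.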