THUMNLab / CurML

Apache License 2.0

Autograd of superloss #1

Open yuanpinz opened 1 year ago

yuanpinz commented 1 year ago

I've noticed that the SuperLoss implementation is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss). Both use scipy to compute lambertw. However, as stated in AlanChou's implementation, quoted:

The lambertw function should be implemented with PyTorch instead of using the scipy library as mentioned in https://github.com/AlanChou/Truncated-Loss/issues/3#issuecomment-753650227.

There is a mistake because the additive regularization term does not receive any gradients from autograd.

Does this implementation solve the above problem?
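For context, here is a minimal sketch of the problem being described (the values of `tau` and `lam` and the exact SuperLoss expression are illustrative placeholders, not taken from the CurML code): routing the per-sample loss through `scipy.special.lambertw` forces a `.detach()`/NumPy round trip, so the resulting confidence `sigma` is a constant as far as autograd is concerned and the additive regularization term gets no gradient.

```python
import numpy as np
import torch
from scipy.special import lambertw

tau, lam = 0.0, 1.0                              # illustrative SuperLoss parameters
loss = torch.tensor(1.3, requires_grad=True)     # per-sample task loss

# scipy only operates on NumPy arrays, so the loss must leave the autograd graph here.
beta = (loss.detach().cpu().numpy() - tau) / lam
z = np.maximum(0.5 * beta, -1.0 / np.e)          # clamp to lambertw's real domain [-1/e, inf)
sigma = torch.tensor(np.exp(-lambertw(z).real), dtype=loss.dtype)  # constant w.r.t. autograd

super_loss = (loss - tau) * sigma + lam * torch.log(sigma) ** 2
super_loss.backward()
print(loss.grad)  # gradient comes only from (loss - tau) * sigma;
                  # the lam * log(sigma)**2 term contributes nothing
```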

RishabhMaheshwary commented 1 year ago

There are some implementations of lambertw using PyTorch, as mentioned here: https://github.com/pytorch/pytorch/issues/49851#issuecomment-753483671.
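
Besides the pure-PyTorch iterative implementations discussed in that thread, one possible way to get gradients is to keep scipy for the forward evaluation and supply the analytic derivative dW/dz = exp(-W(z)) / (1 + W(z)) in a custom `torch.autograd.Function`. This is only a sketch of that idea, not code from CurML or from the linked comment:

```python
import torch
from scipy.special import lambertw as scipy_lambertw

class LambertW(torch.autograd.Function):
    """Principal branch of the Lambert W function with an analytic backward pass."""

    @staticmethod
    def forward(ctx, z):
        # Let scipy handle the numerics; take the real part of the principal branch.
        w = torch.from_numpy(
            scipy_lambertw(z.detach().cpu().numpy()).real
        ).to(dtype=z.dtype, device=z.device)
        ctx.save_for_backward(w)
        return w

    @staticmethod
    def backward(ctx, grad_output):
        (w,) = ctx.saved_tensors
        # dW/dz = exp(-W) / (1 + W); diverges at the branch point z = -1/e (W = -1).
        return grad_output * torch.exp(-w) / (1.0 + w)

def lambertw(z: torch.Tensor) -> torch.Tensor:
    return LambertW.apply(z)
```

With a differentiable lambertw like this, `sigma = torch.exp(-lambertw(z))` stays on the autograd graph, so the additive regularization term of SuperLoss would receive gradients as well.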