yuanpinz opened this issue 1 year ago
I've noticed that the SuperLoss implementation is similar to AlanChou's unofficial implementation (https://github.com/AlanChou/Super-Loss). Both use scipy to calculate lambertw. However, as stated in AlanChou's implementation:
> The lambertw function should be implemented with PyTorch instead of using the scipy library as mentioned in https://github.com/AlanChou/Truncated-Loss/issues/3#issuecomment-753650227.
> There is a mistake because the additive regularization part doesn't have any gradients for Autograd.
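For context, here is a minimal sketch (variable names are my own, purely illustrative) of why routing a tensor through scipy cuts the Autograd graph:

```python
import torch
from scipy.special import lambertw

x = torch.tensor(1.0, requires_grad=True)

# scipy.special.lambertw only accepts NumPy values, so the tensor has to
# leave the computation graph (detach) before the call:
w = torch.as_tensor(lambertw(x.detach().numpy()).real)

print(w.requires_grad)  # False -- gradients cannot flow back to x through w
```

Any loss term built from `w` (such as the additive regularization) therefore contributes no gradient to the parameters upstream of `x`.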
Does this implementation solve the above problem?
There are some implementations of lambertw in PyTorch, as mentioned here: https://github.com/pytorch/pytorch/issues/49851#issuecomment-753483671.
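For anyone who lands here, below is a minimal sketch of a pure-PyTorch Lambert W (principal branch) via Halley's iteration. The function name, starting guess, and iteration count are my own choices, not taken from either repository:

```python
import torch

def lambertw(z: torch.Tensor, n_iter: int = 10) -> torch.Tensor:
    """Principal branch W0 of the Lambert W function, solved with Halley's
    iteration. Built entirely from torch ops, so Autograd can differentiate
    through it. Valid for z >= -1/e (the range SuperLoss needs)."""
    # log1p(z) is a cheap starting guess that stays in W0's basin over the
    # whole domain; the clamp guards log1p against arguments at or below -1.
    w = torch.log1p(torch.clamp(z, min=-0.99))
    for _ in range(n_iter):
        ew = torch.exp(w)
        f = w * ew - z  # residual of the defining equation w * e^w = z
        # Halley update: w <- w - f / (f' - f * f'' / (2 * f'))
        w = w - f / (ew * (w + 1.0) - (w + 2.0) * f / (2.0 * w + 2.0))
    return w
```

Since every step is a plain torch op, the output stays on the autograd tape and the regularization term receives gradients; the values can be checked numerically against `scipy.special.lambertw`. One caveat: at exactly z = -1/e the update degenerates (W0 = -1 makes the denominator vanish), so in practice the argument should be clamped slightly above that point.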