Issue 529261027 · Closed 2 years ago
Yes, it should be lambda * (gp_loss - 1)^2, but in my implementation I used just gp_loss.
To the best of my understanding, the gradient penalty is used to mitigate exploding and vanishing gradients. When I was training the model, I ran into exploding gradients because the loss value was very high. To keep the loss value down, I used just gp_loss, since gp_loss was smaller than lambda * (gp_loss - 1)^2 and so the magnitude of the loss stays low. This solved the exploding-gradient problem to a certain extent. Let me know if you achieve stability using the loss as written in the paper, and we can update the repo accordingly.
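For reference, here is a minimal PyTorch sketch contrasting the penalty from the WGAN-GP paper with the simplified variant described above. The critic, tensor shapes, and lambda value are illustrative placeholders, not taken from this repo:

```python
import torch

def gradient_penalty(critic, real, fake, lam=10.0):
    """Return (paper-style penalty, simplified penalty) for a batch."""
    # Random interpolation between real and fake samples
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_hat = critic(x_hat)
    # Gradient of critic output w.r.t. the interpolated inputs
    grads = torch.autograd.grad(
        outputs=d_hat,
        inputs=x_hat,
        grad_outputs=torch.ones_like(d_hat),
        create_graph=True,
    )[0]
    grad_norm = grads.norm(2, dim=1)                  # per-sample gradient norm
    paper_gp = lam * ((grad_norm - 1) ** 2).mean()    # penalty as in the WGAN-GP paper
    simplified_gp = grad_norm.mean()                  # simplified variant discussed here
    return paper_gp, simplified_gp

# Illustrative usage with a toy linear critic
critic = torch.nn.Linear(4, 1)
real = torch.randn(8, 4)
fake = torch.randn(8, 4)
paper_gp, simplified_gp = gradient_penalty(critic, real, fake)
```

Both terms are non-negative, but they push the critic toward different targets: the paper's penalty drives the gradient norm toward 1 (the 1-Lipschitz constraint), while the simplified variant drives it toward 0.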
Understood, thanks.
Hi, thanks for your code. While reading it, I had a small question about how you calculate gp_loss. The original paper writes the 1-Lipschitz penalty as (gp_loss - 1)^2, but your code uses only gp_loss. I understand that this keeps the loss small and satisfies the requirement. What I want to know is: was this choice validated by experiment? Looking forward to your reply.