Closed: YuanXun2024 closed this issue 1 year ago.
Hi Dingfan,

I am not clear about line 67 in main.py. Why is CLIP_BOUND divided by the batch size, and what is the full expression of 'GP'?

Thank you!

Hi,

Thanks for your interest.

'GP' stands for 'gradient penalty'. You can trace its full expression from line 276:

gradient_penalty = netD.calc_gradient_penalty(real_data_v.data, fake.data, real_y, L_gp, device)
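For reference, below is a minimal sketch of what a WGAN-GP-style calc_gradient_penalty typically computes. This is not the repo's exact code: there it is a method on netD and also receives the labels real_y (the discriminator is conditional), and the target norm here is the standard WGAN-GP value of 1.0, whereas the repo may target CLIP_BOUND. Treat all names and defaults as assumptions.

```python
import torch
from torch import autograd

def calc_gradient_penalty(netD, real_data, fake_data, L_gp, device):
    """Minimal WGAN-GP-style gradient penalty (sketch, not the repo's code)."""
    batch_size = real_data.size(0)

    # One interpolation coefficient per example, broadcastable over the
    # remaining (channel/spatial) dimensions.
    alpha = torch.rand([batch_size] + [1] * (real_data.dim() - 1), device=device)

    # Points on the straight lines between real and fake samples.
    interpolates = (alpha * real_data + (1 - alpha) * fake_data).requires_grad_(True)
    disc_out = netD(interpolates)

    # Gradient of the critic output w.r.t. the interpolated inputs.
    gradients = autograd.grad(
        outputs=disc_out,
        inputs=interpolates,
        grad_outputs=torch.ones_like(disc_out),
        create_graph=True,  # so the penalty itself can be backpropagated
    )[0]

    # Penalize the squared deviation of each per-example gradient norm from
    # the target (1.0 in standard WGAN-GP), weighted by L_gp.
    grad_norm = gradients.view(batch_size, -1).norm(2, dim=1)
    return L_gp * ((grad_norm - 1.0) ** 2).mean()
```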
The gradient penalty is enforced on the full-batch gradient, so that the full-batch gradient has a norm close to CLIP_BOUND (I recall verifying this in preliminary experiments). Since the full-batch gradient aggregates the contributions of all examples in the batch, the per-example gradient norm corresponds to CLIP_BOUND divided by the batch size, which is what line 67 computes.
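To make the arithmetic at line 67 concrete, here is a hedged sketch; the variable names and values are assumptions, since the exact line is not quoted in this thread.

```python
CLIP_BOUND = 1.0  # assumed target norm for the full-batch gradient
batchsize = 32    # assumed training batch size

# If the full-batch gradient has norm close to CLIP_BOUND and each of the
# batchsize examples contributes roughly equally, each per-example gradient
# has norm of about CLIP_BOUND / batchsize.
per_example_bound = CLIP_BOUND / batchsize  # 1.0 / 32 = 0.03125
```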
I hope this clarification helps. Please let me know if anything remains unclear.
That helps a lot. Thank you!