Tracking failures for WGAN-GP?

BarclayII commented 7 years ago

I wonder if the tips in https://github.com/soumith/ganhacks work under WGAN-GP as well, namely those in Sec. 10. Specifically I would like to confirm if the following is correct:

D loss is a large negative value: failure mode.
If loss of G steadily decreases (or D(G(z)) steadily increases), then G is fooling D with garbage.

igul222 commented 7 years ago

Those don't typically apply. You in fact want the D loss to be a large negative value within the first ~1000 iters, and then want it to increase steadily (i.e., move closer to 0) over time. Looking at the G loss probably isn't useful.
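As a rough illustration of that check, here is a minimal sketch in plain Python. It assumes the usual WGAN-GP critic loss, mean D(fake) - mean D(real) plus an already-weighted gradient penalty. The helper names (`critic_loss`, `moving_toward_zero`) and the window size are hypothetical and not part of this repo.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of floats."""
    values = list(values)
    return sum(values) / len(values)


def critic_loss(d_real, d_fake, penalty):
    """Critic (D) loss for one batch: mean D(fake) - mean D(real) + penalty.

    `penalty` is assumed to be the gradient penalty already multiplied
    by its weight (lambda).
    """
    return mean(d_fake) - mean(d_real) + penalty


def moving_toward_zero(losses, window=100):
    """Return True if the D loss is, on average, closer to 0 in the newer
    half of the last `window` logged values than in the older half."""
    recent = list(losses)[-window:]
    if len(recent) < 2:
        return False  # not enough history to compare
    half = len(recent) // 2
    older, newer = recent[:half], recent[half:]
    return abs(mean(newer)) < abs(mean(older))


# Example: a loss that starts strongly negative and drifts toward 0.
logged = [-10.0, -8.0, -6.0, -4.0, -2.0, -1.0]
healthy = moving_toward_zero(logged)
```

This only compares two halves of a window, so it is a coarse trend check; plotting the logged D loss is still the more reliable way to judge a run.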


Kaede93 commented 4 years ago

@igul222

Hi, I'm working on a project using your WGAN algorithm, and I understand why the Wasserstein loss can be negative according to its definition.

But can you tell me what the negative values stand for? Does the Wasserstein loss have a direction or something? When I train the WGAN, the Wasserstein loss sometimes oscillates around zero. For example, when the W loss is -0.05 versus 0.05, do the two indicate the same performance? And how about -0.5 versus 1?

One more thing: I tried adding other losses, such as a perceptual loss or SSIM (these are always positive, and we want them to be small), to the Wasserstein loss in the generator loss. Is the generator loss reasonable in this situation? Because the W loss can be negative, the total generator loss can still get smaller even when the other losses get larger, so I think the generator will be confused and fail to converge.

Thank you for your time, and I'm looking forward to your help. Have a nice day!
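One way to probe the combined-loss concern is a toy example. The sketch below is an assumption-laden illustration, not code from this repo: it uses a hypothetical linear critic D(x) = w*x + b, a one-parameter generator G(z) = theta*z, and a squared error standing in for the perceptual/SSIM term. Changing the critic offset `b` flips the sign of the total generator loss, but the gradient with respect to `theta`, which is what the optimizer actually follows, stays the same.

```python
def generator_loss(theta, z, w, b, target, lam):
    """Toy generator loss: -D(G(z)) + lam * aux.

    D(x) = w * x + b is a hypothetical linear critic, G(z) = theta * z a
    one-parameter generator, and aux a squared error standing in for a
    perceptual or SSIM-style term.
    """
    x = theta * z
    critic_term = -(w * x + b)
    aux_term = (x - target) ** 2
    return critic_term + lam * aux_term


def numeric_grad(f, theta, eps=1e-6):
    """Central finite-difference estimate of df/dtheta."""
    return (f(theta + eps) - f(theta - eps)) / (2 * eps)


z, w, target, lam, theta0 = 1.5, 2.0, 1.0, 0.1, 0.5

losses, grads = {}, {}
for b in (-100.0, 0.0, 100.0):
    # Bind b as a default argument so each closure keeps its own offset.
    loss_fn = lambda t, b=b: generator_loss(t, z, w, b, target, lam)
    losses[b] = loss_fn(theta0)              # sign and size depend on b
    grads[b] = numeric_grad(loss_fn, theta0)  # does not depend on b
```

In this toy setting the loss value ranges from large positive to large negative across the three offsets while the gradient is identical, which suggests that the sign of the W term alone does not confuse the generator; the relative weighting `lam` between the terms is what changes the update direction.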

igul222 / improved_wgan_training

Tracking failures for WGAN-GP? #44