bstriner / keras-adversarial

Keras Generative Adversarial Networks
MIT License

Image-to-Image Translation with Conditional Adversarial Networks #18

Open nikiladonya opened 7 years ago

nikiladonya commented 7 years ago

Hello! Thanks for the great tool. But there is one question. Is it possible to implement this idea (https://arxiv.org/pdf/1611.07004v1.pdf) with the help of your library?

bstriner commented 7 years ago

Of course! keras-adversarial is just a tool for combining multiple keras models into a single training function and a single call to the GPU. If you can build the separate models in keras, then you can use keras-adversarial to train them simultaneously, track the combined metrics, etc.

Those drawings are really pretty, but I feel like the architecture is going to change radically with all the new work on Wasserstein GANs. You could build a Wasserstein GAN in keras-adversarial, but the research hasn't settled down yet.

Anyways, just build the overall model, create separate lists of parameters for each player, then compile and train it. If you have some specific questions I would be happy to help but it should be relatively straightforward.

You may want to do something like using the Keras ImageDataGenerator to feed images into fit_generator.
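For the image pipeline, a minimal sketch of that idea (the sizes and the paired_batches helper are hypothetical; two generators share a seed so image and mask augmentations stay aligned):

```python
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

# Same augmentation settings and same seed keep images and masks in sync
img_gen = ImageDataGenerator(rescale=1.0 / 255, horizontal_flip=True)
mask_gen = ImageDataGenerator(rescale=1.0 / 255, horizontal_flip=True)

def paired_batches(images, masks, batch_size=32, seed=1):
    # images, masks: numpy arrays of shape (n, height, width, channels)
    xi = img_gen.flow(images, batch_size=batch_size, seed=seed)
    yi = mask_gen.flow(masks, batch_size=batch_size, seed=seed)
    while True:
        yield next(xi), next(yi)
```

Batches from a generator like this can be fed to fit_generator, or wrapped further to attach the adversarial targets.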

Cheers

JaneyWit commented 7 years ago

@bstriner could you provide some code snippets to demonstrate how to use keras-adversarial when the goal is semantic segmentation, e.g. https://arxiv.org/abs/1611.08408. It's not clear to me how to provide training input to a model created with simple_gan(generator, discriminator, None)

In examples/gan_convolutional.py, suppose the generator model is changed to expect input samples (unsegmented images) of size (1, 28, 28) and to output segmentations of the same size.

When I call model.fit, how do I give input to both the generator (i.e. unsegmented images) and the discriminator (a mixture of generator output and true segmentations) ?

Thank you

bstriner commented 7 years ago

simple_gan is a helper if you're doing something simple. If you're doing something different, build Keras models with all of the inputs and outputs of the right shapes and build an adversarial model out of them. However, from what you said, I don't see why the simple GAN doesn't apply.

You do not train the discriminator individually. A naive approach is to generate data and then pass it to the discriminator but that is a waste. Build a combined model where your discriminator receives generator output and real samples. The input to the combined model is latent samples and real samples, not generated and real samples.

The confusion comes from a lot of examples where you generate fake data, concatenate it with real data, and train the discriminator on the combined data. It is easier to have two inputs, apply a shared discriminator, and have two outputs (or subtract them and have one output). This is exactly the same except in edge cases like using batchnorm, but batchnorm doesn't work with discriminators anyway.
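A sketch of that two-input, shared-discriminator wiring in the Keras functional API (layer sizes are arbitrary, for illustration only):

```python
import numpy as np
import keras

latent_dim, data_dim = 16, 64  # arbitrary sizes for illustration

# Generator: latent -> fake sample
z = keras.Input(shape=(latent_dim,))
x_fake = keras.layers.Dense(data_dim)(keras.layers.Dense(32, activation="relu")(z))

# Discriminator layers defined once, applied twice -> shared weights
d_hidden = keras.layers.Dense(32, activation="relu")
d_score = keras.layers.Dense(1, activation="sigmoid")

x_real = keras.Input(shape=(data_dim,))
y_fake = d_score(d_hidden(x_fake))   # score for generated samples
y_real = d_score(d_hidden(x_real))   # score for real samples

# One combined model, two outputs, no concatenation of real and fake data
combined = keras.Model([z, x_real], [y_fake, y_real])
```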

simple_gan is a short function that links the generator and discriminator together as described above, but isn't perfect for all models.

So, imagine you build that combined model by combining separate keras models. You then pass the combined model to the AdversarialModel constructor, and pass the separated weights as the player_params.

When I call model.fit, how do I give input to both the generator (i.e. unsegmented images) and the discriminator (a mixture of generator output and true segmentations) ?

Simplest version: build a combined model with inputs: latent samples and real samples. The generator produces fake samples. The discriminator is applied once to the real samples and once to the fake samples so the model has two outputs. Pass this model to AdversarialModel, and it will build the adversarial version, which has four targets (2 for each player). Pass the inputs and targets so one player tries to push the outputs one way and the other goes the other direction.

The combined model should have all inputs and all outputs. This is because at each batch, the model is collecting all of the metrics. Depending on the optimizer you use, it might only train one player or another player, but a call to train_on_batch has to return all of the metrics every time so it works with the rest of Keras.

The targets that you give and the configuration of the optimizer determine what parts of the model are trained when and towards what objective but the easiest configuration is a single combined model.

I'm not going to implement semantic segmentation myself, but if you have questions or need help, sure.

Cheers

JaneyWit commented 7 years ago

Thanks @bstriner that's very helpful. The problem was indeed that I was accustomed to the naive examples where the discriminator is trained separately. I think I have grasped what I need to do :-)

bstriner commented 7 years ago

@JaneyWit No problem. If you have a single GPU call that generates fakes and trains on them you should see a performance boost over generating fakes on the GPU, concatenating with real data on the CPU, then calling the GPU again.

JaneyWit commented 7 years ago

@bstriner, I'm a bit confused by the names assigned to the standard gan_targets (1, 0, 0, 1) (gfake, greal, dfake, dreal).
As I understand it, the idea is that when the generator is being trained (with discriminator weights held fixed) the target should always be "real", and when the discriminator is being trained (with generator weights held fixed) the target should always be "fake". I would therefore expect (gfake, greal) to equate to (0, 1) and (dfake, dreal) to (1, 0). Is this just a naming/ordering issue, or is my understanding flawed? Many thanks.

bstriner commented 7 years ago

gfake, greal, dfake and dreal are all discriminator output targets. (gfake, dfake) and (greal, dreal) are the same output tensors and loss functions with different targets, one for generator one for discriminator.

Those values are the ytrue used in binary_crossentropy. Traditionally, discriminators try to give 1 to real and 0 to fake, so the loss is binary_crossentropy(0, yfake) + binary_crossentropy(1, yreal). Generator loss is the inverse of that.
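In plain numbers, a pure-Python sketch of those two losses (the 0.2 and 0.9 discriminator outputs are made up for illustration):

```python
import math

def bce(target, p, eps=1e-7):
    # binary cross-entropy for a single prediction p in (0, 1)
    p = min(max(p, eps), 1 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

y_fake, y_real = 0.2, 0.9  # hypothetical discriminator outputs

# Discriminator wants fakes scored 0 and reals scored 1:
d_loss = bce(0, y_fake) + bce(1, y_real)
# Generator reuses the same outputs with the targets flipped:
g_loss = bce(1, y_fake) + bce(0, y_real)
```

With a discriminator that is already doing well, d_loss comes out small and g_loss large, which is what drives the generator's updates.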

That being said, this is just the traditional formulation. If you switch the values everything will still work fine. There is no reason why the discriminator shouldn't try to give 0 to real and 1 to fake. Think of it like a bit. There is no reason whether high or low should be 0 or 1 and you can make a system either way.

As long as the generator targets are the opposite of the discriminator targets, you should be able to get similar behavior.

Generator weights are updated using the losses from gfake and greal. Discriminator weights use dfake and dreal.

The discriminator has to learn to discriminate. When you train the discriminator, the fake values should be near 0 and the real values should be near 1. That is why dfake, dreal is (0, 1).

When you train the generator, it is trying to confuse the discriminator into making fake values near 1. That is why gfake is 1. Setting greal to 0 provides a nice little normalization calculation. The derivative of the real values w/r/t the generator is nothing (because the generator only affects generated values). greal doesn't actually affect learning but makes everything nice and symmetric.
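The ordering can be sketched in pure Python (the library's gan_targets returns numpy arrays rather than lists, but the values are the same):

```python
def gan_targets_sketch(n):
    # Four target vectors, ordered (gfake, greal, dfake, dreal)
    gfake = [1.0] * n  # generator wants the discriminator to call fakes real
    greal = [0.0] * n  # symmetric term; its gradient w.r.t. the generator is zero
    dfake = [0.0] * n  # discriminator should call fakes fake
    dreal = [1.0] * n  # discriminator should call reals real
    return [gfake, greal, dfake, dreal]
```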

Cheers

JaneyWit commented 7 years ago

@bstriner - thanks. I realised from reading your post that I didn't really understand how the output was structured. I thought that (gfake, greal) was a pair indicating probabilities of the sample being fake/real respectively (which of course doesn't make sense as targets!) . I get it now, so thanks!

rjpg commented 6 years ago

I did not understand "gfake, greal". Something from the generator is always fake, right?

The generator is trained through the complete model G->D->output ... G updates its weights based on the "difference" between what it produced and what D was expecting ... G learns what D expects through backprop. I think (?)