GarlandZhang / hairy_gan


cgan for hair colour #2

Closed GarlandZhang closed 4 years ago

GarlandZhang commented 4 years ago
  1. modify the existing GAN to use a conditional generator and discriminator.
  2. ideally we want a binary vector input representing a combination of attributes. we extract these labels from the property columns, keeping only whether each value is 0 or 1 (in our case, -1 or 1).
  3. training will use these binary vector inputs
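A minimal sketch of step 2, assuming CelebA-style attribute rows where each property is -1 or 1 (the column names here are just illustrative examples):

```python
# Convert -1/1 attribute values into a 0/1 binary vector, preserving
# column order. Rows stand in for parsed lines of the attribute file.
rows = [
    {"Blond_Hair": 1, "Black_Hair": -1, "Smiling": 1},
    {"Blond_Hair": -1, "Black_Hair": 1, "Smiling": -1},
]
columns = ["Blond_Hair", "Black_Hair", "Smiling"]

def to_binary_vector(row, columns):
    """Map each -1/1 attribute to 0/1."""
    return [1 if row[c] == 1 else 0 for c in columns]

labels = [to_binary_vector(r, columns) for r in rows]
print(labels)  # [[1, 0, 1], [0, 1, 0]]
```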

problem: given 40 features, we have 2^40 (approx. 1 trillion) combinations... this is far too many types to train on.

Instead, we can train on one feature at a time and encode its label as the index at which that feature appears. For example, if blond hair is the 27th property, then we use the label 27.
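The index encoding above could look like this (the attribute list is truncated and the names/positions are hypothetical):

```python
# Encode a single attribute by its column index instead of a 2^40
# binary combination. Only a few attribute names shown for brevity.
attributes = ["5_o_Clock_Shadow", "Arched_Eyebrows", "Blond_Hair"]

def attribute_label(name):
    """Return the integer label for a single attribute: its index."""
    return attributes.index(name)

print(attribute_label("Blond_Hair"))  # 2
```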

For training, since we have multiple features for each image, we can train on one feature at a time. For example, suppose we are only interested in turning any person's hair color to blond. Then we can simply ignore the other features and just look at the blond hair column for all the inputs. Once the generator becomes good at generating blond hair, we can work on a different feature.

problem 2: what happens if we have an image that doesn't have blond hair? Then we have a -1 property value, and what would the generator generate? Whereas with digits every input value is something we can generate, "non-blond hair" isn't. At the very least, we should define it more clearly, for example black hair as the default.

Since hair colors are mutually exclusive, every image in the dataset is true for exactly one of those values. So we should find the corresponding index for the hair color each image belongs to, so the generator actually trains on a real hair color (instead of potentially returning gibberish "non-blond" hair).
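One way to sketch that lookup, assuming the mutually exclusive hair-colour columns below (names are illustrative):

```python
# Map each image to the index of the one hair-colour column that is 1.
hair_columns = ["Black_Hair", "Blond_Hair", "Brown_Hair", "Gray_Hair"]

def hair_class(row):
    """Return the class index of the hair colour present in this row."""
    for i, col in enumerate(hair_columns):
        if row[col] == 1:
            return i
    return None  # no hair colour labelled; skip or handle separately

row = {"Black_Hair": -1, "Blond_Hair": 1, "Brown_Hair": -1, "Gray_Hair": -1}
print(hair_class(row))  # 1
```

Rows where every hair column is -1 (e.g. bald or unlabeled) return `None`, so the caller can filter them out rather than feed the generator an undefined class.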

problem 3: with digits, our cgan simply took a latent vector and a label, fed them to a generator, and output (ideally) a synthetic representation of the label. Since we are working with images, however, what do we expect the generator to output exactly? If we feed it the label "blond hair" then I expect the model to output an image based on the input, but with blond hair! So what is the discriminator outputting? Recall the discriminator determines if an output is real or fake (but is not interested in the actual value itself). With digits, we pass an image and corresponding label to the discriminator and return 0 (fake) or 1 (real). The discriminator learns by being penalized for thinking the image is fake when in fact it is real, or vice versa.

Similarly, with face images, we pass the transformed face image (or the original) and corresponding label (property index) to the discriminator and return 0 (fake) or 1 (real). And similarly, the discriminator learns by being penalized for thinking the image is fake when in fact it is real, or for thinking a transformed image is real (say, a black-haired person passed off as a blond-haired person).

So, when we pass real images, we expect the discriminator to output 1. When we pass fake images, we expect the discriminator to output 0. The generator learns by being penalized whenever the discriminator catches its images as fake, i.e. fails to produce convincing images.
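A minimal sketch of how the discriminator can consume an image plus a label: flatten the image, one-hot encode the property index, and concatenate. This is only the input construction; the real model would be a conv net, and all shapes here are illustrative:

```python
import numpy as np

N_PROPERTIES = 40  # number of attribute columns in the dataset

def discriminator_input(image, label_index):
    """Concatenate a flattened image with a one-hot property label."""
    one_hot = np.zeros(N_PROPERTIES)
    one_hot[label_index] = 1.0
    return np.concatenate([image.ravel(), one_hot])

image = np.random.rand(64, 64, 3)  # stand-in for a face image
x = discriminator_input(image, label_index=27)
print(x.shape)  # (12328,) = 64*64*3 + 40
```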

Summary of training:

input: face image + label (property index)
output: 0 (fake) or 1 (real)

one last problem... whereas with digits we have mutual exclusiveness, now we don't. So the discriminator might initially learn to believe these classes are mutually exclusive (? this might not be true, but I'm just considering it).