wiseodd / generative-models

Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow.
http://wiseodd.github.io
The Unlicense

Question about one hot datasets #45

Closed: thiscantbetaken closed this issue 6 years ago

thiscantbetaken commented 6 years ago

Thanks for making these generative models available, they are very interesting.

I have a one-hot dataset that I would like to test with these various models, and I am curious what you would recommend for adapting the existing MNIST dataset import to load one-hot encoded values. I have a flat file that contains 30K lines; each line is one-hot encoded with a total of 64 labels, resulting in 5,632 input values/neurons per line.

I was under the mistaken impression that it would be a seamless transition based on the one_hot=True import directive from TensorFlow, only to figure out that the MNIST dataset is float32.
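For readers with a similar setup, here is a minimal sketch of a drop-in replacement for the MNIST data object. It assumes the flat file holds one sample per line with comma-separated 0/1 values (the delimiter is a guess, since the file format is not described above). `OneHotDataset` and `load_one_hot_file` are hypothetical names, not part of this repository. `next_batch` returns a `(batch, None)` pair only to mirror the `(images, labels)` tuple that TensorFlow's MNIST helper returns.

```python
import numpy as np


class OneHotDataset:
    """Hypothetical stand-in for an MNIST-style data object with next_batch()."""

    def __init__(self, data, seed=0):
        # Cast the integer 0/1 values to float32, the dtype MNIST images use.
        self.data = np.asarray(data, dtype=np.float32)
        self.rng = np.random.default_rng(seed)

    @property
    def num_examples(self):
        return self.data.shape[0]

    def next_batch(self, batch_size):
        # Draw distinct random rows; batch_size must not exceed num_examples.
        idx = self.rng.choice(self.num_examples, size=batch_size, replace=False)
        # There are no labels here, so the second element is None.
        return self.data[idx], None


def load_one_hot_file(path, delimiter=","):
    # Assumption: one sample per line, values separated by `delimiter`.
    return OneHotDataset(np.loadtxt(path, delimiter=delimiter, ndmin=2))
```

With this in place, a training loop written against `X_mb, _ = data.next_batch(mb_size)` would receive float32 rows of width 5,632, provided the model's input dimension is changed from 784 to match.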

Will I have to use a different loss function with your code to support simple integer-valued one-hot data?

Thanks in advance

wiseodd commented 6 years ago

A GAN won't work with discrete data out of the box: sampling discrete values is non-differentiable, so the generator can't receive gradients from the discriminator. You might want to try a VAE instead. Or, if you insist on a GAN, you might want to review some papers on GANs in the NLP domain. They've got some tricks to make GANs work with discrete data.
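One trick from that literature is the Gumbel-softmax relaxation, which replaces a hard one-hot sample with a smooth approximation. It is shown here only as an illustration, not necessarily the approach meant above. The sketch below is a NumPy forward pass, so it computes no gradients; in TensorFlow or PyTorch the same operations would be differentiable. The 88 × 64 layout is an assumption, inferred from 5,632 = 88 × 64.

```python
import numpy as np


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Relaxed one-hot sample over the last axis (forward pass only)."""
    rng = np.random.default_rng() if rng is None else rng
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1).
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    # Lower tau pushes the output closer to a hard one-hot vector.
    y = (logits + g) / tau
    # Numerically stable softmax over the class axis.
    y = y - y.max(axis=-1, keepdims=True)
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
# Assumed layout: 88 groups of 64 classes each, i.e. 5,632 values per sample.
logits = rng.normal(size=(2, 88, 64))
soft = gumbel_softmax(logits, tau=0.5, rng=rng)
# Flatten back to the 5,632-wide rows a discriminator would see.
flat = soft.reshape(2, -1)
```

Each group of 64 values in `soft` is non-negative and sums to 1, so the discriminator receives something close to one-hot data while the generator's output stays a smooth function of its logits.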