GAN

Links

Towards Data Science
Synthesizing Tabular Data using Generative Adversarial Networks
Modeling Tabular Data using Conditional GAN
CTAB-GAN: Effective Table Data Synthesizing
CTAB-GAN: github
Data Synthesis based on Generative Adversarial Networks
Thoughts
Maximilian:
- Using a conditional GAN approach seems kind of promising. In the end, the trained network could be given a conditional vector containing ingredients for which a recipe should be generated.
- If I have understood the papers correctly, the networks are trained to generate a single row of a certain table representation within one run. Matching this to our recipe problem, the ingredient list for a recipe might have to be represented by a table row
- CTAB-GAN is the latest of these papers. They mention to outperform the other approaches. They even mention how to handle empty values, which could help us to generate ingredient lists of various length.
Marcel:
- I agree that conditional GAN sounds / looks pretty promising. However im still concerned about the fact if we can transform our recipe data into a meaningful table representation. When looking at the example of CTAB-GAN page 4 we would need one hot encoding for all possible ingredients for a variable amount of repeating columns.
  VAE
  
  Links
- Towards Data Science
- Wikipedia VAE
- Wikipedia Autoencoder
- Youtube Video
- Robust Variational Autoencoder for Tabular Data with β Divergence
- Robust Variational Autoencoder
- Conditional Variational Autoencoder

Marcel:
- In addition to @mscholl96 comment for the GANs the same can be applied for VAE. With conditional VAE, where the latent space representation will be constrained by another variable to influence the data which is generated (see e.g. CVAE).
- From my feeling so far the VAE could fit even better than GAN's. Since from my understanding when inputting ingredients lists of different length there will be a latent space representation of different regions with different lengths and different recipe combinations. The latent space then would also represent that e.g. a recipe with a certain ingredient is typically longer than others.

Could be usefull for the costrain of either the GAN or VAE. We could input a single vector of all available ingredients.