simondemeule / IFT6269-project


Datasets to use #2

Closed by smolPixel 3 years ago

smolPixel commented 3 years ago

Issue for choosing the datasets. I think MNIST is a must because it's a classic. The door numbers dataset (SVHN) would be good to have too, because I think it has been used quite a lot in generation. Then I feel like a more complex one could be cool for comparing to SOTA (maybe faces?). Ideally I think we would have five datasets in a pipeline that is easy to run.

smolPixel commented 3 years ago

Rewatching the end of the class, I'm thinking it could be interesting to have one toy dataset where we try to match predefined distributions instead of images, which could help in the analysis (e.g. generate a dataset from a distribution in K dimensions and try to match it).

carlito387 commented 3 years ago

We will not use the celebrity faces dataset (unless we have a lot of time left at the end). We keep MNIST, CIFAR, SVHN (door numbers), and custom datasets (Gaussian + maybe a mixture of Gaussians).
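
For the custom datasets, here is a minimal sketch of how the toy data could be generated, assuming isotropic components and using only the standard library. The function name `sample_gaussian_mixture`, its parameters, and the example means, standard deviations, and weights are all hypothetical, not something decided in this thread. A single Gaussian is just the case with one component.

```python
import random


def sample_gaussian_mixture(n_samples, means, stds, weights, seed=0):
    """Draw points from a mixture of isotropic Gaussians (illustrative sketch).

    means:   list of K-dimensional mean vectors, one per component
    stds:    one standard deviation per component, shared across dimensions
    weights: mixing weights, one per component (need not sum to 1)
    """
    rng = random.Random(seed)  # fixed seed so the toy dataset is reproducible
    samples, labels = [], []
    for _ in range(n_samples):
        # Pick a component according to the mixing weights.
        c = rng.choices(range(len(means)), weights=weights)[0]
        # Sample each coordinate independently around that component's mean.
        samples.append([rng.gauss(m, stds[c]) for m in means[c]])
        labels.append(c)
    return samples, labels


# Hypothetical settings: K = 2 dimensions, two equally weighted components.
means = [[-2.0, 0.0], [2.0, 0.0]]
data, labels = sample_gaussian_mixture(
    1000, means, stds=[0.5, 0.5], weights=[0.5, 0.5]
)
```

Since the true distribution is known here, generated samples can be compared directly against it, which is the kind of analysis the toy dataset is meant to help with.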