naturomics / DLF

Code for reproducing results in "Generative Model with Dynamic Linear Flow"
https://arxiv.org/abs/1905.03239

Imagenet dataset #2

Open yang-song opened 5 years ago

yang-song commented 5 years ago

Could you double check that your small imagenet datasets are the same as http://image-net.org/small/train_32x32.tar and http://image-net.org/small/valid_32x32.tar? As far as I know, different preprocessing on imagenet can greatly affect the likelihood you get. For example, using the imagenet 32x32 dataset from https://patrykchrabaszcz.github.io/Imagenet32/ easily yields a bpd around 3.80 for flow models. It is weird that your model has such a big advantage on imagenet, but not on CIFAR-10.

naturomics commented 5 years ago

Could you double check that your small imagenet datasets are the same as http://image-net.org/small/train_32x32.tar, and http://image-net.org/small/valid_32x32.tar?

Yes, it's from there. I used the scripts from the Glow repo to generate the tfrecord files, so the imagenet dataset is the same as in Glow.

using the imagenet 32x32 dataset from https://patrykchrabaszcz.github.io/Imagenet32/ easily yields a bpd around 3.80

How many iterations did you run? 3.80 bits/dim is reasonable. As you may have noticed, the results in our paper are reported within 50 epochs, so the model is not fully converged (I suddenly realized that I am so poor; I'm looking for an offer from anyone who can support me with hundreds of GPUs, wow). If you have GPUs available, we would welcome and appreciate it if you trained the model for more iterations on the same dataset and reported back the results.
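For anyone comparing numbers across papers: bits/dim is just the negative log-likelihood in nats divided by the number of dimensions and by ln 2. A minimal sketch of the conversion (the sample NLL value below is hypothetical, chosen only to land near the 3.80 bpd mentioned above):

```python
import math

def bits_per_dim(nll_nats, height, width, channels):
    """Convert a per-image negative log-likelihood (in nats)
    to bits per dimension."""
    num_dims = height * width * channels
    return nll_nats / (num_dims * math.log(2))

# A hypothetical 32x32x3 image with an NLL of ~8090 nats
# comes out to roughly 3.80 bits/dim.
print(bits_per_dim(8090.0, 32, 32, 3))
```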

It is weird that your model has such a big advantage on imagenet, but not on CIFAR-10

Yes, it's weird. It's easy to overfit on CIFAR-10; we tried smaller models and regularization but saw no improvement. We think this is because 1) CIFAR-10 images are blurrier and there are fewer samples compared to imagenet 32x32; 2) as discussed in our paper, the dynamic linear transformation learns to predict mu and scale for each input, which suggests we may need more data to cover the distribution of the dataset.

naturomics commented 5 years ago

different preprocessing on imagenet can greatly affect the likelihood you get

Confirmed, you're right. I tested this repo on CelebA 256x256 and ImageNet 32x32, downsampling them (to 64x64 and 16x16, respectively) with different methods of the tf.image.resize_images API, and observed a gap of about 0.5 bits/dim between the different downsampling methods.
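The effect is easy to see even without TensorFlow: different resampling methods produce different downsampled pixels, so any density model evaluated on them will assign different likelihoods. A toy numpy sketch (nearest-neighbor vs. 2x2 mean pooling, on a made-up 4x4 image):

```python
import numpy as np

# Toy 4x4 "image" with deterministic values
img = np.arange(16, dtype=np.float64).reshape(4, 4)

# Nearest-neighbor 2x downsample: keep the top-left pixel of each 2x2 block
nearest = img[::2, ::2]

# Area/mean 2x downsample: average each 2x2 block
mean = img.reshape(2, 2, 2, 2).mean(axis=(1, 3))

# The two methods give different 2x2 results, so a density model would
# report different likelihoods (and hence different bits/dim) on each.
print(nearest)
print(mean)
print(np.array_equal(nearest, mean))  # False
```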

Our results reported in the paper were obtained with the same preprocessing as in Glow, so they are not affected by this issue.