google-deepmind / dsprites-dataset

Dataset to assess the disentanglement properties of unsupervised learning methods
Apache License 2.0

Normal that shapes look distorted? #3

Closed Spenhouet closed 5 years ago

Spenhouet commented 5 years ago

I'm wondering if I made a mistake when loading the dataset or if it is normal that the shapes look distorted?

Examples:

[Three example images of the distorted shapes]

I'm loading these from the dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz file like this:

import os
import numpy as np

file_name = 'dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz'
dataset_zip = np.load(os.path.join(data_dir, 'dSprites', file_name), encoding='latin1')
images = np.reshape(dataset_zip['imgs'], (num_samples, size, size, channels))
...

And then yielding these via the dataset API.

Azhag commented 5 years ago

Heya!

Yes, unfortunately these are rasterization artefacts from rendering rotated shapes at low resolution. There was no easy way to get cleaner antialiasing, so this can happen. Even though the shapes were defined as vectors (and could have infinite precision), I did not upscale -> render -> downscale when rendering them to raster images with LuaLove, so the render lost information. As this dataset is binary, there is no way around that issue. We could render in RGB/grayscale instead, but that would be for another dataset :)
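To make the upscale -> render -> downscale point concrete, here is a minimal NumPy sketch. It is not the LuaLove code used to generate dSprites; the function `render_square` and all of its parameters are hypothetical and only illustrate the idea. Sampling each pixel once gives hard 0/1 edges, while supersampling and averaging gives fractional edge coverage, which a binary image cannot store.

```python
import numpy as np

def render_square(size, half_width, angle_rad, supersample=1):
    """Rasterize a centred, rotated square; return per-pixel coverage in [0, 1].

    Hypothetical helper for illustration only, not the dSprites renderer.
    """
    n = size * supersample
    # Sample-point centres in output-pixel units, origin at the image centre.
    coords = (np.arange(n) + 0.5) / supersample - size / 2.0
    x, y = np.meshgrid(coords, coords)
    # Rotate the sample points into the square's own frame.
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    u = c * x + s * y
    v = -s * x + c * y
    inside = (np.abs(u) <= half_width) & (np.abs(v) <= half_width)
    # Average each supersample x supersample block down to one output pixel.
    blocks = inside.reshape(size, supersample, size, supersample)
    return blocks.mean(axis=(1, 3))

angle = np.deg2rad(20)
direct = render_square(64, 10.0, angle)                  # one sample per pixel: only 0 or 1
smooth = render_square(64, 10.0, angle, supersample=8)   # fractional coverage along the edges
binary_again = (smooth >= 0.5).astype(np.uint8)          # thresholding discards that coverage
```

Plotting `direct`, `smooth` and `binary_again` side by side (for example with matplotlib's `imshow`) should show that the staircase edges come back as soon as the image is forced to be binary again.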

If you have a look at the demo notebook, the same happens, but might be less pronounced for other shapes. Can you confirm that shapes that are not rotated look ok?

(As an aside, in my usage I did not need to specify the encoding for np.load, but that may differ with the Python version?)

Spenhouet commented 5 years ago

Hi

Thank you for your fast reply.

Good to know. I really expected that I had made a loading mistake somewhere. Thank you for confirming that this is how it is supposed to be. Yes, shapes that are not rotated look fine. :)

I'm using Python 3, and when no encoding is specified the loading fails. The np.load documentation states:

encoding ... Only useful when loading Python 2 generated pickled files in Python 3 ...
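For anyone hitting the same thing, a small loader sketch under a few assumptions: the helper name `load_dsprites` is made up, the `'metadata'` key is taken from the demo notebook rather than from this thread, and `allow_pickle=True` is added because recent NumPy releases refuse to unpickle object arrays without it. Only pickled entries should be affected by `encoding`; a plain numeric array such as `'imgs'` is read without pickle.

```python
import numpy as np

def load_dsprites(path):
    """Load images and metadata from a dSprites-style .npz archive.

    Hypothetical helper; the 'metadata' key is assumed from the demo notebook.
    """
    # encoding='latin1' lets Python 3 decode byte strings pickled by Python 2.
    # allow_pickle=True is an assumption needed on recent NumPy versions.
    with np.load(path, encoding='latin1', allow_pickle=True) as archive:
        images = archive['imgs']
        # 'metadata' is stored as a 0-d object array; [()] unwraps the object.
        metadata = archive['metadata'][()]
    return images, metadata
```

Usage would be something like `images, metadata = load_dsprites(os.path.join(data_dir, 'dSprites', file_name))`, with `data_dir` and `file_name` as in the snippet above.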

Thanks again for the fast clarification.