google-deepmind / dsprites-dataset

Dataset to assess the disentanglement properties of unsupervised learning methods
Apache License 2.0

hdf5 version does not contain all the fields #4

Closed: alexrakowski closed this issue 4 years ago

alexrakowski commented 4 years ago

Contrary to what is stated, the .hdf5 version of the dataset does not contain all the fields of its .npy counterpart; for example, the metadata field is missing. Working with the NumPy file is inconvenient, because the whole dataset has to be loaded at once and it is easy to run into memory issues.

Azhag commented 4 years ago

Hello, thanks for your message!

I packed the metadata into HDF5 attributes, as it seemed like the best way to do that. Please let me know if something is missing, as I'm not actively using the hdf5 version myself unfortunately.

The layout isn't easy to discover, but the following snippet shows where everything is:

import h5py

dataset_hdf5 = h5py.File('dsprites_ndarray_co1sh3sc6or40x32y32_64x64.hdf5', 'r')

def print_attrs(name, obj):
  # Called by visititems() once for every group and dataset in the file.
  print("=================")
  print("Name:", name)
  print("Data:", obj)
  print("Attributes:")
  # attrs.items() works on Python 3; attrs.iteritems() was Python 2 only.
  for key, val in obj.attrs.items():
    print("\t{}: {}".format(key, val))

dataset_hdf5.visititems(print_attrs)

Output for me:

=================
Name: imgs
Data: <HDF5 dataset "imgs": shape (737280, 64, 64), type "|u1">
Attributes:
    date: April 2017
    description: Disentanglement test Sprites dataset.Procedurally generated 2D shapes, from 6 disentangled latent factors.This dataset uses 6 latents, controlling the color, shape, scale, rotation and position of a sprite. All possible variations of the latents are present. Ordering along dimension 1 is fixed and can be mapped back to the exact latent values that generated that image.We made sure that the pixel outputs are different. No noise added.
    version: 1
    author: lmatthey@google.com
    title: dSprites dataset
=================
Name: latents
Data: <HDF5 group "/latents" (2 members)>
Attributes:
    names: ['color' 'shape' 'scale' 'orientation' 'posX' 'posY']
    possible_values_orientation: [0.         0.16110732 0.32221463 0.48332195 0.64442926 0.80553658
 0.96664389 1.12775121 1.28885852 1.44996584 1.61107316 1.77218047
 1.93328779 2.0943951  2.25550242 2.41660973 2.57771705 2.73882436
 2.89993168 3.061039   3.22214631 3.38325363 3.54436094 3.70546826
 3.86657557 4.02768289 4.1887902  4.34989752 4.51100484 4.67211215
 4.83321947 4.99432678 5.1554341  5.31654141 5.47764873 5.63875604
 5.79986336 5.96097068 6.12207799 6.28318531]
    possible_values_posX: [0.         0.03225806 0.06451613 0.09677419 0.12903226 0.16129032
 0.19354839 0.22580645 0.25806452 0.29032258 0.32258065 0.35483871
 0.38709677 0.41935484 0.4516129  0.48387097 0.51612903 0.5483871
 0.58064516 0.61290323 0.64516129 0.67741935 0.70967742 0.74193548
 0.77419355 0.80645161 0.83870968 0.87096774 0.90322581 0.93548387
 0.96774194 1.        ]
    possible_values_posY: [0.         0.03225806 0.06451613 0.09677419 0.12903226 0.16129032
 0.19354839 0.22580645 0.25806452 0.29032258 0.32258065 0.35483871
 0.38709677 0.41935484 0.4516129  0.48387097 0.51612903 0.5483871
 0.58064516 0.61290323 0.64516129 0.67741935 0.70967742 0.74193548
 0.77419355 0.80645161 0.83870968 0.87096774 0.90322581 0.93548387
 0.96774194 1.        ]
    possible_values_scale: [0.5 0.6 0.7 0.8 0.9 1. ]
    possible_values_shape: [1. 2. 3.]
    possible_values_color: [1.]
    sizes: [ 1  3  6 40 32 32]
=================
Name: latents/classes
Data: <HDF5 dataset "classes": shape (737280, 6), type "<i8">
Attributes:
=================
Name: latents/values
Data: <HDF5 dataset "values": shape (737280, 6), type "<f8">
Attributes:

alexrakowski commented 4 years ago

Thanks for the quick answer! So the field I was missing was latents_sizes. I assume it is the same as h5_file['latents'].attrs['sizes'].

I need it for the same purpose as this application: https://github.com/google-research/disentanglement_lib/blob/a070670e6589ed45c0b0bd95ea8913e7067eb3ae/disentanglement_lib/data/ground_truth/dsprites.py#L64

Azhag commented 4 years ago

Yes exactly, this is the latents_sizes array :)
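
For readers who land here with the same question, below is a minimal sketch of what the sizes array is typically used for: converting between a tuple of latent classes and a row index into imgs. It assumes the images are stored in row-major order over the latents, with posY varying fastest, which matches the dataset description's note that the ordering is fixed and can be mapped back to the latent values. The function names (latent_bases, latent_to_index, index_to_latent) are hypothetical and not part of the dataset or of h5py. The sizes are hard-coded from the output printed in this thread so that the block runs without the file.

```python
# Sizes copied from the 'sizes' attribute printed earlier in this thread,
# in the order: color, shape, scale, orientation, posX, posY.
LATENT_SIZES = [1, 3, 6, 40, 32, 32]


def latent_bases(sizes):
    """Stride of each latent in the flat image index (last latent varies fastest).

    Assumption: images are laid out in row-major order over the latents.
    """
    bases = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        bases[i] = bases[i + 1] * sizes[i + 1]
    return bases


def latent_to_index(classes, sizes=LATENT_SIZES):
    """Map one class index per latent to a row index into 'imgs'."""
    for c, s in zip(classes, sizes):
        if not 0 <= c < s:
            raise ValueError("class {} out of range for size {}".format(c, s))
    return sum(c * b for c, b in zip(classes, latent_bases(sizes)))


def index_to_latent(index, sizes=LATENT_SIZES):
    """Inverse of latent_to_index: recover the class of each latent."""
    return [(index // b) % s for b, s in zip(latent_bases(sizes), sizes)]


# Second shape, every other latent at its first value.
print(latent_to_index([0, 1, 0, 0, 0, 0]))

# With the real file, the sizes would come from the attribute discussed above,
# and a single image can be read without loading the whole dataset
# (left as a comment because it needs h5py and the dataset on disk):
#
#   import h5py
#   with h5py.File('dsprites_ndarray_co1sh3sc6or40x32y32_64x64.hdf5', 'r') as f:
#       sizes = [int(s) for s in f['latents'].attrs['sizes']]
#       idx = latent_to_index([0, 1, 0, 0, 0, 0], sizes)
#       image = f['imgs'][idx]  # reads one 64x64 image from disk
```

Because h5py reads only the slice that is requested, indexing imgs this way avoids the memory problem described at the top of the thread.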