google-deepmind / dsprites-dataset

Dataset to assess the disentanglement properties of unsupervised learning methods
Apache License 2.0

File Damage, dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz #5

Closed supershiye closed 3 years ago

supershiye commented 3 years ago

I was running the example code and hit an error at:

# Load dataset
dataset_zip = np.load('dsprites_ndarray_co1sh3sc6or40x32y32_64x64.npz')

print('Keys in the dataset:', dataset_zip.keys())
imgs = dataset_zip['imgs']
latents_values = dataset_zip['latents_values']
latents_classes = dataset_zip['latents_classes']
metadata = dataset_zip['metadata'][()]

print('Metadata: \n', metadata)
Keys in the dataset: KeysView(<numpy.lib.npyio.NpzFile object at 0x000002275653D550>)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-6aadbe2135e4> in <module>
      6 latents_values = dataset_zip['latents_values']
      7 latents_classes = dataset_zip['latents_classes']
----> 8 metadata = dataset_zip['metadata'][()]
      9 
     10 print('Metadata: \n', metadata)

~\Anaconda3\envs\PyTorch\lib\site-packages\numpy\lib\npyio.py in __getitem__(self, key)
    258             if magic == format.MAGIC_PREFIX:
    259                 bytes = self.zip.open(key)
--> 260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
    262                                          pickle_kwargs=self.pickle_kwargs)

~\Anaconda3\envs\PyTorch\lib\site-packages\numpy\lib\format.py in read_array(fp, allow_pickle, pickle_kwargs)
    737         # The array contained Python objects. We need to unpickle the data.
    738         if not allow_pickle:
--> 739             raise ValueError("Object arrays cannot be loaded when "
    740                              "allow_pickle=False")
    741         if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False

After adding `allow_pickle=True` to `np.load`, the error changed to:

Keys in the dataset: KeysView(<numpy.lib.npyio.NpzFile object at 0x000001B37C0D8C10>)
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
~\Anaconda3\envs\PyTorch\lib\site-packages\numpy\lib\format.py in read_array(fp, allow_pickle, pickle_kwargs)
    743         try:
--> 744             array = pickle.load(fp, **pickle_kwargs)
    745         except UnicodeError as err:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position 9: ordinal not in range(128)

During handling of the above exception, another exception occurred:

UnicodeError                              Traceback (most recent call last)
<ipython-input-4-c9307969a90f> in <module>
      6 latents_values = dataset_zip['latents_values']
      7 latents_classes = dataset_zip['latents_classes']
----> 8 metadata = dataset_zip['metadata'][()]
      9 
     10 print('Metadata: \n', metadata)

~\Anaconda3\envs\PyTorch\lib\site-packages\numpy\lib\npyio.py in __getitem__(self, key)
    258             if magic == format.MAGIC_PREFIX:
    259                 bytes = self.zip.open(key)
--> 260                 return format.read_array(bytes,
    261                                          allow_pickle=self.allow_pickle,
    262                                          pickle_kwargs=self.pickle_kwargs)

~\Anaconda3\envs\PyTorch\lib\site-packages\numpy\lib\format.py in read_array(fp, allow_pickle, pickle_kwargs)
    746             if sys.version_info[0] >= 3:
    747                 # Friendlier error message
--> 748                 raise UnicodeError("Unpickling a python object failed: %r\n"
    749                                    "You may need to pass the encoding= option "
    750                                    "to numpy.load" % (err,))

UnicodeError: Unpickling a python object failed: UnicodeDecodeError('ascii', b'\x00\x00\x00\x00\x00\x00\x00\x00\x1a\xc9\xc1\x1d*\x9f\xc4?\x1a\xc9\xc1\x1d*\x9f\xd4?\xa7\xad\xa2,\xbf\xee\xde?\x1a\xc9\xc1\x1d*\x9f\xe4?a;2\xa5\xf4\xc6\xe9?\xa7\xad\xa2,\xbf\xee\xee?\xf7\x8f\t\xdaD\x0b\xf2?\x1a\xc9\xc1\x1d*\x9f\xf4?>\x02za\x0f3\xf7?a;2\xa5\xf4\xc6\xf9?\x83t\xea\xe8\xd9Z\xfc?\xa7\xad\xa2,\xbf\xee\xfe?fs-8R\xc1\x00@\xf7\x8f\t\xdaD\x0b\x02@\x88\xac\xe5{7U\x03@\x1a\xc9\xc1\x1d*\x9f\x04@\xac\xe5\x9d\xbf\x1c\xe9\x05@>\x02za\x0f3\x07@\xcf\x1eV\x03\x02}\x08@a;2\xa5\xf4\xc6\t@\xf3W\x0eG\xe7\x10\x0b@\x83t\xea\xe8\xd9Z\x0c@\x15\x91\xc6\x8a\xcc\xa4\r@\xa7\xad\xa2,\xbf\xee\x0e@\x1de?\xe7X\x1c\x10@fs-8R\xc1\x10@\xae\x81\x1b\x89Kf\x11@\xf7\x8f\t\xdaD\x0b\x12@@\x9e\xf7*>\xb0\x12@\x88\xac\xe5{7U\x13@\xd1\xba\xd3\xcc0\xfa\x13@\x1a\xc9\xc1\x1d*\x9f\x14@c\xd7\xafn#D\x15@\xac\xe5\x9d\xbf\x1c\xe9\x15@\xf5\xf3\x8b\x10\x16\x8e\x16@>\x02za\x0f3\x17@\x87\x10h\xb2\x08\xd8\x17@\xcf\x1eV\x03\x02}\x18@\x18-DT\xfb!\x19@', 9, 10, 'ordinal not in range(128)')
You may need to pass the encoding= option to numpy.load
Azhag commented 3 years ago

Thanks for the report! It's been a long while since I updated this, so I'll check how to repack it for Python 3.

Azhag commented 3 years ago

As mentioned in #3 , you might have better luck with dataset_zip = np.load(file_path, encoding='latin1')

supershiye commented 3 years ago

Thank you for your quick response. I tried adding both `allow_pickle` and `encoding`, and it went through.
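For anyone landing here later, a minimal self-contained sketch of the combined fix. The demo archive and its keys below are stand-ins built on the fly (not the real dsprites file), but the loading pattern is the one that resolved both errors: `allow_pickle=True` permits the pickled object array, and `encoding='latin1'` handles pickles written under Python 2.

```python
import numpy as np

# Recreate a miniature analogue of the dsprites archive: plain arrays plus
# a pickled object array like its 'metadata' entry (contents are hypothetical).
np.savez('demo_dsprites.npz',
         imgs=np.zeros((3, 64, 64), dtype=np.uint8),
         metadata=np.array({'latents_names': ('color', 'shape')}, dtype=object))

# Both options together: allow_pickle to permit object arrays,
# encoding='latin1' to decode Python 2 pickles under Python 3.
dataset_zip = np.load('demo_dsprites.npz', allow_pickle=True, encoding='latin1')
metadata = dataset_zip['metadata'][()]  # [()] unwraps the 0-d object array
print(metadata['latents_names'])        # ('color', 'shape')
```

The `[()]` indexing is needed because `np.savez` stores the dict as a 0-dimensional object array; indexing with an empty tuple extracts the original Python object.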