lmcinnes / umap

Uniform Manifold Approximation and Projection
BSD 3-Clause "New" or "Revised" License

ValueError: No data provided for "umap". Need data for each key in: ['reconstruction', 'umap'] #610

Open maths-qureshi opened 3 years ago

maths-qureshi commented 3 years ago

When running the default notebook, i.e. 03.0-parametric-umap-mnist-embedding-convnet-with-reconstruction.ipynb, I get the error below when training, i.e. when calling fit_transform. Training with only the encoder works, though. Here is the output:

WARNING:tensorflow:AutoGraph could not transform <function umap_loss.<locals>.loss at 0x000002862FEAF620> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: Cell is empty

KeyError                                  Traceback (most recent call last)
c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    505             if data[x].__class__.__name__ == 'DataFrame' else data[x]
--> 506             for x in names
    507         ]

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py in <listcomp>(.0)
    505             if data[x].__class__.__name__ == 'DataFrame' else data[x]
--> 506             for x in names
    507         ]

KeyError: 'umap'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-…> in <module>
----> 1 embedding = embedder.fit_transform(train_images)

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\umap\umap_.py in fit_transform(self, X, y)
   2632             Local radii of data points in the embedding (log-transformed).
   2633         """
-> 2634         self.fit(X, y)
   2635         if self.transform_mode == "embedding":
   2636             if self.output_dens:

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\umap\umap_.py in fit(self, X, y)
   2552         if self.transform_mode == "embedding":
   2553             self.embedding_, aux_data = self._fit_embed_data(
-> 2554                 self._raw_data[index], n_epochs, init, random_state,  # JH why raw data?
   2555             )
   2556         # Assign any points that are fully disconnected from our manifold(s) to have embedding

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\umap\parametric_umap.py in _fit_embed_data(self, X, n_epochs, init, random_state)
    362             max_queue_size=100,
    363             validation_data=validation_data,
--> 364             **self.keras_fit_kwargs
    365         )
    366         # save loss history dictionary

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    817         max_queue_size=max_queue_size,
    818         workers=workers,
--> 819         use_multiprocessing=use_multiprocessing)
    820
    821   def evaluate(self,

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in fit(self, model, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
    233         max_queue_size=max_queue_size,
    234         workers=workers,
--> 235         use_multiprocessing=use_multiprocessing)
    236
    237       total_samples = _get_total_number_of_samples(training_data_adapter)

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _process_training_inputs(model, x, y, batch_size, epochs, sample_weights, class_weights, steps_per_epoch, validation_split, validation_data, validation_steps, shuffle, distribution_strategy, max_queue_size, workers, use_multiprocessing)
    612             class_weights=class_weights,
    613             steps=validation_steps,
--> 614             distribution_strategy=distribution_strategy)
    615         elif validation_steps:
    616           raise ValueError('`validation_steps` should not be specified if '

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py in _process_inputs(model, mode, x, y, batch_size, epochs, sample_weights, class_weights, shuffle, steps, distribution_strategy, max_queue_size, workers, use_multiprocessing)
    644     standardize_function = None
    645     x, y, sample_weights = standardize(
--> 646         x, y, sample_weight=sample_weights)
    647   elif adapter_cls is data_adapter.ListsOfScalarsDataAdapter:
    648     standardize_function = standardize

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training.py in _standardize_user_data(self, x, y, sample_weight, class_weight, batch_size, check_steps, steps_name, steps, validation_split, shuffle, extract_tensors_from_dataset)
   2381         is_dataset=is_dataset,
   2382         class_weight=class_weight,
-> 2383         batch_size=batch_size)
   2384
   2385   def _standardize_tensors(self, x, y, sample_weight, run_eagerly, dict_inputs,

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training.py in _standardize_tensors(self, x, y, sample_weight, run_eagerly, dict_inputs, is_dataset, class_weight, batch_size)
   2467         shapes=None,
   2468         check_batch_axis=False,  # Don't enforce the batch size.
-> 2469         exception_prefix='target')
   2470
   2471     # Generate sample-wise weight values given the `sample_weight` and

c:\programdata\anaconda3\envs\tf2gpu\lib\site-packages\tensorflow_core\python\keras\engine\training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
    508       except KeyError as e:
    509         raise ValueError('No data provided for "' + e.args[0] + '". Need data '
--> 510                          'for each key in: ' + str(names))
    511     elif isinstance(data, (list, tuple)):
    512       if isinstance(data[0], (list, tuple)):

ValueError: No data provided for "umap". Need data for each key in: ['reconstruction', 'umap']

The next cell from the notebook, which I was trying to reach, is:

    # plot reconstructions
    test_images_recon = embedder.inverse_transform(embedder.transform(test_images))
    import numpy as np
    nex = 10
    fig, axs = plt.subplots(ncols=10, nrows=2, figsize=(nex, 2))
    for i in range(nex):
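For context, here is a minimal sketch of the kind of setup that hits this for me. It is not the notebook's exact code: the layer sizes, the random stand-in data, and the variable names below are placeholders of mine.

    import numpy as np
    import tensorflow as tf
    from umap.parametric_umap import ParametricUMAP

    # stand-in data with MNIST-like shapes (placeholders, not the real dataset)
    dims = (28, 28, 1)
    n_components = 2
    train_images = np.random.rand(1024, *dims).astype("float32")
    test_images = np.random.rand(256, *dims).astype("float32")

    # simple encoder/decoder pair (placeholder architecture)
    encoder = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=dims),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(n_components),
    ])
    decoder = tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=(n_components,)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(int(np.prod(dims)), activation="sigmoid"),
        tf.keras.layers.Reshape(dims),
    ])

    embedder = ParametricUMAP(
        encoder=encoder,
        decoder=decoder,
        dims=dims,
        parametric_reconstruction=True,
        reconstruction_validation=test_images,
        verbose=True,
    )

    # training with only the encoder (no decoder / reconstruction options) works;
    # with reconstruction enabled, this call raises the ValueError above
    embedding = embedder.fit_transform(train_images)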
BEpresent commented 3 years ago

Same problem here on Linux - the default examples don't work.

WARNING:tensorflow:AutoGraph could not transform <function umap_loss.<locals>.loss at 0x7f8fcc547ef0> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: Cell is empty

It seems that a 'umap' dictionary entry is missing:

ValueError: No data provided for "umap". Need data for each key in: ['reconstruction', 'umap']

In parametric_umap.py there is a block that builds the validation data where a 'reconstruction' entry is created but no 'umap' entry. Could this be related?

    validation_data = (
        (
            self.reconstruction_validation,
            tf.zeros_like(self.reconstruction_validation),
        ),
        {"reconstruction": self.reconstruction_validation},
    )
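For comparison, here is a rough sketch (my own, not the library's actual fix) of how that validation data could be given a target for every named model output. The helper name, the dummy 'umap' target, and its shape are all assumptions on my part.

    import tensorflow as tf

    def build_validation_data(validation_X, n_components=2):
        # hypothetical helper, not part of parametric_umap.py: pair the
        # validation inputs with a target for every named output head
        dummy_umap_target = tf.zeros((validation_X.shape[0], n_components))
        return (
            (validation_X, tf.zeros_like(validation_X)),
            {
                "reconstruction": validation_X,
                # dummy target for the "umap" head; assuming the UMAP loss does
                # not read y_true, zeros would only be there to satisfy Keras's
                # "need data for each key" check
                "umap": dummy_umap_target,
            },
        )

Whether zeros of that shape are what the model's "umap" head actually expects is something a maintainer would need to confirm.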