minerva-ml / minerva-training-materials

Learn advanced data science on real-life, curated problems
https://neptune.ml/minerva
MIT License

strange things during dryrun on fashion_mnist #7

Closed buus2 closed 6 years ago

buus2 commented 6 years ago

When I run

python run_minerva.py -- dry_run --problem fashion_mnist

I receive

2018-01-09 16-19-22 minerva-whales >>> starting experiment...
Using TensorFlow backend.
2018-01-09 16-19-23 minerva-whales >>> running: None
neptune: Executing in Offline Mode.
2018-01-09 16-19-23 minerva-whales >>> Saving graph to /mnt/ml-team/minerva/cache/whales/new_experiment/alignment/class_predictions_graph.json
2018-01-09 16-19-24 minerva-whales >>> step input unpacking inputs
2018-01-09 16-19-24 minerva-whales >>> step input saving transformer...
2018-01-09 16-19-24 minerva-whales >>> step input saving outputs...
2018-01-09 16-19-24 minerva-whales >>> step keras_model unpacking inputs
Epoch 1/200
2018-01-09 16:19:25.148275: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148334: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148367: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148394: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-09 16:19:25.148421: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
46/47 [============================>.] - ETA: 1s - loss: 0.3966 - acc: 0.9724/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py:494: RuntimeWarning: Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,acc
  (self.monitor, ','.join(list(logs.keys()))), RuntimeWarning
Traceback (most recent call last):
  File "run_minerva.py", line 46, in <module>
    action()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "run_minerva.py", line 27, in dry_run
    pm.dry_run(sub_problem, train_mode, dev_mode, cloud_mode)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 16, in dry_run
    _evaluate(trainer)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/problem_manager.py", line 39, in _evaluate
    score_valid, score_test = trainer.evaluate()
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/trainer.py", line 22, in evaluate
    score_valid = self._evaluate(X_valid, y_valid)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/fashion_mnist/trainer.py", line 29, in _evaluate
    'inference': True}})
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 102, in transform
    step_inputs[input_step.name] = input_step.fit_transform(data)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 74, in fit_transform
    step_output_data = self._cached_fit_transform(step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 84, in _cached_fit_transform
    step_output_data = self.transformer.fit_transform(**step_inputs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/base.py", line 206, in fit_transform
    self.fit(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/models_keras.py", line 28, in fit
    **self.training_config)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/engine/training.py", line 2187, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/keras/callbacks.py", line 73, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva/minerva/backend/models/keras/callbacks_keras.py", line 21, in on_epoch_end
    self.ctx.channel_send('Log-loss validation', self.epoch_id, logs['val_loss'])
KeyError: 'val_loss'
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7fc4dd75bdd8>>
Traceback (most recent call last):
  File "/home/patryk/Documents/edukacyjne/Minerva/0401/minerva_venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 595, in __del__
TypeError: 'NoneType' object is not callable

Three things are strange here:

  1. The first line says minerva-whales, although I am running the fashion_mnist problem.
  2. It seems to perform training even though I don't pass --train_mode, and train mode is off by default.
  3. There is an error about val_loss, which is unavailable.

jakubczakon commented 6 years ago

Thanks for pointing that out. In fashion_mnist there is no saved model, so it has to train one anyway.

I will fix the rest right away.
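For readers wondering why training starts with train mode off, the fallback described above can be sketched roughly as follows. This is only an illustration, not the repository's code: the `needs_training` helper and the `keras_model` file name are hypothetical.

```python
import os
import tempfile


def needs_training(train_mode, model_path):
    """Return True when training was requested or no saved model exists."""
    return train_mode or not os.path.exists(model_path)


with tempfile.TemporaryDirectory() as cache_dir:
    # Hypothetical location of the cached model inside the experiment directory.
    model_path = os.path.join(cache_dir, 'keras_model')

    # Train mode is off, but nothing has been saved yet, so training is needed.
    before_save = needs_training(False, model_path)

    # Simulate a saved model by creating an empty file at that path.
    open(model_path, 'w').close()
    after_save = needs_training(False, model_path)

print(before_save, after_save)
```

With a check like this, a dry run only skips training once a model has been saved to the cache.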

jakubczakon commented 6 years ago

It should be solved now.
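For anyone hitting the same KeyError: the traceback above ends at `logs['val_loss']`, and the Keras warning lists only `loss,acc` as available metrics, which is what you would expect when no validation data reaches the fit call. A minimal sketch of a guard is below. It is not necessarily the fix that was applied in the repository; `RecordingContext`, `SafeMonitor`, and the 'Log-loss training' channel name are hypothetical, and only the `channel_send(name, x, y)` call shape and the 'Log-loss validation' channel name are taken from the traceback. It is written without a Keras dependency so it runs on its own.

```python
class RecordingContext:
    """Hypothetical stand-in for the Neptune context; it only records calls."""

    def __init__(self):
        self.sent = []

    def channel_send(self, name, x, y):
        self.sent.append((name, x, y))


class SafeMonitor:
    """Epoch-end hook that skips metrics missing from `logs`."""

    # Channel name -> key in the logs dict passed to on_epoch_end.
    CHANNELS = {
        'Log-loss training': 'loss',        # hypothetical channel name
        'Log-loss validation': 'val_loss',  # channel name from the traceback
    }

    def __init__(self, ctx):
        self.ctx = ctx
        self.epoch_id = 0

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        self.epoch_id += 1
        for channel, key in self.CHANNELS.items():
            # dict.get returns None instead of raising KeyError.
            value = logs.get(key)
            if value is not None:
                self.ctx.channel_send(channel, self.epoch_id, value)


ctx = RecordingContext()
monitor = SafeMonitor(ctx)

# Same metrics as in the log above: no val_loss, so only the training loss is sent.
monitor.on_epoch_end(0, {'loss': 0.3966, 'acc': 0.9724})
print(ctx.sent)
```

In a real callback the same `on_epoch_end` body would live in a subclass of the Keras callback base class.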