Cerenaut / sparse-unsupervised-capsules

Sparse unsupervised capsules
https://arxiv.org/abs/1804.06094
Apache License 2.0

Data loss: not an sstable (bad magic number) when running classification on google colab #2

Closed orbennatan closed 5 years ago

orbennatan commented 5 years ago

I copied the whole project to my Google Drive, mounted the drive on Google Colab and ran the program according to the README file. Training went fine and produced 30000 checkpoint files, so I assume it went OK with the following cell:

!python3 experiment.py --data_dir="/content/drive/My Drive/Colab Notebooks/MNISTForColab" --summary_dir="/content/drive/My Drive/Colab Notebooks/MNISTForColab" --max_steps=30000 --dataset=mnist --batch_size=128 --shift=2

Next I ran the following command:

!python3 experiment.py --data_dir="/content/drive/My Drive/Colab Notebooks/MNISTForColab" --train=False --checkpoint="/content/drive/My Drive/Colab Notebooks/MNISTForColab/summary_190106/190106-1413/train/model.ckpt-30000.data-00000-of-00001" --summary_dir="/content/drive/My Drive/Colab Notebooks/MNISTForColab" --eval_set=test --eval_size=80000 --eval_shard=0

And received the following error:

2019-01-07 15:59:26.726068: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /content/drive/My Drive/Colab Notebooks/MNISTForColab/summary_190106/190106-1413/train/model.ckpt-30000.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2019-01-07 15:59:26.730417: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open /content/drive/My Drive/Colab Notebooks/MNISTForColab/summary_190106/190106-1413/train/model.ckpt-30000.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2019-01-07 15:59:26.730528: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at save_restore_tensor.cc:175 : Data loss: Unable to open table file /content/drive/My Drive/Colab Notebooks/MNISTForColab/summary_190106/190106-1413/train/model.ckpt-30000.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

It would be great if you could shed some light on the source of the problem. I would love to contribute the notebooks later for public use. You may send replies directly to or.bennatan@gmail.com. Thank you.

abdel commented 5 years ago

@orbennatan You want to pass the checkpoint name by itself, i.e. --checkpoint=/path/to/model.ckpt-30000, rather than --checkpoint=/path/to/model.ckpt-30000.data-00000-of-00001. TensorFlow automatically picks up the data, index and meta checkpoint files tied to that name.
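If the path was copied from a file listing, the per-file suffix can be stripped programmatically. The following is only a sketch: checkpoint_prefix is a hypothetical helper, not part of this repository or of TensorFlow, and it assumes the usual .data-NNNNN-of-NNNNN, .index and .meta suffixes.

```python
import re

# Hypothetical helper (not from the repository or TensorFlow).
# Assumes checkpoint files end in .data-NNNNN-of-NNNNN, .index or .meta.
_CKPT_FILE_SUFFIX = re.compile(r"\.(data-\d{5}-of-\d{5}|index|meta)$")


def checkpoint_prefix(path: str) -> str:
    """Return the checkpoint name shared by the data, index and meta files."""
    return _CKPT_FILE_SUFFIX.sub("", path)


# Example with the placeholder path from the reply above.
print(checkpoint_prefix("/path/to/model.ckpt-30000.data-00000-of-00001"))
# prints /path/to/model.ckpt-30000
```

The returned string is what would be passed to --checkpoint; a path that is already a bare checkpoint name comes back unchanged.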

orbennatan commented 5 years ago

I tried the suggestion and it fixed this particular problem. The run still does not complete, but I will open another issue if necessary. Thank you so much for the quick response.