tensorflow / models

Models and examples built with TensorFlow

Data loss: not an sstable (bad magic number) #2675

Closed wpq3142 closed 6 years ago

wpq3142 commented 6 years ago

tensorflow.python.framework.errors_impl.DataLossError: Unable to open table file /home/wpq/data/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

scotthuang1989 commented 6 years ago

I got the same error when running the object detection API.

wpq3142 commented 6 years ago

It seems that the downloaded pre-trained checkpoint file does not match the model.

wpq3142 commented 6 years ago

The file format is inconsistent. See this post: http://votec.top/2016/12/24/tensorflow-r12-tf-train-Saver/

Change slim.get_or_create_global_step() to tf.train.get_or_create_global_step()
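
For context, here is a minimal TF 1.x sketch of that change; the toy variable, loss, and optimizer below are only illustrative, not the original poster's training code:

import tensorflow as tf

# Before (tf.contrib.slim helper):
#   import tensorflow.contrib.slim as slim
#   global_step = slim.get_or_create_global_step()
# After (core TF 1.x API):
global_step = tf.train.get_or_create_global_step()

# Toy model so the snippet is self-contained.
x = tf.Variable(0.0)
loss = tf.square(x - 3.0)
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss, global_step=global_step)

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)
    # Writes model.ckpt-1.data-00000-of-00001, .index and .meta files.
    saver.save(sess, "/tmp/model.ckpt", global_step=global_step)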

asimshankar commented 6 years ago

I apologize but I am having a hard time understanding what the problem is, where the problem is, and what version it affects. Please resubmit and pay attention to the issue template (https://github.com/tensorflow/tensorflow/issues/new) . Please provide all the information it asks. Thank you.

Particularly, telling us exactly what you did will help. There are many models in this repository, so I'm not quite sure which one you're talking about. Though from @scotthuang1989's comment it appears it's the object detection API?

Could you elaborate on the sequence of steps that reproduces the problem?

tombstone commented 6 years ago

duplicate of https://github.com/tensorflow/models/issues/2676

watchpoints commented 6 years ago

Exporting a trained model for inference

After your model has been trained, you should export it to a TensorFlow graph proto. A checkpoint will typically consist of three files:

model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001
model.ckpt-${CHECKPOINT_NUMBER}.index
model.ckpt-${CHECKPOINT_NUMBER}.meta

python object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path /app/tf_object_detection_api/config/faster_rcnn_inception_v2_pets.config \
    --trained_checkpoint_prefix /app/tf_object_detection_api/models/model.ckpt-306 \
    --output_directory /app/tf_object_detection_api/models/faster_rcnn_inception_v2_pets
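
If it helps, here is a rough sketch of loading the exported graph afterwards; it assumes the exporter wrote frozen_inference_graph.pb into the output directory above (adjust the path to your own setup):

import tensorflow as tf

# Path assumed from the export command above; adjust to your own output_directory.
PB_PATH = "/app/tf_object_detection_api/models/faster_rcnn_inception_v2_pets/frozen_inference_graph.pb"

graph_def = tf.GraphDef()
with tf.gfile.GFile(PB_PATH, "rb") as f:
    graph_def.ParseFromString(f.read())

detection_graph = tf.Graph()
with detection_graph.as_default():
    tf.import_graph_def(graph_def, name="")
# detection_graph now holds the frozen model, ready to run in a tf.Session.
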
chenjun2hao commented 5 years ago

@aleafboat, thanks for your advice, it worked for me. My original command was:

python export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path ./dataset_tools/voc2007/models/ssd_mobilenet_v1/ssd_mobilenet_v1_coco.config \
    --trained_checkpoint_prefix ./dataset_tools/voc2007/output/model.ckpt-5060* \
    --output_directory ./dataset_tools/voc2007/pb_model/

I just removed the '*' at the end of 5060.

shellyfung commented 5 years ago

I fixed the issue like this: replace model.ckpt with model.ckpt-200000, where 200000 is your checkpoint number.

codexponent commented 5 years ago

Solved in #7696

jpgochile commented 5 years ago

Hi (Apr 2019), I hit the same error, but the cause was an aborted training process. I re-trained the model and then tried again with:

$ cd bert-master/bert_output
$ python ./run_classifier.py \
    --task_name=cola \
    --do_train=true \
    --do_eval=true \
    --data_dir=./data \
    --vocab_file=$BERT_BASE_DIR/vocab.txt \
    --bert_config_file=$BERT_BASE_DIR/bert_config.json \
    --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
    --max_seq_length=128 \
    --train_batch_size=32 \
    --learning_rate=2e-5 \
    --num_train_epochs=1.0 \
    --output_dir=./bert_output/ \
    --do_lower_case=False

Rajamohanreddyai commented 5 years ago

Hello all, just follow the video below to export your own model in about 10 seconds:

https://youtu.be/w0Ebsbz7HYA

chjose commented 5 years ago

Got the same error while importing the model. Fixed it by providing the right path.

The V2 Saver from TF generates three files:

model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001
model.ckpt-${CHECKPOINT_NUMBER}.index
model.ckpt-${CHECKPOINT_NUMBER}.meta

When restoring, give the path only up to CHECKPOINT_NUMBER, like below:

saver.restore(sess, "model.ckpt-${CHECKPOINT_NUMBER}")

Hope that makes sense.
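
For example, a minimal TF 1.x sketch of that restore call (the variable, path, and checkpoint number are placeholders):

import tensorflow as tf

# Rebuild the same graph that produced the checkpoint first; a single
# placeholder variable is shown here just to keep the sketch self-contained.
x = tf.get_variable("x", shape=[])

saver = tf.train.Saver()
with tf.Session() as sess:
    # Pass the checkpoint prefix, not the .data-00000-of-00001 file itself.
    saver.restore(sess, "/path/to/model.ckpt-200000")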

pinzhi000 commented 2 years ago

I was able to resolve this issue by saving a .h5 file directly

Link: https://www.tensorflow.org/guide/keras/save_and_serialize#:~:text=The%20recommended%20format%20is%20SavedModel,'h5'%20to%20save()%20.
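
In case it helps anyone else, a minimal Keras sketch of that approach (the tiny model and the file name are placeholders):

import tensorflow as tf

# Tiny placeholder model, just to show the save/load round trip.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# Save in the Keras HDF5 format instead of a raw checkpoint prefix...
model.save("model.h5")

# ...and load it back without touching tf.train.Saver at all.
restored = tf.keras.models.load_model("model.h5")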

charles-xyz commented 7 months ago

If you are trying to do this in 2024 and it doesn't work, try this: root-path-to-model-checkpoint-storage-here/ckpt-n, where n is the checkpoint number you want. Works for me!
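
For the TF2-style ckpt-n checkpoints, a rough sketch of what that looks like in code (the paths, the checkpoint number, and the tiny layer being restored are placeholders):

import tensorflow as tf

# Placeholder object to restore into; in practice this would be your model.
net = tf.keras.layers.Dense(1)

ckpt = tf.train.Checkpoint(net=net)
# Point at the prefix "ckpt-n" (e.g. ckpt-12), not at an individual
# .data-00000-of-00001 or .index file.
ckpt.restore("/root/path/to/model/checkpoints/ckpt-12").expect_partial()

# Or let TF pick the newest checkpoint in that directory:
latest = tf.train.latest_checkpoint("/root/path/to/model/checkpoints")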