GeorgeSeif / Semantic-Segmentation-Suite

Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!
2.5k stars 880 forks

Can't load checkpoint files #225

Open skywo1f opened 4 years ago

skywo1f commented 4 years ago

I tried all of the files in the checkpoints folder (model.ckpt.index, model.ckpt.meta, checkpoint, model.ckpt.data-00000-of-00001). None of them work:

Semantic-Segmentation-Suite/checkpoints/0295/model.ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator

sweetboxwwy commented 4 years ago

Have you solved this problem?

AI-ML-Enthusiast commented 4 years ago

@skywo1f @GeorgeSeif Same problem here. Can anyone suggest a fix, please? I trained on my PC and training works fine, but I cannot load the checkpoint file.

millermuttu commented 4 years ago

Same issue:

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key BatchNorm/beta not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]
[[Node: save/RestoreV2/_79 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_84_save/RestoreV2", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

FZY2019 commented 4 years ago

Has the problem been solved?

skywo1f commented 4 years ago

No.

Imperssonator commented 4 years ago

Had the same issue. In your case, you should just pass model.ckpt; see https://stackoverflow.com/questions/33759623/tensorflow-how-to-save-restore-a-model

Also, if you're not using CamVid, make sure to pass something to --dataset, otherwise it will default to the 32 class labels from the CamVid dataset. Hope that helps.
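If the restore still fails with a "Key ... not found in checkpoint" error like the one posted earlier in this thread, the message itself points at a graph that differs from the one that was saved. A rough way to see which names are affected is to diff the two name lists. This is only a sketch: `missing_from_checkpoint` is a made-up helper, not part of the suite, and the variable names in the example are invented for illustration. In TensorFlow 1.x the real lists could come from `tf.global_variables()` and `tf.train.list_variables(prefix)`, as noted in the comments.

```python
def missing_from_checkpoint(graph_names, checkpoint_names):
    """Return graph variable names that have no entry in the checkpoint.

    Hypothetical helper for illustration; not part of the suite.
    """
    return sorted(set(graph_names) - set(checkpoint_names))


# In TensorFlow 1.x the two lists could be collected roughly like this
# (left as comments so the sketch runs without TensorFlow installed):
#   graph_names = [v.op.name for v in tf.global_variables()]
#   checkpoint_names = [name for name, shape in tf.train.list_variables(prefix)]

# Invented example names, only to show the shape of the output.
graph_names = ["BatchNorm/beta", "BatchNorm/gamma", "logits/weights"]
checkpoint_names = ["encoder/BatchNorm/beta", "encoder/BatchNorm/gamma",
                    "logits/weights"]

for name in missing_from_checkpoint(graph_names, checkpoint_names):
    print("not in checkpoint:", name)
```

If every missing name differs from a checkpoint name only by a scope prefix, the graph was probably built with different settings than the ones used for training.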

wy9884255 commented 4 years ago

Try passing the path in this form: disk:/your_folder/model.ckpt

Harikrishnan24 commented 4 years ago

Has this problem been solved?

mtylerpreston commented 4 years ago

I think I have the solution: just use model.ckpt, even if no such file exists.

I had the same struggle when trying to pass model.ckpt.meta etc. to resume training. Even though no file with that exact name exists in the directory where the checkpoint was saved, I just used model.ckpt and it worked.
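For anyone landing here later, a minimal sketch of why this works: the saver writes several files that share a common prefix (here model.ckpt), and restoring expects that prefix rather than any single file. The helper below strips the known suffixes so the result can be passed as the checkpoint path. The name `checkpoint_prefix` is made up for illustration and is not part of the suite.

```python
import re

# Suffixes that appear after the shared prefix in a checkpoint file set,
# e.g. model.ckpt.index, model.ckpt.meta, model.ckpt.data-00000-of-00001.
_CKPT_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefix(path):
    """Map any file of a checkpoint set to the shared prefix.

    Hypothetical helper for illustration; not part of the suite.
    Paths without a known suffix are returned unchanged.
    """
    return _CKPT_SUFFIX.sub("", path)


for name in ["model.ckpt.index", "model.ckpt.meta",
             "model.ckpt.data-00000-of-00001", "model.ckpt"]:
    print(name, "->", checkpoint_prefix("checkpoints/0295/" + name))
```

The returned prefix is what a TensorFlow 1.x `saver.restore(sess, save_path)` call takes as `save_path`. Note that the plain `checkpoint` file in the same folder is only a small index of saved checkpoints and is not a valid restore path either.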

Harikrishnan24 commented 4 years ago

Thank you for your help

