DrSleep / tensorflow-deeplab-resnet

DeepLab-ResNet rebuilt in TensorFlow
MIT License

Problem running the train script #136

Closed George3d6 closed 6 years ago

George3d6 commented 7 years ago

Hello,

I have run into an issue when running the train.py script. I assume it is caused by me overlooking or mishandling a step rather than by a real bug, but for the life of me I can't figure out which :/

I run the train script like this:

python2 train.py --data-dir ../build-src-Desktop-Debug --data-list ../instr/train.txt --num-classes 2 --batch-size 1

After which it spams the following error ~100 times:

2017-11-06 12:12:17.994330: W tensorflow/core/framework/op_kernel.cc:1192] Not found: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./deeplab_resnet.ckpt

And subsequently crashes with the following error:

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./deeplab_resnet.ckpt
         [[Node: save_1/RestoreV2_527 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save_1/Const_0_0, save_1/RestoreV2_527/tensor_names, save_1/RestoreV2_527/shape_and_slices)]]

Am I supposed to generate the deeplab_resnet.ckpt file in some way before running the script? Could this be caused by an improper training set? (I used just two images for this test run, with RGB masks containing only (0,0,0) and (255,255,255) pixels, and assumed the model would be able to deal with that properly.) Or is it possible that this is a bug?

The python version is: Python 2.7.14

eypros commented 7 years ago

There are instructions in the git repository. You are not supposed to create this file yourself (unless you want to, in which case you should use the other file, "deeplab_resnet_init.ckpt"). Download it from here and set the path to wherever you saved it: https://drive.google.com/drive/folders/0B_rootXHuswsZ0E4Mjh1ZU5xZVU
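To confirm that the downloaded file really is where the script will look for it, a quick check along these lines may help. This is only a sketch: checkpoint_files is a hypothetical helper, not part of the repository, and it simply looks for either a single file with the exact name or the .index / .data-* files that newer TensorFlow checkpoint formats use.

```python
import os


def checkpoint_files(prefix):
    """List files on disk that a checkpoint path such as ./deeplab_resnet.ckpt could refer to.

    Hypothetical helper for illustration only. It accepts either a single
    file named exactly `prefix`, or companion files named `prefix.index`
    and `prefix.data-*`.
    """
    directory = os.path.dirname(prefix) or "."
    base = os.path.basename(prefix)
    if not os.path.isdir(directory):
        return []
    found = []
    for name in sorted(os.listdir(directory)):
        if (name == base
                or name == base + ".index"
                or name.startswith(base + ".data-")):
            found.append(os.path.join(directory, name))
    return found


if __name__ == "__main__":
    path = "./deeplab_resnet.ckpt"  # the path from the error message above
    files = checkpoint_files(path)
    if files:
        print("Found checkpoint files: " + ", ".join(files))
    else:
        print("Nothing matches " + path + "; download the model or fix the path.")
```

An empty result here corresponds to the "Failed to find any matching files" message in the log.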

DrSleep commented 6 years ago

It is all due to the 'restore-from' parameter. Its value depends on what your final goal is: if you want to train from scratch, you should explicitly set this parameter to None; if you want to fine-tune an existing model, you will need to download the model first.
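As a rough illustration of the two cases, here is a minimal stand-alone sketch of how such a parameter can be handled with argparse. It is not the code from train.py: build_parser, none_or_path and describe are hypothetical names, and the default of ./deeplab_resnet.ckpt is only inferred from the error message in the original post. One thing worth keeping in mind is that a flag declared with type=str turns the command-line text None into the string "None", not Python's None, so the sketch converts it explicitly.

```python
import argparse


def none_or_path(value):
    """Turn the literal text 'None' (or an empty string) into Python None."""
    if value.strip().lower() in ("", "none"):
        return None
    return value


def build_parser():
    # Hypothetical stand-in for the training script's options.
    parser = argparse.ArgumentParser(description="restore-from sketch")
    parser.add_argument(
        "--restore-from",
        type=none_or_path,
        default="./deeplab_resnet.ckpt",  # assumed default, inferred from the log
        help="Checkpoint to restore from, or None to train from scratch.",
    )
    return parser


def describe(args):
    """Say which of the two cases the parsed arguments select."""
    if args.restore_from is None:
        return "train from scratch"
    return "restore weights from " + args.restore_from


if __name__ == "__main__":
    print(describe(build_parser().parse_args()))
```

With this sketch, running it with --restore-from None selects training from scratch, while running it with --restore-from followed by the path of the downloaded model selects fine-tuning. Whether the real script accepts None on the command line in the same way is not confirmed here; editing the default inside the script is the other option.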