hq-jiang / instance-segmentation-with-discriminative-loss-tensorflow

Tensorflow implementation of "Semantic Instance Segmentation with a Discriminative Loss Function"
MIT License

error in utils.load_enet #2

Open lps683 opened 6 years ago

lps683 commented 6 years ago

Hello, I really appreciate your work; it has helped me a lot. However, I ran into a problem when running "python training.py". It reported the following:

Traceback (most recent call last):
  File "training.py", line 241, in <module>
    run()
  File "training.py", line 115, in run
    last_prelu = utils.load_enet(sess, model_dir, input_image, batch_size)
  File "/home/lps/DL/instance-segmentation-with-discriminative-loss-tensorflow/utils.py", line 27, in load_enet
    saver.restore(sess, checkpoint)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1439, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key ENet/bottleneck3_7_batch_norm2/gamma not found in checkpoint
  [[Node: save/RestoreV2_473 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_473/tensor_names, save/RestoreV2_473/shape_and_slices)]]
  [[Node: save/RestoreV2_619/_1241 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3105_save/RestoreV2_619", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

Caused by op u'save/RestoreV2_473', defined at:
  File "training.py", line 241, in <module>
    run()
  File "training.py", line 115, in run
    last_prelu = utils.load_enet(sess, model_dir, input_image, batch_size)
  File "/home/lps/DL/instance-segmentation-with-discriminative-loss-tensorflow/utils.py", line 26, in load_enet
    saver = tf.train.Saver(variables_to_restore)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1051, in __init__
    self.build()
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1081, in build
    restore_sequentially=self._restore_sequentially)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 675, in build
    restore_sequentially, reshape)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 402, in _AddRestoreOps
    tensors = self.restore_op(filename_tensor, saveable, preferred_shard)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 242, in restore_op
    [spec.tensor.dtype])[0])
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 668, in restore_v2
    dtypes=dtypes, name=name)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/lps/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

NotFoundError (see above for traceback): Key ENet/bottleneck3_7_batch_norm2/gamma not found in checkpoint
  [[Node: save/RestoreV2_473 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_473/tensor_names, save/RestoreV2_473/shape_and_slices)]]
  [[Node: save/RestoreV2_619/_1241 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_3105_save/RestoreV2_619", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"]()]]

It seems there are some errors in the ENet checkpoint files. I hope you can help me. Thanks!

hq-jiang commented 6 years ago

Hi. The loading worked for other people, so you might want to check your TensorFlow version. It should be around 1.2 to 1.3; perhaps you have a newer version that is not compatible?
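One quick way to check is a small version test like the sketch below. The 1.2 to 1.3 range is taken only from the remark above, not from any documented requirement, and the helper names `major_minor` and `in_suggested_range` are made up for this example.

```python
import re

# Assumption: this range mirrors the "around 1.2 to 1.3" remark in this thread;
# it is not an officially documented requirement of the project.
SUGGESTED_MIN = (1, 2)
SUGGESTED_MAX = (1, 3)


def major_minor(version):
    """Extract (major, minor) from a version string such as '1.2.1' or '1.3.0-rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def in_suggested_range(version):
    """Return True if the version falls inside the suggested 1.2 to 1.3 range."""
    return SUGGESTED_MIN <= major_minor(version) <= SUGGESTED_MAX


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print(tf.__version__, "in suggested range:", in_suggested_range(tf.__version__))
```

Being inside the range does not guarantee that the checkpoint loads (see the later reports in this thread); it only rules out one possible cause.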

optimal16 commented 5 years ago

@hq-jiang Hello, I think your project is very nice, but the mean shift clustering is very slow. Do you have any recommendations? Thank you.

hq-jiang commented 5 years ago

I was also surprised that mean shift is quite slow. You can increase the number of threads used by the mean shift and loosen the convergence tolerance so that the iterations stop earlier. Alternatively, you could adapt the network to output a lower resolution; then you would have fewer points to run the mean shift on.
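To illustrate the tolerance and resolution suggestions, here is a rough, self-contained sketch. It is a plain-Python flat-kernel mean shift on toy data, not the code used in this repository, and every name in it (`shift_to_mode`, `run_mean_shift`, `make_embedding_map`, `downsample`) is hypothetical. Strided subsampling stands in for a network that outputs a lower resolution; threading is not covered.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shift_to_mode(seed, points, bandwidth, tol, max_iter=300):
    """Move one seed uphill with a flat kernel; return (mode, iterations used)."""
    current = seed
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        neighbors = [p for p in points if distance(p, current) <= bandwidth]
        if not neighbors:  # guard against floating-point edge cases
            break
        dims = len(current)
        mean = tuple(sum(p[d] for p in neighbors) / len(neighbors) for d in range(dims))
        moved = distance(mean, current)
        current = mean
        if moved < tol:  # a larger tol ends the loop sooner
            break
    return current, iterations


def run_mean_shift(points, bandwidth, tol):
    """Shift every point; return (list of modes, total iterations over all seeds)."""
    modes, total = [], 0
    for seed in points:
        mode, used = shift_to_mode(seed, points, bandwidth, tol)
        modes.append(mode)
        total += used
    return modes, total


def make_embedding_map(height=12, width=12):
    """Toy H x W map of 2-D embeddings: left half near (0, 0), right half near (3, 3)."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            center = 0.0 if x < width // 2 else 3.0
            # small deterministic jitter so points in a cluster are not identical
            row.append((center + 0.02 * (x % 8), center + 0.02 * (y % 8)))
        rows.append(row)
    return rows


def downsample(embedding_map, stride=2):
    """Keep every stride-th pixel in both directions (stand-in for a lower-resolution output)."""
    return [row[::stride] for row in embedding_map[::stride]]


def flatten(embedding_map):
    """Turn the H x W map into a flat list of points."""
    return [point for row in embedding_map for point in row]


if __name__ == "__main__":
    full = make_embedding_map()
    low = downsample(full)
    _, tight = run_mean_shift(flatten(full), bandwidth=0.1, tol=1e-6)
    _, loose = run_mean_shift(flatten(full), bandwidth=0.1, tol=1e-2)
    print("points at full / low resolution:", len(flatten(full)), len(flatten(low)))
    print("total iterations with tight / loose tolerance:", tight, loose)
```

Each iteration compares the seed against every point, so the cost grows roughly with the square of the number of points; halving the resolution in both directions leaves a quarter of the points. A looser tolerance gives slightly less precise modes, so it is worth checking that the resulting clusters still separate the instances.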

By the way, you should create a new issue since it is not related to the current one.

MingtaoFu commented 5 years ago

same problem

pengyiwu commented 5 years ago

same problem

I have the same problem after upgrading my tensorflow-gpu version to 1.2.