GeorgeSeif / Semantic-Segmentation-Suite

Semantic Segmentation Suite in TensorFlow. Implement, train, and test new Semantic Segmentation models easily!

train problem #30

Closed fotsing365 closed 6 years ago

fotsing365 commented 6 years ago

Hi, I would like to train RefineNet-Res101 using the default CamVid dataset included in the project, but after running the command "python main.py --mode train --dataset CamVid --crop_height 720 --crop_width 960 --batch_size 5 --num_val_images 10 --model RefineNet-Res101" I get this error:

/home/cedriq/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
  from ._conv import register_converters as _register_converters
Preparing the model ...
WARNING:tensorflow:From main.py:167: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow into the labels input on backprop by default.

See tf.nn.softmax_cross_entropy_with_logits_v2.

This model has 83330464 trainable parameters
2018-03-28 11:45:00.846569: W tensorflow/core/framework/op_kernel.cc:1202] OP_REQUIRES failed at save_restore_tensor.cc:170 : Not found: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
Traceback (most recent call last):
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
  [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 179, in <module>
    init_fn(sess)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 690, in callback
    saver.restore(session, model_path)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1755, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
    options, run_metadata)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
  [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op 'save/RestoreV2', defined at:
  File "main.py", line 142, in <module>
    network, init_fn = build_refinenet(input, preset_model = args.model, num_classes=num_classes)
  File "models/RefineNet.py", line 167, in build_refinenet
    init_fn = slim.assign_from_checkpoint_fn(os.path.join(pretrained_dir, 'resnet_v2_101.ckpt'), slim.get_model_variables('resnet_v2_101'))
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/variables.py", line 688, in assign_from_checkpoint_fn
    saver = tf_saver.Saver(var_list, reshape=reshape_variables)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1293, in __init__
    self.build()
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1302, in build
    self._build(self._filename, build_save=True, build_restore=True)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 1339, in _build
    build_save=build_save, build_restore=build_restore)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 796, in _build_internal
    restore_sequentially, reshape)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 449, in _AddRestoreOps
    restore_sequentially)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/saver.py", line 847, in bulk_restore
    return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1030, in restore_v2
    shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
    op_def=op_def)
  File "/home/cedriq/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for models/resnet_v2_101.ckpt
  [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

I'm new to Python and deep learning. What could the problem be? Please excuse my English; I'm a French speaker.

Spritea commented 6 years ago

Bonjour~ The error is telling you that it couldn't find the file models/resnet_v2_101.ckpt. PSPNet and RefineNet are built on ResNet, so you need to download the corresponding pre-trained ResNet checkpoint and place it in the models folder before using either of them. Funnily enough, George just updated README.md to draw people's attention to this problem.
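If you want to confirm this before launching a long training run, a small check like the one below can help. It is only a sketch and not part of the repository: the helper names checkpoint_exists and require_checkpoint are made up for illustration, and the only path it uses is the one from the error message (models/resnet_v2_101.ckpt, relative to the directory you run main.py from).

```python
import glob
import os


def checkpoint_exists(ckpt_path):
    """Return True if a checkpoint is present at ckpt_path.

    Accepts either a single file with that exact name or files that
    share it as a prefix (for example ckpt_path + ".index").
    """
    if os.path.isfile(ckpt_path):
        return True
    # glob.escape guards against special characters in the directory name.
    return bool(glob.glob(glob.escape(ckpt_path) + ".*"))


def require_checkpoint(ckpt_path):
    """Raise a readable error early if the pre-trained weights are missing."""
    if not checkpoint_exists(ckpt_path):
        raise FileNotFoundError(
            "Pre-trained weights not found at '%s'. Download the ResNet "
            "checkpoint and place it there before training." % ckpt_path
        )
    return ckpt_path


if __name__ == "__main__":
    # Path taken from the error message, relative to the repository root.
    try:
        require_checkpoint(os.path.join("models", "resnet_v2_101.ckpt"))
        print("checkpoint found")
    except FileNotFoundError as err:
        print(err)
```

Run it from the same folder as main.py. If it reports the file as missing, the training command will fail in the same way until the checkpoint is in place.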

Feel free to ask about any problems you run into!

fotsing365 commented 6 years ago

Thanks

GeorgeSeif commented 6 years ago

Yup that's exactly correct!

Thanks @Spritea