Closed by andyyuan78 6 years ago
It may be caused by a bug in tf.scatter_nd in older TensorFlow versions. Which TensorFlow version are you using?
TF 1.2 on a MacBook without a GPU.
It works well on my GPU server.
I once encountered the same error on my MacBook. Try upgrading TensorFlow to 1.3 or newer there.
It still does not work on TF 1.3 or TF 1.4.
python3 sick_rl_main.py --network_type resan --base_name resan --model_dir_suffix training --gpu -1
==>model_title: nt_resan_d_0.7_w_0.0001_um_True_ml_False_o_adadelta_lr_0.5_rs_0.006_rs_sim
debug: False
mode: train
network_type: resan
log_period: 2000
eval_period: 500
gpu: -1
gpu_mem: None
model_dir_suffix: training
save_model: False
load_model: False
load_step: None
load_path: None
use_mse: True
mse_logits: False
max_epoch: 200
num_steps: 25000
train_batch_size: 128
test_batch_size: 100
optimizer: adadelta
learning_rate: 0.5
dropout: 0.7
wd: 0.0001
var_decay: 0.999
decay: 0.9
word_embedding_length: 300
glove_corpus: 6B
use_glove_unk_token: True
lower_word: True
use_char_emb: False
use_token_emb: True
char_embedding_length: 8
char_out_size: 150
out_channel_dims: 50,50,50
filter_heights: 1,3,5
highway_layer_num: 2
hidden_units_num: 300
fine_tune: False
batch_norm: False
activation: relu
pretrained_path: None
base_name: resan
start_only_rl: 6000
end_only_rl: 6500
step_for_sl: 1000
step_for_rl: 1000
rl_sparsity: 0.006
rl_strategy: sim
Trying to load processed data from /Users//ReSAN/SICK_rl_pub/result/processed_data/processed_lw_True_ugut_True_gc_6B_wel_300.pickle
Have found the file, loading... Done
building resan neural network structure...
regularization var num: 14
trainable var num: 37
Trainable Parameters Number: 2527207
2018-04-08 10:49:23.210030: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 10:49:23.210081: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 10:49:23.210086: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 10:49:23.210090: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
status, run_metadata)
File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/contextlib.py", line 89, in __exit__
next(self.gen)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300]
[[Node: resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd = ScatterNd[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](resan/ct_attn/dir_attn_bw_1/output/cond/stack, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/Switch, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/shape)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "sick_rl_main.py", line 127, in <module>
tf.app.run()
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "sick_rl_main.py", line 112, in main
train()
File "sick_rl_main.py", line 79, in train
sess, sample_batch, get_summary=if_get_summary, global_step_value=global_step)
File "/Users//ReSAN/SICK_rl_pub/src/model/model_template.py", line 260, in step
[self.loss_sl, summary_tf, self.train_op_sl], feed_dict=feed_dict)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300]
[[Node: resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd = ScatterNd[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](resan/ct_attn/dir_attn_bw_1/output/cond/stack, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/Switch, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/shape)]]
Caused by op 'resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd', defined at:
File "sick_rl_main.py", line 127, in <module>
tf.app.run()
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "sick_rl_main.py", line 112, in main
train()
File "sick_rl_main.py", line 49, in train
len(train_data_obj.dicts['char']), train_data_obj.max_lens['token'], scope.name)
File "/Users//ReSAN/SICK_rl_pub/src/model/model_resan.py", line 23, in __init__
self.update_tensor_add_ema_and_opt()
File "/Users//ReSAN/SICK_rl_pub/src/model/model_template.py", line 168, in update_tensor_add_ema_and_opt
(self.s1_percentage, self.s2_percentage) = self.build_network()
File "/Users//ReSAN/SICK_rl_pub/src/model/model_resan.py", line 99, in build_network
'dir_attn_bw', cfg.dropout, self.is_train, cfg.wd, 'relu'
File "/Users//ReSAN/SICK_rl_pub/src/nn_utils/resa.py", line 58, in directional_attention_with_selections
lambda: tf.scatter_nd(
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 289, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1823, in cond
orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1689, in BuildCondBranch
original_result = fn()
File "/Users//ReSAN/SICK_rl_pub/src/nn_utils/resa.py", line 59, in <lambda>
tf.stack([range_head, head_org_idx], -1), attn_result, [bs, sl+1, hn])
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2793, in scatter_nd
shape=shape, name=name)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300]
[[Node: resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd = ScatterNd[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](resan/ct_attn/dir_attn_bw_1/output/cond/stack, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/Switch, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/shape)]]
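A note on reading the error, in case it helps: the message appears to say that the entry at position [0,5,0] of the indices tensor holds the pair [0, -1], and -1 is not a valid row of the [128,20,300] output. Since the indices are built as tf.stack([range_head, head_org_idx], -1), that would point to head_org_idx being -1 for batch 0, position 5. As far as I know, the tf.scatter_nd documentation states that an out-of-bound index is an error on CPU but is ignored on GPU, which would fit the run failing with --gpu -1 and working on the GPU server. Below is a minimal pure-Python sketch of that difference. It is not TensorFlow code: the scatter_nd function and its strict flag are hypothetical stand-ins written only for illustration, and the tiny shapes are made up.

```python
from itertools import product


def scatter_nd(indices, updates, shape, strict=True):
    """Illustrative stand-in (NOT tf.scatter_nd) that scatters rows into zeros.

    indices: nested list [bs][n] of [batch_idx, row_idx] pairs
    updates: nested list [bs][n] of rows, each of length shape[2]
    shape:   [bs, rows, width] of the output
    strict:  hypothetical flag; True raises on a bad index (CPU-like),
             False silently skips it (GPU-like)
    """
    bs, rows, width = shape
    out = [[[0.0] * width for _ in range(rows)] for _ in range(bs)]
    for i, j in product(range(len(indices)), range(len(indices[0]))):
        b, r = indices[i][j]
        if not (0 <= b < bs and 0 <= r < rows):
            if strict:
                raise IndexError(
                    "Invalid indices: [%d,%d,0] = [%d, %d] does not index into %s"
                    % (i, j, b, r, shape))
            continue  # lenient mode: drop the update for this index
        for k in range(width):
            out[b][r][k] += updates[i][j][k]
    return out


# Made-up toy data: the second pair of batch 0 carries a -1 row index.
shape = [2, 3, 2]
indices = [[[0, 0], [0, -1]], [[1, 2], [1, 1]]]
updates = [[[1.0, 1.0], [9.0, 9.0]], [[2.0, 2.0], [3.0, 3.0]]]

# GPU-like behavior: the update addressed by [0, -1] is dropped.
lenient = scatter_nd(indices, updates, shape, strict=False)

# CPU-like behavior: the same input raises.
try:
    scatter_nd(indices, updates, shape, strict=True)
    strict_error = None
except IndexError as exc:
    strict_error = str(exc)

print(lenient)
print(strict_error)
```

If that reading is right, upgrading TensorFlow alone would not help on CPU. One possible workaround, untested against this repository, would be to make sure the -1 padding entries in head_org_idx never reach the scatter, for example by mapping them to a valid dummy row first.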