taoshen58 / ReSAN

Apache License 2.0

InvalidArgumentError (see above for traceback): Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300] on Mac notebook #2

Closed: andyyuan78 closed this issue 6 years ago

andyyuan78 commented 6 years ago

python3 sick_rl_main.py --network_type resan --base_name resan --model_dir_suffix training --gpu -1

==>model_title: nt_resan_d_0.7_w_0.0001_um_True_ml_False_o_adadelta_lr_0.5_rs_0.006_rs_sim

debug: False
mode: train
network_type: resan
log_period: 2000
eval_period: 500
gpu: -1
gpu_mem: None
model_dir_suffix: training
save_model: False
load_model: False
load_step: None
load_path: None
use_mse: True
mse_logits: False
max_epoch: 200
num_steps: 25000
train_batch_size: 128
test_batch_size: 100
optimizer: adadelta
learning_rate: 0.5
dropout: 0.7
wd: 0.0001
var_decay: 0.999
decay: 0.9
word_embedding_length: 300
glove_corpus: 6B
use_glove_unk_token: True
lower_word: True
use_char_emb: False
use_token_emb: True
char_embedding_length: 8
char_out_size: 150
out_channel_dims: 50,50,50
filter_heights: 1,3,5
highway_layer_num: 2
hidden_units_num: 300
fine_tune: False
batch_norm: False
activation: relu
pretrained_path: None
base_name: resan
start_only_rl: 6000
end_only_rl: 6500
step_for_sl: 1000
step_for_rl: 1000
rl_sparsity: 0.006
rl_strategy: sim

Trying to load processed data from /Users//ReSAN/SICK_rl_pub/result/processed_data/processed_lw_True_ugut_True_gc_6B_wel_300.pickle
Have found the file, loading... Done

building resan neural network structure...
regularization var num: 14
trainable var num: 37
Trainable Parameters Number: 2527207
2018-04-08 10:49:23.210030: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 10:49:23.210081: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 10:49:23.210086: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-04-08 10:49:23.210090: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1139, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1121, in _run_fn
    status, run_metadata)
  File "/usr/local/Cellar/python3/3.6.0_1/Frameworks/Python.framework/Versions/3.6/lib/python3.6/contextlib.py", line 89, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300]
  [[Node: resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd = ScatterNd[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](resan/ct_attn/dir_attn_bw_1/output/cond/stack, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/Switch, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/shape)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "sick_rl_main.py", line 127, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "sick_rl_main.py", line 112, in main
    train()
  File "sick_rl_main.py", line 79, in train
    sess, sample_batch, get_summary=if_get_summary, global_step_value=global_step)
  File "/Users//ReSAN/SICK_rl_pub/src/model/model_template.py", line 260, in step
    [self.loss_sl, summary_tf, self.train_op_sl], feed_dict=feed_dict)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 789, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300]
  [[Node: resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd = ScatterNd[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](resan/ct_attn/dir_attn_bw_1/output/cond/stack, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/Switch, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/shape)]]

Caused by op 'resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd', defined at:
  File "sick_rl_main.py", line 127, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "sick_rl_main.py", line 112, in main
    train()
  File "sick_rl_main.py", line 49, in train
    len(train_data_obj.dicts['char']), train_data_obj.max_lens['token'], scope.name)
  File "/Users//ReSAN/SICK_rl_pub/src/model/model_resan.py", line 23, in __init__
    self.update_tensor_add_ema_and_opt()
  File "/Users//ReSAN/SICK_rl_pub/src/model/model_template.py", line 168, in update_tensor_add_ema_and_opt
    (self.s1_percentage, self.s2_percentage) = self.build_network()
  File "/Users//ReSAN/SICK_rl_pub/src/model/model_resan.py", line 99, in build_network
    'dir_attn_bw', cfg.dropout, self.is_train, cfg.wd, 'relu'
  File "/Users//ReSAN/SICK_rl_pub/src/nn_utils/resa.py", line 58, in directional_attention_with_selections
    lambda: tf.scatter_nd(
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 289, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1823, in cond
    orig_res_f, res_f = context_f.BuildCondBranch(false_fn)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 1689, in BuildCondBranch
    original_result = fn()
  File "/Users//ReSAN/SICK_rl_pub/src/nn_utils/resa.py", line 59, in <lambda>
    tf.stack([range_head, head_org_idx], -1), attn_result, [bs, sl+1, hn])
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2793, in scatter_nd
    shape=shape, name=name)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Invalid indices: [0,5,0] = [0, -1] does not index into [128,20,300] [[Node: resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd = ScatterNd[T=DT_FLOAT, Tindices=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](resan/ct_attn/dir_attn_bw_1/output/cond/stack, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/Switch, resan/ct_attn/dir_attn_bw_1/output/cond/ScatterNd/shape)]]
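
For readers decoding the message: the trace shows the indices being built as tf.stack([range_head, head_org_idx], -1) and scattered into shape [bs, sl+1, hn], so one plausible reading is that entry [0,5] of the index tensor holds the pair [0, -1] (batch 0, position -1), which is outside [128,20,300]. The pure-Python sketch below only imitates that bounds check; the helper name find_invalid_index and the sample index values are hypothetical, not taken from the repository.

```python
def find_invalid_index(indices, shape):
    """Return ((batch, row), index) for the first index that falls outside `shape`.

    `indices` is a nested list shaped [batch, rows, depth]; each innermost
    list addresses the first `depth` dimensions of `shape`.
    """
    for b, rows in enumerate(indices):
        for r, idx in enumerate(rows):
            for dim, i in enumerate(idx):
                if not 0 <= i < shape[dim]:
                    return (b, r), tuple(idx)
    return None


# Hypothetical index tensor: six rows for batch element 0, the last one is -1.
indices = [[[0, 0], [0, 3], [0, 4], [0, 7], [0, 9], [0, -1]]]
bad = find_invalid_index(indices, (128, 20, 300))
print(bad)
```

Under these assumed values the first offending entry is row 5 of batch 0, matching the position reported in the error.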

taoshen58 commented 6 years ago

This may be caused by a bug in tf.scatter_nd in older TensorFlow versions. Which version of TensorFlow are you using?

andyyuan78 commented 6 years ago

TF 1.2 on a MacBook without a GPU.

It works well on my GPU server.

taoshen58 commented 6 years ago

I once encountered the same error on my MacBook. Try upgrading TensorFlow on your MacBook to 1.3 or newer.

andyyuan78 commented 6 years ago

It does not work on TF 1.3 or TF 1.4 either.
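
A possible explanation for the CPU-only failure, offered as a hypothesis rather than a confirmed diagnosis: the TensorFlow documentation for tf.scatter_nd notes that an out-of-bound index raises an error on CPU but is silently ignored on GPU. If the -1 marks an unselected or padded slot (an assumption about ReSAN's index layout), dropping those rows before scattering would reproduce the GPU behaviour on CPU. The pure-Python sketch below illustrates the two behaviours; scatter_rows and its skip_invalid flag are hypothetical names, not part of TensorFlow or the repository.

```python
def scatter_rows(positions, updates, length, width, skip_invalid=False):
    """Pure-Python stand-in for scattering rows into a [length, width] zero buffer."""
    out = [[0.0] * width for _ in range(length)]
    for pos, row in zip(positions, updates):
        if not 0 <= pos < length:
            if skip_invalid:
                continue  # GPU-like behaviour: silently ignore the row
            # CPU-like behaviour: reject the whole operation
            raise IndexError(f"index {pos} does not index into [{length},{width}]")
        # Rows written to the same position accumulate, as in tf.scatter_nd.
        out[pos] = [a + b for a, b in zip(out[pos], row)]
    return out


positions = [0, 2, -1]  # hypothetical: -1 stands for an unselected or padded slot
updates = [[1.0, 1.0], [2.0, 2.0], [9.0, 9.0]]

try:
    scatter_rows(positions, updates, length=4, width=2)
    strict_failed = False
except IndexError:
    strict_failed = True

lenient = scatter_rows(positions, updates, length=4, width=2, skip_invalid=True)
print(strict_failed, lenient)
```

In the TensorFlow graph itself, the equivalent filtering could probably be done by masking out negative indices (for example with tf.boolean_mask) before the tf.scatter_nd call in resa.py, though whether discarding those rows is correct for the model is not established in this thread.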