localminimum / R-net

A Tensorflow Implementation of R-net: Machine reading comprehension with self matching networks
MIT License

Training: cannot reshape array of size 27481500 into shape (91604,300) #28

Closed brojokm closed 6 years ago

brojokm commented 6 years ago

python2 model.py
Training...
Loading question data...
Loading passage data...
Preparing data...
Total number of trainable parameters: 1195735
Built model
Loading question data...
Loading passage data...
Preparing data...
Traceback (most recent call last):
  File "model.py", line 293, in <module>
    main()
  File "model.py", line 250, in main
    glove = np.reshape(glove,(Params.vocab_size,Params.emb_size))
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 257, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 52, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
ValueError: cannot reshape array of size 27481500 into shape (91604,300)

theSage21 commented 6 years ago

I remember a similar issue being raised somewhere.

Your GloVe file has one extra word. Removing the last word from the file should fix this.
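The sizes in the traceback are consistent with exactly one extra row: 27481500 values at 300 values per row is 91605 rows, one more than the configured 91604. A minimal sketch of that arithmetic in plain Python; `infer_rows` is a hypothetical helper for illustration and is not part of the repo.

```python
def infer_rows(total_values, emb_size):
    """Return how many full embedding rows fit, or raise if the size is ragged."""
    rows, remainder = divmod(total_values, emb_size)
    if remainder:
        raise ValueError(
            "%d values do not divide evenly into rows of %d" % (total_values, emb_size)
        )
    return rows


# Numbers copied from the ValueError in the traceback.
array_size = 27481500
vocab_size = 91604
emb_size = 300

rows = infer_rows(array_size, emb_size)
extra = rows - vocab_size  # rows present beyond what vocab_size expects
print(rows, extra)  # 91605 1
```

A check like this before the reshape would report the mismatch as "one row too many" rather than as a raw size error.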


brojokm commented 6 years ago

Now I am getting the following error.

 9%|██▌ | 288/3226 [15:45<2:40:47, 3.28s/b]
Dev_loss: 3.97158241272
Dev_Exact_match: 0.03125
Dev_F1_score: 0.111962455437
10%|██▉ | 320/3226 [17:29<2:38:51, 3.28s/b]
Traceback (most recent call last):
  File "model.py", line 294, in <module>
    main()
  File "model.py", line 270, in main
    index, dev_loss = sess.run([model.output_index, model.mean_loss], feed_dict = feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1344, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1363, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[19,46] = 91604 is not in [0, 91604)
  [[Node: passage_embeddings/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@word_embeddings"], validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](word_embeddings/read, _arg_batch_0_0)]]

Caused by op u'passage_embeddings/embedding_lookup', defined at:
  File "model.py", line 294, in <module>
    main()
  File "model.py", line 243, in main
    model = Model(is_training = True); print("Built model")
  File "model.py", line 70, in __init__
    self.encode_ids()
  File "model.py", line 95, in encode_ids
    scope = "passage_embeddings")
  File "/users/mtech/brojokm/TRAIN/R-net-master/layers.py", line 48, in encoding
    word_encoding = tf.nn.embedding_lookup(word_embeddings, word)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 325, in embedding_lookup
    transform_fn=None)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 150, in _embedding_lookup_and_transform
    result = _clip(_gather(params[0], ids, name=name), ids, max_norm)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 54, in _gather
    return array_ops.gather(params, ids, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2585, in gather
    params, indices, validate_indices=validate_indices, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1864, in gather
    validate_indices=validate_indices, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3160, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1625, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[19,46] = 91604 is not in [0, 91604)
  [[Node: passage_embeddings/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@word_embeddings"], validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](word_embeddings/read, _arg_batch_0_0)]]

theSage21 commented 6 years ago

Well, this says you're looking up a word which is not there. I'll take a look once I'm on a stable internet connection.
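The message `indices[19,46] = 91604 is not in [0, 91604)` reads as a half-open range: a table with 91604 rows accepts ids 0 through 91603, so an id equal to the row count is already out of bounds. A rough sketch of checking a batch for such ids before it reaches the lookup; `find_out_of_range` and the toy batch are made up for illustration and do not come from the repo.

```python
def find_out_of_range(batch, vocab_size):
    """Return (row, col, id) for every id outside the half-open range [0, vocab_size)."""
    bad = []
    for r, row in enumerate(batch):
        for c, idx in enumerate(row):
            if not 0 <= idx < vocab_size:
                bad.append((r, c, idx))
    return bad


# Toy batch of word ids; only the last row holds an id equal to vocab_size.
toy_batch = [[5, 17, 91603], [91604, 2, 0]]
bad = find_out_of_range(toy_batch, 91604)
print(bad)  # [(1, 0, 91604)]
```

The reported position is the same kind of (row, column) pair that TensorFlow prints in `indices[19,46]`.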


jeffreyflynt commented 6 years ago

I increased vocab_size = 91604 to vocab_size = 91605 in params.py and the error went away.

ghost commented 6 years ago

As @jeffreyflynt said, just increase the dictionary size by 1 until there is no dictionary size error.
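Instead of raising the size by trial and error, the smallest workable table size can be read off the data: it is the largest word id plus one. A small sketch under that assumption; `min_vocab_size` and the toy batches are hypothetical and not taken from the repo.

```python
def min_vocab_size(batches):
    """Smallest embedding-table row count that covers every id in `batches`."""
    largest = max(idx for batch in batches for row in batch for idx in row)
    return largest + 1  # ids start at 0, so id N needs N + 1 rows


# Toy data: the largest id is 91604, matching the id in the error message.
toy_batches = [[[3, 91604], [12, 7]], [[0, 91603]]]
needed = min_vocab_size(toy_batches)
print(needed)  # 91605
```

With the id from the error message this gives 91605, the same value used for `vocab_size` in the comment above.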

brojokm commented 6 years ago

I have done the above, but training still failed after about 8%. I already posted the same issue above but did not get a solution.

Dev_F1_score: 0.183340548341
 7%|█▉ | 300/4129 [2:09:28<27:32:33, 25.90s/b]
Dev_loss: 3.72519350052
Dev_Exact_match: 0.1
Dev_F1_score: 0.178666666667
 8%|██▏ | 350/4129 [2:30:47<27:08:09, 25.85s/b]
Traceback (most recent call last):
  File "model.py", line 293, in <module>
    main()
  File "model.py", line 269, in main
    index, dev_loss = sess.run([model.output_index, model.mean_loss], feed_dict = feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1140, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[32,70] = 91604 is not in [0, 91604)
  [[Node: passage_embeddings/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@word_embeddings"], validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](word_embeddings/read, _arg_batch_0_0)]]

Caused by op u'passage_embeddings/embedding_lookup', defined at:
  File "model.py", line 293, in <module>
    main()
  File "model.py", line 243, in main
    model = Model(is_training = True); print("Built model")
  File "model.py", line 70, in __init__
    self.encode_ids()
  File "model.py", line 95, in encode_ids
    scope = "passage_embeddings")
  File "/home/R-net-NW/layers.py", line 48, in encoding
    word_encoding = tf.nn.embedding_lookup(word_embeddings, word)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 327, in embedding_lookup
    transform_fn=None)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 151, in _embedding_lookup_and_transform
    result = _clip(_gather(params[0], ids, name=name), ids, max_norm)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/embedding_ops.py", line 55, in _gather
    return array_ops.gather(params, ids, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 2698, in gather
    params, indices, validate_indices=validate_indices, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2672, in gather
    validate_indices=validate_indices, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): indices[32,70] = 91604 is not in [0, 91604)
  [[Node: passage_embeddings/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@word_embeddings"], validate_indices=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](word_embeddings/read, _arg_batch_0_0)]]