allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models
Apache License 2.0

InvalidArgumentError while finetuning on new dataset #197

Closed · ravikrn closed this 5 years ago

ravikrn commented 5 years ago

```
InvalidArgumentError (see above for traceback): indices[0] = 1043672 is not in [0, 793471)
```

I am using a new dataset, prepared as per the instructions, and I am getting this error. Could you please help me?

```
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 1043672 is not in [0, 793471)
	 [[{{node lm/sampled_softmax_loss_1/embedding_lookup_1}}]]
	 [[{{node clip_by_global_norm/clip_by_global_norm/_8}}]]
```

During handling of the above exception, another exception occurred:

```
Traceback (most recent call last):
  File "bin/restart.py", line 57, in <module>
    main(args)
  File "bin/restart.py", line 42, in main
    restart_ckpt_file=ckpt_file)
  File "/usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py", line 861, in train
    feed_dict=feed_dict
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 929, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 1043672 is not in [0, 793471)
	 [[node lm/sampled_softmax_loss_1/embedding_lookup_1 (defined at /usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py:506) ]]
	 [[node clip_by_global_norm/clip_by_global_norm/_8 (defined at /usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py:913) ]]
```

```
Caused by op 'lm/sampled_softmax_loss_1/embedding_lookup_1', defined at:
  File "bin/restart.py", line 57, in <module>
    main(args)
  File "bin/restart.py", line 42, in main
    restart_ckpt_file=ckpt_file)
  File "/usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py", line 705, in train
    model = LanguageModel(options, True)
  File "/usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py", line 72, in __init__
    self._build()
  File "/usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py", line 430, in _build
    self._build_loss(lstm_outputs)
  File "/usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py", line 506, in _build_loss
    num_true=1)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py", line 1864, in sampled_softmax_loss
    seed=seed)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py", line 1399, in _compute_sampled_logits
    biases, all_ids, partition_strategy=partition_strategy)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/embedding_ops.py", line 316, in embedding_lookup
    transform_fn=None)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/embedding_ops.py", line 133, in _embedding_lookup_and_transform
    result = _clip(array_ops.gather(params[0], ids, name=name),
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py", line 3273, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3748, in gather_v2
    "GatherV2", params=params, indices=indices, axis=axis, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()
```

```
InvalidArgumentError (see above for traceback): indices[0] = 1043672 is not in [0, 793471)
	 [[node lm/sampled_softmax_loss_1/embedding_lookup_1 (defined at /usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py:506) ]]
	 [[node clip_by_global_norm/clip_by_global_norm/_8 (defined at /usr/local/lib/python3.6/dist-packages/bilm-0.1.post5-py3.6.egg/bilm/training.py:913) ]]
```

matt-peters commented 5 years ago

This is an indexing error: the vocabulary file is longer than the pretrained softmax matrix. Make sure the vocabulary file, the vocabulary size in the options file, and the checkpoint are all consistent between pretraining and fine-tuning.
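One way to check this consistency up front is a minimal sketch like the one below (not part of bilm-tf; the file paths are hypothetical and should point at your own vocabulary file and the `options.json` saved with the pretrained checkpoint). It compares the number of entries in the vocabulary file against `n_tokens_vocab` in the options:

```python
# Sketch: verify that the fine-tuning vocabulary matches the size the
# pretrained softmax weights were built for. Paths below are hypothetical.
import json

vocab_path = 'vocab.txt'                   # your vocabulary file, one token per line
options_path = 'checkpoint/options.json'   # options saved at pretraining time

# Count non-empty lines in the vocabulary file.
with open(vocab_path, encoding='utf-8') as f:
    vocab_size = sum(1 for line in f if line.strip())

with open(options_path) as f:
    options = json.load(f)

print('vocabulary file entries:   ', vocab_size)
print("options['n_tokens_vocab']: ", options.get('n_tokens_vocab'))

# The checkpoint's softmax matrix has n_tokens_vocab rows, so any token id
# at or above that bound (here, 1043672 vs. the valid range [0, 793471))
# produces exactly the embedding_lookup error shown in this issue.
if vocab_size != options.get('n_tokens_vocab'):
    print('Mismatch: fine-tuning must reuse the pretraining vocabulary.')
```

In the traceback above, 793471 would be the softmax size baked into the checkpoint, while the new vocabulary apparently contains over a million entries, so any token past row 793470 falls outside the lookup range.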