google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[2968] = 30523 is not in [0, 30522) #859

Open lyriccoder opened 5 years ago

lyriccoder commented 5 years ago

I am trying to fine-tune BERT on my own data for multi-class text classification. I run it with the following command:

python run_classifier.py --task_name=cola --do_train=true --do_eval=true --do_predict=true --data_dir=./data/ --vocab_file=./uncased_L-12_H-768_A-12/vocab.txt --bert_config_file=./uncased_L-12_H-768_A-12/bert_config.json --init_checkpoint=./uncased_L-12_H-768_A-12/bert_model.ckpt --max_seq_length=400 --train_batch_size=8 --learning_rate=2e-5 --num_train_epochs=3.0 --output_dir=./bert_output/ --do_lower_case=True

I used the following pretrained model: uncased_L-12_H-768_A-12

The data is here:
https://bitbucket.org/lyriccoder/bert/downloads/dev.tsv
https://bitbucket.org/lyriccoder/bert/downloads/test.tsv
https://bitbucket.org/lyriccoder/bert/downloads/train.tsv

Here is my run_classifier.py (run_classifier.zip). I changed it because I have several classes; the only change is the list of labels returned by ColaProcessor:

  def get_labels(self):
    """See base class."""
    return ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]

I get the following stack trace:

Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:tensorflow:Done calling model_fn.
I0918 17:04:05.368514 15956 estimator.py:1147] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0918 17:04:05.370477 15956 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0918 17:04:08.241855 15956 monitored_session.py:240] Graph was finalized.
2019-09-18 17:04:08.244092: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
WARNING:tensorflow:From C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
W0918 17:04:08.261773 15956 deprecation.py:323] From C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from ./bert_output/model.ckpt-0
I0918 17:04:08.268754 15956 saver.py:1280] Restoring parameters from ./bert_output/model.ckpt-0
WARNING:tensorflow:From C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
W0918 17:04:13.132208 15956 deprecation.py:323] From C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\saver.py:1066: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
I0918 17:04:13.602988 15956 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0918 17:04:13.804451 15956 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ./bert_output/model.ckpt.
I0918 17:04:19.897372 15956 basic_session_run_hooks.py:606] Saving checkpoints for 0 into ./bert_output/model.ckpt.
ERROR:tensorflow:Error recorded from training_loop: indices[2968] = 30523 is not in [0, 30522)
         [[node bert/embeddings/GatherV2 (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:419) ]]

Errors may have originated from an input operation.
Input Source operations connected to node bert/embeddings/GatherV2:
 bert/embeddings/Reshape (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:414)
 bert/embeddings/word_embeddings/read (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:412)

Original stack trace for 'bert/embeddings/GatherV2':
  File "run_classifier.py", line 980, in <module>
    tf.app.run()
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run_classifier.py", line 879, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2871, in train
    saving_listeners=saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1188, in _train_model_default
    features, labels, ModeKeys.TRAIN, self.config)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2709, in _call_model_fn
    config)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1146, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2967, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1549, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1867, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "run_classifier.py", line 644, in model_fn
    num_labels, use_one_hot_embeddings)
  File "run_classifier.py", line 582, in create_model
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "C:\Users\lyriccoder\PycharmProjects\bert\modeling.py", line 180, in __init__
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "C:\Users\lyriccoder\PycharmProjects\bert\modeling.py", line 419, in embedding_lookup
    output = tf.gather(embedding_table, flat_input_ids)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3475, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 4835, in gather_v2
    batch_dims=batch_dims, name=name)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

E0918 17:04:33.387931 15956 error_handling.py:70] Error recorded from training_loop: indices[2968] = 30523 is not in [0, 30522)
         [[node bert/embeddings/GatherV2 (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:419) ]]

Errors may have originated from an input operation.
Input Source operations connected to node bert/embeddings/GatherV2:
 bert/embeddings/Reshape (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:414)
 bert/embeddings/word_embeddings/read (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:412)

Original stack trace for 'bert/embeddings/GatherV2':
  File "run_classifier.py", line 980, in <module>
    tf.app.run()
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run_classifier.py", line 879, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2871, in train
    saving_listeners=saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1188, in _train_model_default
    features, labels, ModeKeys.TRAIN, self.config)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2709, in _call_model_fn
    config)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1146, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2967, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1549, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1867, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "run_classifier.py", line 644, in model_fn
    num_labels, use_one_hot_embeddings)
  File "run_classifier.py", line 582, in create_model
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "C:\Users\lyriccoder\PycharmProjects\bert\modeling.py", line 180, in __init__
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "C:\Users\lyriccoder\PycharmProjects\bert\modeling.py", line 419, in embedding_lookup
    output = tf.gather(embedding_table, flat_input_ids)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3475, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 4835, in gather_v2
    batch_dims=batch_dims, name=name)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

INFO:tensorflow:training_loop marked as finished
I0918 17:04:33.432810 15956 error_handling.py:96] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W0918 17:04:33.434805 15956 error_handling.py:130] Reraising captured error
Traceback (most recent call last):
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
    return fn(*args)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[2968] = 30523 is not in [0, 30522)
         [[{{node bert/embeddings/GatherV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_classifier.py", line 980, in <module>
    tf.app.run()
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run_classifier.py", line 879, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2876, in train
    rendezvous.raise_errors()
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\error_handling.py", line 131, in raise_errors
    six.reraise(typ, value, traceback)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2871, in train
    saving_listeners=saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1192, in _train_model_default
    saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1484, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1252, in run
    run_metadata=run_metadata)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1353, in run
    raise six.reraise(*original_exc_info)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\six.py", line 693, in reraise
    raise value
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1338, in run
    return self._sess.run(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1411, in run
    run_metadata=run_metadata)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\training\monitored_session.py", line 1169, in run
    return self._sess.run(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
    run_metadata_ptr)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
    run_metadata)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[2968] = 30523 is not in [0, 30522)
         [[node bert/embeddings/GatherV2 (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:419) ]]

Errors may have originated from an input operation.
Input Source operations connected to node bert/embeddings/GatherV2:
 bert/embeddings/Reshape (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:414)
 bert/embeddings/word_embeddings/read (defined at C:\Users\lyriccoder\PycharmProjects\bert\modeling.py:412)

Original stack trace for 'bert/embeddings/GatherV2':
  File "run_classifier.py", line 980, in <module>
    tf.app.run()
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 299, in run
    _run_main(main, args)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\absl\app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "run_classifier.py", line 879, in main
    estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2871, in train
    saving_listeners=saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1188, in _train_model_default
    features, labels, ModeKeys.TRAIN, self.config)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2709, in _call_model_fn
    config)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1146, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2967, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1549, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1867, in _call_model_fn
    estimator_spec = self._model_fn(features=features, **kwargs)
  File "run_classifier.py", line 644, in model_fn
    num_labels, use_one_hot_embeddings)
  File "run_classifier.py", line 582, in create_model
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "C:\Users\lyriccoder\PycharmProjects\bert\modeling.py", line 180, in __init__
    use_one_hot_embeddings=use_one_hot_embeddings)
  File "C:\Users\lyriccoder\PycharmProjects\bert\modeling.py", line 419, in embedding_lookup
    output = tf.gather(embedding_table, flat_input_ids)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\ops\array_ops.py", line 3475, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 4835, in gather_v2
    batch_dims=batch_dims, name=name)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "C:\Users\lyriccoder\PycharmProjects\bert\venv\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()
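
The index in the error message refers to the flattened batch. Assuming, as the trace suggests, that modeling.py reshapes input_ids of shape [batch_size, seq_length] into a single row-major vector before calling tf.gather, the flat position 2968 can be mapped back to an example and a token position. A minimal sketch of that arithmetic (the function name is hypothetical; 400 and 8 are the values from the command above):

```python
def locate_flat_index(flat_index, seq_length, batch_size):
    """Map a flat index into a row-major [batch_size, seq_length] tensor
    back to (example_in_batch, token_position)."""
    if not 0 <= flat_index < batch_size * seq_length:
        raise ValueError("flat index is outside the batch")
    return divmod(flat_index, seq_length)


# Values from the command line: --max_seq_length=400, --train_batch_size=8.
example, position = locate_flat_index(2968, seq_length=400, batch_size=8)
print(example, position)  # 7 168
```

Under that assumption, the offending id 30523 is the token at position 168 of the eighth example in the failing batch, which narrows down where to look in the input features.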

I have googled a lot, and the problem seems to be related to the number of words in the vocabulary. But this number in the config is large:

{
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
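
Since the embedding table has vocab_size rows, valid token ids run from 0 to 30521, so an id of 30523 suggests that the vocabulary file being read yields more entries than the config declares. One way to check this is to compare the two directly. The sketch below is a diagnostic under stated assumptions, not part of the BERT repository: the helper names are hypothetical, and it assumes the vocabulary file holds one token per line with ids assigned in line order.

```python
import json


def count_vocab_entries(vocab_path):
    """Count lines in the vocabulary file (assumed: one token per line)."""
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_vocab_against_config(vocab_path, config_path):
    """Return (entries_in_vocab_file, vocab_size_in_config, fits)."""
    with open(config_path, encoding="utf-8") as f:
        vocab_size = json.load(f)["vocab_size"]
    entries = count_vocab_entries(vocab_path)
    # Ids run from 0 to entries - 1, so every id fits in the embedding
    # table only if entries <= vocab_size.
    return entries, vocab_size, entries <= vocab_size
```

It could be called as check_vocab_against_config("./uncased_L-12_H-768_A-12/vocab.txt", "./uncased_L-12_H-768_A-12/bert_config.json"). If the last value is False, the vocab file and the config do not belong together, or the file was modified.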

I use the CPU only. Some people also say that the problem happens only on CPU, not on GPU.
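
The CPU/GPU difference may not be a bug. The TensorFlow documentation for tf.gather notes that an out-of-bound index raises an error on CPU, while on GPU a 0 is stored in the corresponding output value. If that applies here, the same invalid id would also be present on a GPU, only silently mapped to a zero vector. The pure-Python sketch below merely emulates the two documented behaviours for illustration; it does not call TensorFlow, and the function names are hypothetical.

```python
def gather_strict(table, indices):
    """Emulates the documented CPU behaviour: reject out-of-range indices."""
    for pos, idx in enumerate(indices):
        if not 0 <= idx < len(table):
            raise IndexError(
                "indices[%d] = %d is not in [0, %d)" % (pos, idx, len(table)))
    return [table[idx] for idx in indices]


def gather_zero_fill(table, indices):
    """Emulates the documented GPU behaviour: out-of-range rows become zeros."""
    width = len(table[0])
    return [list(table[idx]) if 0 <= idx < len(table) else [0.0] * width
            for idx in indices]


table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # toy embedding table, 3 rows
print(gather_zero_fill(table, [0, 3]))  # [[0.1, 0.2], [0.0, 0.0]]
```

Calling gather_strict(table, [0, 3]) raises an IndexError with a message of the same shape as the one in the trace, which is why a run that "works" on GPU can still be feeding invalid token ids to the model.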

Here is my PC info:


OS Name:                   Microsoft Windows 10 Enterprise
OS Version:                10.0.17763 N/A Build 17763
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Member Workstation
OS Build Type:             Multiprocessor Free
System Manufacturer:       LENOVO
System Model:              10T7004KRU
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 158 Stepping 10 GenuineIntel ~1704 Mhz
BIOS Version:              LENOVO M1UKT28A, 18.02.2019
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume2
Total Physical Memory:     8 059 MB
Available Physical Memory: 3 743 MB
Virtual Memory: Max Size:  10 098 MB
Virtual Memory: Available: 2 289 MB
Virtual Memory: In Use:    7 809 MB
Hotfix(s):                 6 Hotfix(s) Installed.
                           [01]: KB4483452
                           [02]: KB4470788
                           [03]: KB4489907
                           [04]: KB4497932
                           [05]: KB4512937
                           [06]: KB4511553
Network Card(s):           4 NIC(s) Installed.
                           [01]: Intel(R) Dual Band Wireless-AC 8265
                                 Connection Name: Wi-Fi
                                 Status:          Media disconnected
                           [02]: Intel(R) Ethernet Connection (7) I219-V
                                 Connection Name: Ethernet
                                 IP address(es)
                           [03]: Array Networks SSL VPN Adapter
                                 Connection Name: Ethernet 2
                                 Status:          Hardware not present
                           [04]: VirtualBox Host-Only Ethernet Adapter
                                 Connection Name: VirtualBox Host-Only Network
Hyper-V Requirements:      VM Monitor Mode Extensions: Yes
                           Virtualization Enabled In Firmware: Yes
                           Second Level Address Translation: Yes
                           Data Execution Prevention Available: Yes

Could you please help?

johny-smith commented 4 years ago

Hi, I'm having a similar problem. Did you find out what the problem was?

lyriccoder commented 4 years ago

Hi @johny-smith. Unfortunately, no. The problem happens only when I use the CPU. It seems there is a bug in the CPU version, maybe even in TensorFlow itself. I had to buy a GPU, and it works fine there, so that was my workaround.