google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Predicting Movie Reviews with BERT - IndexError: tuple index out of range #456

Closed by loretoparisi 5 years ago

loretoparisi commented 5 years ago

I'm running the notebook "Predicting Movie Reviews with BERT". It works fine, but as soon as I predict a single sentence I get the error below. I think it is raised by tf.estimator.Estimator, in the call predictions = estimator.predict(predict_input_fn), but I'm not sure whether the problem is in predict_input_fn.

def getPrediction(in_sentences):
  labels = ["Negative", "Positive"]
  # guid="" and label=0 are dummy values; the true label is unknown at prediction time
  input_examples = [run_classifier.InputExample(guid="", text_a = x, text_b = None, label = 0) for x in in_sentences]
  input_features = run_classifier.convert_examples_to_features(input_examples, label_list, MAX_SEQ_LENGTH, tokenizer)
  predict_input_fn = run_classifier.input_fn_builder(features=input_features, seq_length=MAX_SEQ_LENGTH, is_training=False, drop_remainder=False)
  # this call appears in the traceback below (line 9 of the cell)
  predictions = estimator.predict(predict_input_fn)
  return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

My input was:

pred_sentences = [
  "I love to eat sea food"
]

Stacktrace:

INFO:tensorflow:Writing example 0 of 1
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: 
INFO:tensorflow:tokens: [CLS] i love to eat sea food [SEP]
INFO:tensorflow:input_ids: 101 1045 2293 2000 4521 2712 2833 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:label: 0 (id = 0)
input_examples:1 input_features:1
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Saver not created because there are no variables in the graph to restore
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from OUTPUT_DIR_NAME/model.ckpt-468
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-67-770bf0871d3e> in <module>()
----> 1 predictions = getPrediction(pred_sentences)

<ipython-input-61-2a3bf4ef91b5> in getPrediction(in_sentences)
      8 
      9   predictions = estimator.predict(predict_input_fn)
---> 10   return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

<ipython-input-61-2a3bf4ef91b5> in <listcomp>(.0)
      8 
      9   predictions = estimator.predict(predict_input_fn)
---> 10   return [(sentence, prediction['probabilities'], labels[prediction['labels']]) for sentence, prediction in zip(in_sentences, predictions)]

/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py in predict(self, input_fn, predict_keys, hooks, checkpoint_path, yield_single_examples)
    634                 yield pred
    635             else:
--> 636               for i in range(self._extract_batch_length(preds_evaluated)):
    637                 yield {
    638                     key: value[i]

/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py in _extract_batch_length(self, preds_evaluated)
    997     for key, value in six.iteritems(preds_evaluated):
    998       batch_length = batch_length or value.shape[0]
--> 999       if value.shape[0] != batch_length:
   1000         raise ValueError('Batch length of predictions should be same. %s has '
   1001                          'different batch length than others.' % key)

IndexError: tuple index out of range

miaosenwang commented 5 years ago

You can try: predictions = estimator.predict(input_fn=predict_input_fn, yield_single_examples=False)
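
For readers hitting the same error, one possible explanation can be sketched in plain Python. The sketch assumes, and the thread does not confirm this, that one of the prediction arrays comes back 0-dimensional when a single sentence is passed (for example if the model function squeezes the label tensor). Shapes are modelled as plain tuples, and `extract_batch_length` is a simplified stand-in for the check shown in the traceback, not the TensorFlow code itself.

```python
def extract_batch_length(pred_shapes):
    """Simplified stand-in for Estimator._extract_batch_length.

    pred_shapes maps each prediction key to the shape tuple of its array.
    """
    batch_length = None
    for key, shape in pred_shapes.items():
        # Once batch_length is set, `or` short-circuits and shape[0] is skipped here.
        batch_length = batch_length or shape[0]
        # For a 0-dimensional array, shape is () and shape[0] raises IndexError.
        if shape[0] != batch_length:
            raise ValueError("%s has a different batch length than others." % key)
    return batch_length


# Several sentences: every array keeps its leading batch dimension.
print(extract_batch_length({"probabilities": (3, 2), "labels": (3,)}))

# One sentence, with 'labels' assumed to be a scalar (shape ()).
try:
    extract_batch_length({"probabilities": (1, 2), "labels": ()})
except IndexError as err:
    print("IndexError:", err)
```

With yield_single_examples=False, predict takes the `yield pred` branch visible at line 634 of the traceback, so each batch is returned whole and the per-example length check is never reached.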

loretoparisi commented 5 years ago

@miaosenwang Thanks a lot, I will try it.

mjspeck commented 5 years ago

@loretoparisi did this resolve the problem? I'm having the same issue.

loretoparisi commented 5 years ago

@mjspeck let me have a look again!

loretoparisi commented 5 years ago

@mjspeck @miaosenwang Yes, it works now! Here is the full working code, with the new function for single prediction: https://github.com/loretoparisi/bert-movie-reviews-sentiment-classifier
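
With yield_single_examples=False, estimator.predict yields one dictionary per batch rather than one per example, so the results have to be paired with the input sentences by hand. The following is a minimal sketch of that pairing, not the code from the linked repository: `flatten_batches` is a hypothetical helper, plain lists stand in for NumPy arrays, and the probability values are made up for illustration.

```python
def flatten_batches(sentences, batches, label_names):
    """Pair each sentence with its probabilities and predicted label name.

    batches is an iterable of dicts with 'probabilities' (one row per example)
    and 'labels' (a list of ints, or a bare int for a batch of one).
    """
    pairs = []
    for batch in batches:
        probs = batch["probabilities"]
        label_ids = batch["labels"]
        # Assumption: a single-example batch may carry a scalar label.
        if isinstance(label_ids, int):
            label_ids = [label_ids]
        pairs.extend(zip(probs, label_ids))
    return [(s, p, label_names[i]) for s, (p, i) in zip(sentences, pairs)]


# Made-up batch for one sentence; real values would come from estimator.predict.
single = flatten_batches(
    ["I love to eat sea food"],
    [{"probabilities": [[-1.7, -0.2]], "labels": 1}],
    ["Negative", "Positive"],
)
print(single)
```

With real NumPy outputs the scalar check would not match (a 0-dimensional array is not an `int`); `numpy.atleast_1d` is one way to normalise the labels in that case.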

mjspeck commented 5 years ago

@loretoparisi thank you!