ValueError: max() arg is an empty sequence

isaacmg commented 5 years ago

When training, I'm getting the following error. The code seems to train well for a while but then hits this error about an hour in.

  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/run.py", line 21, in <module>
    run()
  File "/usr/local/lib/python3.6/dist-packages/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/usr/local/lib/python3.6/dist-packages/allennlp/commands/__init__.py", line 102, in main
    args.func(args)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/commands/train.py", line 116, in train_model_from_args
    args.cache_prefix)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/commands/train.py", line 160, in train_model_from_file
    cache_directory, cache_prefix)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/commands/train.py", line 243, in train_model
    metrics = trainer.train()
  File "/usr/local/lib/python3.6/dist-packages/allennlp/training/trainer.py", line 493, in train
    val_loss, num_batches = self._validation_loss()
  File "/usr/local/lib/python3.6/dist-packages/allennlp/training/trainer.py", line 430, in _validation_loss
    loss = self.batch_loss(batch_group, for_training=False)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/training/trainer.py", line 263, in batch_loss
    output_dict = self.model(**batch)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/content/MultiQA/models/multiqa_bert.py", line 195, in forward
    f1_score = squad_eval.metric_max_over_ground_truths(squad_eval.f1_score, best_span_string, gold_answer_texts)
  File "/usr/local/lib/python3.6/dist-packages/allennlp/tools/squad_eval.py", line 52, in metric_max_over_ground_truths
    return max(scores_for_ground_truths)
ValueError: max() arg is an empty sequence

I'm not sure if it is related to my dataset specifically or is a more general bug.

alontalmor commented 5 years ago

Hi,

It seems that either the model is not predicting anything, or your data does not contain a clear gold_answer_texts (list of gold answers) for this example. Can you please specify which data you are working on?
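To illustrate the case where an example has no gold answers, here is a minimal sketch of a guard around the max-over-ground-truths step. It is not the AllenNLP or MultiQA code: `safe_metric_max` and `exact_match` are hypothetical names made up for this example, and returning 0.0 for an example without gold answers is an assumption about what you would want to report.

```python
from typing import Callable, Sequence


def exact_match(prediction: str, ground_truth: str) -> float:
    """Toy metric for illustration only (hypothetical, not the SQuAD F1)."""
    return float(prediction.strip().lower() == ground_truth.strip().lower())


def safe_metric_max(
    metric_fn: Callable[[str, str], float],
    prediction: str,
    ground_truths: Sequence[str],
    default: float = 0.0,
) -> float:
    """Best score over all gold answers, or `default` if there are none.

    A bare max() over an empty list raises
    "ValueError: max() arg is an empty sequence"; passing `default`
    to the built-in max() avoids that.
    """
    scores = [metric_fn(prediction, truth) for truth in ground_truths]
    return max(scores, default=default)
```

With this guard, an example whose gold answer list is empty scores `default` instead of aborting validation. It hides the data problem rather than fixing it, so it is probably worth logging those examples as well.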

Thanks, Alon

isaacmg commented 5 years ago

Yes, I'm working on emrQA (a custom dataset). I believe the error might be related to my preprocessing of the dataset. I wrapped that part of the code in a try/except block, and it seems to fail only a few times. However, I'm trying to find a way to remove anything without a gold_answer_texts during preprocessing.
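One possible way to do that filtering is sketched below. It assumes the preprocessed data is a SQuAD-style dictionary (`data` -> `paragraphs` -> `qas` -> `answers`, each answer carrying a `text` field); that layout is an assumption, since the actual emrQA preprocessing output is not shown here. `drop_unanswered` is a hypothetical helper name.

```python
from typing import Any, Dict, Tuple


def drop_unanswered(dataset: Dict[str, Any]) -> Tuple[Dict[str, Any], int]:
    """Return a copy of a SQuAD-style dataset without unanswered questions.

    A question is dropped when it has no answers or when every answer
    text is empty or whitespace. The number of dropped questions is
    returned as well, so the size of the problem is visible.
    """
    dropped = 0
    cleaned_articles = []
    for article in dataset["data"]:
        cleaned_paragraphs = []
        for paragraph in article["paragraphs"]:
            kept = []
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                if any((a.get("text") or "").strip() for a in answers):
                    kept.append(qa)
                else:
                    dropped += 1
            # Skip paragraphs that are left with no questions at all.
            if kept:
                cleaned_paragraphs.append({**paragraph, "qas": kept})
        cleaned_articles.append({**article, "paragraphs": cleaned_paragraphs})
    return {**dataset, "data": cleaned_articles}, dropped
```

Running this once over the training and validation files before training, and checking the dropped count, should show whether only a handful of examples are affected or whether the preprocessing itself is losing answers.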

alontalmor / MultiQA

ValueError: max() arg is an empty sequence #6