kentonl / e2e-coref

End-to-end Neural Coreference Resolution
Apache License 2.0

ValueError: max() arg is an empty sequence #56

Open danyaljj opened 5 years ago

danyaljj commented 5 years ago

Here is the error I am getting:

(env3.6) khashab2@dickens:~/ideaProjects/e2e-coref$ python demo.py final

[nltk_data] Downloading package punkt to /home/khashab2/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Setting CUDA_VISIBLE_DEVICES to: 
Running experiment: final
max_top_antecedents = 50
max_training_sentences = 50
top_span_ratio = 0.4
filter_widths = [
  3
  4
  5
]
filter_size = 50
char_embedding_size = 8
char_vocab_path = "char_vocab.english.txt"
context_embeddings {
  path = "glove.840B.300d.txt"
  size = 300
}
head_embeddings {
  path = "glove_50_300_2.txt"
  size = 300
}
contextualization_size = 200
contextualization_layers = 3
ffnn_size = 150
ffnn_depth = 2
feature_size = 20
max_span_width = 30
use_metadata = true
use_features = true
model_heads = true
coref_depth = 2
lm_layers = 3
lm_size = 1024
coarse_to_fine = true
max_gradient_norm = 5.0
lstm_dropout_rate = 0.4
lexical_dropout_rate = 0.5
dropout_rate = 0.2
optimizer = "adam"
learning_rate = 0.001
decay_rate = 0.999
decay_frequency = 100
train_path = "train.english.jsonlines"
eval_path = "test.english.jsonlines"
conll_eval_path = "test.english.v4_gold_conll"
lm_path = ""
genres = [
  "bc"
  "bn"
  "mz"
  "nw"
  "pt"
  "tc"
  "wb"
]
eval_frequency = 5000
report_frequency = 100
log_root = "logs"
log_dir = "logs/final"
Loading word embeddings from glove.840B.300d.txt...
Done loading word embeddings.
Loading word embeddings from glove_50_300_2.txt...
Done loading word embeddings.
2019-05-05 13:26:44.945108: W tensorflow/core/graph/graph_constructor.cc:1265] Importing a graph with a lower producer version 26 into an existing graph with producer version 27. Shape inference will have run different parts of the graph with different producer versions.
/home/khashab2/ideaProjects/e2e-coref/env3.6/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2019-05-05 13:26:56.805005: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Restoring from logs/final/model.max.ckpt
Document text: Traceback (most recent call last):
  File "demo.py", line 50, in <module>
    print_predictions(make_predictions(text, model))
  File "demo.py", line 32, in make_predictions
    tensorized_example = model.tensorize_example(example, is_training=False)
  File "/home/khashab2/ideaProjects/e2e-coref/coref_model.py", line 138, in tensorize_example
    max_sentence_length = max(len(s) for s in sentences)
ValueError: max() arg is an empty sequence
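For reference, the last frame of the traceback can be reproduced without the model. This is a minimal sketch, assuming `sentences` ends up as an empty list when the "Document text:" prompt receives no text; the helper names are hypothetical and are not part of `coref_model.py`. The `default=` argument of `max()` is standard Python, but whether returning 0 here is the right fix for the model is not confirmed by this thread.

```python
def max_sentence_length(sentences):
    # Same expression as the failing line shown in the traceback.
    return max(len(s) for s in sentences)


def safe_max_sentence_length(sentences):
    # Hypothetical guard: max() accepts default= for empty iterables.
    return max((len(s) for s in sentences), default=0)


sentences = []  # assumption: an empty input yields no sentences

try:
    max_sentence_length(sentences)
except ValueError as err:
    print("reproduced:", err)

print("guarded:", safe_max_sentence_length(sentences))
print("normal:", safe_max_sentence_length([["Hello", "world", "."]]))
```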

The specs of the environment:

SaltyHash123 commented 5 years ago

I have the same issue. It seems the demo does not accept line breaks: whenever I paste text that contains line breaks (or just submit an empty input), this error occurs. To get results, you could preprocess your text by removing the line breaks by hand or with a tool (e.g. sed).

Edit: Actually, it does support line breaks. When text is pasted into the terminal with Ctrl+Shift+V, each block of text is treated as a separate input, and a line break on its own is treated as an empty input.
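One way to work around this is to clean the input before it reaches the model: collapse line breaks into spaces and skip anything that ends up empty. The sketch below is only an illustration under those assumptions; `normalize_document` and `non_empty_documents` are hypothetical names and are not functions from `demo.py`.

```python
def normalize_document(raw):
    """Collapse line breaks and repeated whitespace into single spaces."""
    return " ".join(raw.split())


def non_empty_documents(chunks):
    """Yield cleaned inputs, skipping those that are empty after cleanup."""
    for chunk in chunks:
        text = normalize_document(chunk)
        if text:
            yield text


# Simulates a paste that the terminal splits into several inputs,
# one of which is an empty line.
pasted = ["First paragraph.", "", "Second\nparagraph."]
print(list(non_empty_documents(pasted)))
# → ['First paragraph.', 'Second paragraph.']
```

In a loop that reads from the prompt, the same check would mean asking again when the cleaned text is empty rather than passing it on to the prediction step.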