clarkkev / deep-coref


Exception when training my own models #4

Closed jplu closed 7 years ago

jplu commented 7 years ago

Hello,

I'm trying to reproduce the training and unfortunately I'm running into an exception during the process. I extracted the features to JSON with the NeuralCorefDataExporter and then ran the Python code with python run_all.py. After a while I get the following exception:

Loading data
Traceback (most recent call last):
  File "run_all.py", line 93, in <module>
    train_best_model()
  File "run_all.py", line 88, in train_best_model
    train_and_test_pairwise(model_properties.MentionRankingProps(), mode='reward_rescaling')
  File "run_all.py", line 68, in train_and_test_pairwise
    train_pairwise(model_props, mode=mode)
  File "run_all.py", line 59, in train_pairwise
    pretrain(model_props)
  File "run_all.py", line 33, in pretrain
    pairwise_learning.train(model_props, n_epochs=150)
  File "/opt/deep-coref/pairwise_learning.py", line 313, in train
    model_props, with_ids=True)
  File "/opt/deep-coref/datasets.py", line 309, in __init__
    for ana in range(0, me - ms)])
ValueError: need at least one array to concatenate
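
For what it's worth, this is the message NumPy produces whenever concatenate is given an empty sequence, and a quick standalone check reproduces it:

import numpy as np

# Concatenating an empty list of arrays raises the same ValueError
# as the one at the bottom of the traceback above.
np.concatenate([])  # ValueError: need at least one array to concatenate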

After checking the JSON for train, dev, and test, it appears they contain documents with empty features, like:

{"mentions":{},"labels":{},"pair_feature_names":["same-speaker","antecedent-is-mention-speaker","mention-is-antecedent-speaker","relaxed-head-match","exact-string-match","relaxed-string-match"],"pair_features":{},"document_features":{"type":1,"source":"wb"}}

I suppose this is normal: if a document contains no coreference, there is nothing to extract. So I checked the code in datasets.py and printed the content of doc_mentions, and indeed the value me - ms can be 0, as the content looks like:

[[0   117]
 [117   305]
 [305  522]
 ...,
 [71818 71818]
 [71818 71818]
 [71818 71818]]
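
In case it helps, the kind of guard I had in mind around that concatenation is sketched below. This is only an illustration of the pattern, not the actual datasets.py code, and the names are invented:

import numpy as np

def concat_anaphor_rows(ms, me, build_row):
    # datasets.py builds a list over range(0, me - ms) and concatenates it;
    # when ms == me (a document with no mentions) the list is empty and
    # np.concatenate fails, so such documents have to be skipped.
    rows = [build_row(ana) for ana in range(0, me - ms)]
    if not rows:
        return None  # caller can skip this document
    return np.concatenate(rows)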

I have certainly done something wrong in my process, but I don't see what. Any help would be appreciated.

Thanks!

clarkkev commented 7 years ago

Feel free to email me if you're still running into this issue!