kentonl / e2e-coref

End-to-end Neural Coreference Resolution
Apache License 2.0

InvalidArgumentError while loading pre-trained model #55

Closed dungtn closed 5 years ago

dungtn commented 5 years ago

Hi @kentonl,

Thank you for the great paper 🙏😊 I'm trying to run the pre-trained model by following the batched prediction instructions, and I get an InvalidArgumentError when running python predict.py final mydata.jsonlines output.jsonlines:

InvalidArgumentError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error: Assign requires shapes of both tensors to match. lhs shape= [1,8] rhs shape= [115,8] [[node save_2/Assign_2 (defined at .../e2e-coref/coref_model.py:90) = Assign[T=DT_FLOAT, _class=["loc:@char_embeddings"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](char_embeddings, save_2/RestoreV2/_9)]] [[{{node save_2/RestoreV2/_42}} = _Send[T=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_43_save_2/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]save_2/RestoreV2:19]]

I found the same problem reported in issue #10, but it isn't clear to me how it was resolved. Could you help point out what I'm missing here?

Thank you!

dungtn commented 5 years ago

The tensor shape mismatch was caused by an incorrectly generated character vocabulary in char_vocab.english.txt. This file is generated by get_char_vocab.py, which is run from setup_training.sh. That script doesn't work with Python 3, so you either need to fix it to work with Python 3 or run that step with Python 2. The model itself worked with Python 3, though.
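For anyone who prefers to regenerate the vocabulary under Python 3, here is a minimal sketch of a character vocabulary builder that runs on both Python 2 and 3. It is not the repository's get_char_vocab.py. The function names collect_chars and write_char_vocab are hypothetical, and it assumes each line of the input .jsonlines files is a JSON object with a "sentences" field holding lists of token strings. Files are opened through io.open with an explicit UTF-8 encoding, so text is written as text and never as encoded bytes.

```python
import io
import json
import sys


def collect_chars(jsonlines_paths):
    """Return the sorted list of distinct characters found in all tokens.

    Assumption: every non-empty line is a JSON object whose "sentences"
    field is a list of sentences, each a list of token strings.
    """
    chars = set()
    for path in jsonlines_paths:
        with io.open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                for sentence in json.loads(line)["sentences"]:
                    for token in sentence:
                        chars.update(token)  # adds each character of the token
    return sorted(chars)


def write_char_vocab(chars, output_path):
    """Write one character per line as UTF-8 text."""
    with io.open(output_path, "w", encoding="utf-8") as f:
        for ch in chars:
            f.write(u"{}\n".format(ch))


if __name__ == "__main__":
    # Hypothetical usage: python make_char_vocab.py OUTPUT INPUT [INPUT ...]
    output_path, input_paths = sys.argv[1], sys.argv[2:]
    vocab = collect_chars(input_paths)
    write_char_vocab(vocab, output_path)
    print("Wrote {} characters to {}".format(len(vocab), output_path))
```

The vocabulary has to be built from the same data the checkpoint was trained on. Otherwise the number of characters, and with it the shape of the char_embeddings variable, will still differ from the checkpoint.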

It would be best if the character vocabulary were also provided with the pre-trained model, so that the coreference model is easier to evaluate and use.
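Until then, a quick sanity check on the locally generated file can catch the problem before the checkpoint is restored. The sketch below counts the entries in the vocabulary file and compares the implied number of embedding rows with the 115 rows reported in the error message (rhs shape= [115,8]). The helper names are hypothetical. The reserved_slots=1 default is an assumption that the model prepends a single unknown-character entry to the vocabulary; adjust it if that does not hold for your setup.

```python
import io


def count_vocab_entries(path):
    """Count non-empty lines (one character per line) in the vocabulary file."""
    with io.open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip("\n"))


def check_vocab(path, checkpoint_rows, reserved_slots=1):
    """Compare the implied embedding row count with the checkpoint's.

    Assumption: the embedding table has one row per vocabulary entry plus
    `reserved_slots` extra rows (e.g. for an unknown character).
    Returns (matches, implied_rows).
    """
    implied_rows = count_vocab_entries(path) + reserved_slots
    return implied_rows == checkpoint_rows, implied_rows


if __name__ == "__main__":
    # 115 is the first dimension of the checkpoint tensor in the error above.
    ok, rows = check_vocab("char_vocab.english.txt", checkpoint_rows=115)
    print("implied rows: {}, matches checkpoint: {}".format(rows, ok))
```

Under that assumption, an empty vocabulary file implies exactly one row, which is consistent with the lhs shape= [1,8] in the traceback.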