This repository contains code introduced in the following papers:
Stay Together: A System for Single and Split-antecedent Anaphora Resolution
Juntao Yu, Nafise Moosavi, Silviu Paun and Massimo Poesio
In Proceedings of the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2021
A Cluster Ranking Model for Full Anaphora Resolution
Juntao Yu, Alexandra Uma and Massimo Poesio
In Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020
An early version of the code can also be used to replicate the results in the following paper:
A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma and Udo Kruschwitz
In Proceedings of the 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2019
To set up the environment, run:

```
pip install -r requirements.txt
./setup.sh
```

The setup script downloads the GloVe embeddings required by the system and compiles the TensorFlow custom kernels.

To run a trained model on your own data, modify the `test_path` and `conll_test_path` in the configuration accordingly.
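The exact layout of the configuration file is not shown here; as a hypothetical sketch, assuming a pyhocon-style `experiments.conf` as used by similar coreference systems (the file names below are illustrative, not part of the repository):

```
best {
  # Path to the jsonlines test file (illustrative location)
  test_path = ./data/test.jsonlines
  # Path to the matching CoNLL-format gold file used for scoring
  conll_test_path = ./data/test.conll
}
```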
The test file is in jsonlines format, one JSON document per line:

```
{
  "clusters": [[[0,0],[6,6],[12,12],[22,22]],[[2,3],[8,8]]], # for CoNLL-style coreference
  "clusters": [[[0,0,-1],[6,6,-1],[12,12,-1],[22,22,-1]],[[2,3,-1],[8,8,-1]],[[15,15,1]],[[20,20,-1]],[[24,24,-1]]], # for CRAC-style full anaphora resolution
  "doc_key": "nw",
  "sentences": [["John", "has", "a", "car", "."], ["Did", "He", "washed", "it", "yesteday", "?"], ["Yes", "he", "did", ",", "it", "was", "sunny", "yesteday", "when", "I", "saw", "him", ",", "we", "even", "had", "a", "good", "chart", "."]],
  "speakers": [["sp1", "sp1", "sp1", "sp1", "sp1"], ["sp1", "sp1", "sp1", "sp1", "sp1", "sp1"], ["sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2", "sp2"]],
  "split_antecedents": [[[24,24],[0,0]],[[24,24],[20,20]]] # for CRAC-style full anaphora resolution with split antecedents enabled (optional)
}
```

Only one of the two `clusters` lines should be present, depending on the task.
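Each span `[start, end]` indexes tokens of the flattened document (0-based, end-inclusive, counted across sentence boundaries), and in the CRAC-style format the third element is -1 for referring mentions or a non-referring type id otherwise. A minimal sketch of resolving the spans above back to text (the helper below is illustrative, not part of the repository):

```python
import itertools

# Example document from above, flattened across sentence boundaries.
sentences = [
    ["John", "has", "a", "car", "."],
    ["Did", "He", "washed", "it", "yesteday", "?"],
    ["Yes", "he", "did", ",", "it", "was", "sunny", "yesteday", "when",
     "I", "saw", "him", ",", "we", "even", "had", "a", "good", "chart", "."],
]
tokens = list(itertools.chain.from_iterable(sentences))

def span_text(span):
    """Resolve a [start, end] span (end-inclusive) to its surface string."""
    start, end = span[0], span[1]
    return " ".join(tokens[start:end + 1])

# CRAC-style clusters: third element is -1 (referring) or a type id.
clusters = [[[0, 0, -1], [6, 6, -1], [12, 12, -1], [22, 22, -1]],
            [[2, 3, -1], [8, 8, -1]],
            [[15, 15, 1]]]

for cluster in clusters:
    print([(span_text(m), "non-referring" if m[2] != -1 else "referring")
           for m in cluster])
```

For instance, the first cluster resolves to "John", "He", "he", and "him", and the split antecedents of `[24,24]` ("we") are `[0,0]` ("John") and `[20,20]` ("I").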
The non-referring types are numbered starting from 1 and up to n_types - 1; in the case that a mention is a referring mention, put -1 instead.

To evaluate:

- Run `extract_bert_features.sh` to compute the BERT embeddings for the test set.
- Run `python evaluate.py config_name` to start your evaluation.
- Run `python lea_naacl2021.py key_json_file sys_json_file split_antecedent_importance` (the last argument is optional) to get the scores.

To train your own model:

- Run `python get_char_vocab.py train.jsonlines dev.jsonlines` to build the character vocabulary.
- Run `extract_bert_features.sh` to compute the BERT embeddings for both the training and development sets.
- Run `python train.py config_name` to start training.
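The character-vocabulary step (`get_char_vocab.py`) can be sketched as follows; this is a minimal re-implementation for illustration under the jsonlines format above, not the repository's actual script:

```python
import json

def build_char_vocab(jsonlines_paths):
    """Collect the sorted set of characters appearing in the tokens of the
    given .jsonlines files (one JSON document per line)."""
    chars = set()
    for path in jsonlines_paths:
        with open(path) as f:
            for line in f:
                doc = json.loads(line)
                for sentence in doc["sentences"]:
                    for token in sentence:
                        chars.update(token)
    return sorted(chars)

# Typical usage: write one character per line, e.g.
# with open("char_vocab.txt", "w") as out:
#     out.write("\n".join(build_char_vocab(["train.jsonlines", "dev.jsonlines"])))
```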
The cluster ranking model takes about 16 hours to train (200k steps) on a GTX 1080Ti GPU.
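For reference, the LEA (link-based entity-aware) score computed by `lea_naacl2021.py` can be sketched as below. This is a simplified version of plain LEA over clusters given as sets of mentions: the repository's scorer additionally weights split-antecedent links via `split_antecedent_importance`, and singleton handling is omitted here.

```python
def _links(entity):
    """Number of coreference links (mention pairs) in an entity."""
    n = len(entity)
    return n * (n - 1) // 2

def _resolution(entity, others):
    """Fraction of this entity's links recovered by the `others` partition."""
    if _links(entity) == 0:
        return 0.0  # singleton handling omitted in this sketch
    found = sum(_links(entity & other) for other in others)
    return found / _links(entity)

def lea(key, response):
    """Plain LEA recall, precision, and F1 (simplified sketch)."""
    def score(gold, sys):
        num = sum(len(e) * _resolution(e, sys) for e in gold)
        den = sum(len(e) for e in gold)
        return num / den if den else 0.0
    recall = score(key, response)
    precision = score(response, key)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return recall, precision, f1
```

Mentions can be any hashable items, e.g. `(start, end)` span tuples, so an entity like `{(0, 0), (6, 6), (12, 12), (22, 22)}` matches the jsonlines format above.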