to employ BERT's capacity for passage-level understanding.
The model achieved SOTA on the GAP and OntoNotes benchmarks. The qualitative analysis showed that (1) handling pronouns in conversations and (2) mention paraphrasing are still difficult for the model.
Authors
Mandar Joshi, Omer Levy, Daniel S. Weld, and Luke Zettlemoyer
(University of Washington, AI2, FAIR)
Motivation
BERT's major improvement is passage-level training, which allows it to better model longer sequences
Can we apply it to the CR task?
Method
Proposed a BERT-based CR method.
Two ways of extending c2f-coref, an ELMo-based CR model, with BERT (sketched below):
The independent variant uses non-overlapping segments, each of which acts as an independent instance for BERT
The overlap variant splits the document into overlapping segments so as to provide the model with context beyond 512 tokens
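A minimal sketch of the two segmentation schemes (an illustrative reconstruction, not the authors' released code; the function name and parameters are assumptions):

    def segment_document(tokens, max_len=512, overlap=0):
        """Split a tokenized document into BERT-sized segments.

        overlap=0 mimics the independent variant: non-overlapping segments,
        each encoded as a separate BERT instance. overlap>0 mimics the
        overlap variant: consecutive segments share `overlap` tokens, so the
        model sees context beyond a single 512-token window.
        """
        stride = max_len - overlap
        segments = []
        for start in range(0, len(tokens), stride):
            segments.append(tokens[start:start + max_len])
            if start + max_len >= len(tokens):
                break
        return segments

    # Toy usage with whitespace tokens; real inputs would use BERT's WordPiece tokenizer.
    doc = ("Prince Charles and his wife Camilla visited the prison in London . " * 100).split()
    independent_segments = segment_document(doc, max_len=512, overlap=0)    # disjoint segments
    overlapping_segments = segment_document(doc, max_len=512, overlap=256)  # 256-token overlap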
Results / Insight
Dataset
GAP: a human-labeled dataset of pronoun-name pairs from Wikipedia snippets (see the illustrative example below)
OntoNotes 5.0: a document-level dataset from the CoNLL-2012 shared task
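For intuition, each GAP item pairs an ambiguous pronoun with two candidate names and binary coreference labels. A minimal illustrative container (the field names and the toy example are assumptions, not the dataset's official schema):

    from dataclasses import dataclass

    @dataclass
    class GapExample:
        # Hypothetical container for one GAP-style pronoun-name pair;
        # field names are illustrative, not the official column names.
        text: str            # Wikipedia snippet
        pronoun: str         # ambiguous pronoun in the snippet
        pronoun_offset: int  # character offset of the pronoun in `text`
        name_a: str          # first candidate name
        name_b: str          # second candidate name
        a_is_coref: bool     # whether the pronoun refers to name A
        b_is_coref: bool     # whether the pronoun refers to name B

    # Toy example; labels are chosen for illustration only.
    example = GapExample(
        text="Kathleen first appears when Theresa visits her in a prison in London.",
        pronoun="her",
        pronoun_offset=43,
        name_a="Kathleen",
        name_b="Theresa",
        a_is_coref=True,
        b_is_coref=False,
    )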
Results
Achieved SOTA on the GAP and OntoNotes benchmarks
with +6.2 F1 on GAP (baseline: BERT+RR) and +0.3 F1 on OntoNotes (baseline: EE)
The overlap variant offers no improvement over the independent variant
Insight
Unable to handle conversations: Modeling pronouns, especially in the context of conversations (Table 3 in the paper), continues to be difficult for all models, perhaps partly because c2f-coref does very little to model the dialog structure of the document.
Importance of entity information: The models are unable to resolve cases requiring mention paraphrasing.
E.g., bridging "the Royals" with "Prince Charles and his wife Camilla" likely requires pretraining models to encode relations between entities
A. "Recent work (Joshi et al., 2019) suggests that BERT’s inability to use longer sequences effectively is likely a by-product pretraining on short sequences for a vast majority of updates."