sm354 closed this issue 2 years ago
You should be getting ~75 F1. Are you sure you are using the right tokenization?
The transfer-en model is trained with XLM-R tokenization, while the English datasets and models (e.g. QBCoref, OntoNotes, ARRAU) use English BERT tokenization by default. To evaluate the transfer-en model on English documents, those documents first need to be tokenized following XLM-R tokenization. This should be doable with the minimize.py script by switching the tokenization model from bert to xlmr.
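To make the switch concrete, here is a minimal sketch of what the change amounts to. The helper and the mapping below are illustrative assumptions, not minimize.py's actual interface; the underlying point is that the transfer-en model's embeddings are tied to XLM-R's SentencePiece vocabulary, so documents must be sub-tokenized with the matching tokenizer before evaluation.

```python
# Illustrative sketch only: minimize.py's actual flag/function names may differ.
TOKENIZER_NAMES = {
    "bert": "bert-base-cased",   # default for the English datasets
    "xlmr": "xlm-roberta-base",  # what the transfer-en model expects
}

def tokenizer_name(model: str) -> str:
    """Map the tokenization choice ('bert' or 'xlmr') to a pretrained
    tokenizer name (hypothetical helper, for illustration)."""
    try:
        return TOKENIZER_NAMES[model]
    except KeyError:
        raise ValueError(f"unknown tokenization model: {model!r}")

# The chosen name would then be loaded with, e.g., Hugging Face transformers:
# tokenizer = AutoTokenizer.from_pretrained(tokenizer_name("xlmr"))
```

Tokenizing the English documents with bert-base-cased but evaluating with the transfer-en model means the model sees subtoken ids from the wrong vocabulary, which is consistent with the large score drop reported below.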
Thanks for clarifying the tokenization. I was using the default English BERT tokenization. After making the changes in minimize.py as you described and switching to XLM-R tokenization, I get 75.35 F1.
We found that evaluating the given transfer-en model on OntoNotes (EN) gives the following scores (precision / recall / F1):

| metric | P | R | F1 |
|---|---|---|---|
| muc | 0.6939 | 0.2912 | 0.4103 |
| b_cubed | 0.5365 | 0.1779 | 0.2672 |
| ceafe | 0.4446 | 0.1548 | 0.2296 |
| em | 0.0393 | 0.0137 | 0.0203 |
| mentions | 0.7901 | 0.3187 | 0.4542 |
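Assuming the three numbers per metric are precision, recall, and F1 (recomputing F1 from P and R matches the reported values to rounding), the CoNLL-style average F1 (mean of the muc, b_cubed, and ceafe F1 scores) works out to roughly 0.30, which is the number to compare against spb_on_512:

```python
def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Precision/recall reported above for the three CoNLL metrics
scores = {
    "muc":     (0.6939, 0.2912),
    "b_cubed": (0.5365, 0.1779),
    "ceafe":   (0.4446, 0.1548),
}

for name, (p, r) in scores.items():
    print(f"{name}: F1 = {f1(p, r):.4f}")

# CoNLL-style average F1 over muc, b_cubed, ceafe (~0.30 here)
avg = sum(f1(p, r) for p, r in scores.values()) / len(scores)
print(f"avg F1 = {avg:.4f}")
```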
This seems quite low compared to spb_on_512 (79.4 avg. F1), which we have been able to reproduce. Could you please share the numbers you obtained, or any insight into these scores?