ad-freiburg / elevant

Entity linking evaluation and analysis tool
https://elevant.cs.uni-freiburg.de/
Apache License 2.0

Marking coref mentions in benchmark dataset #15

Closed by agolo-alan-hogue 10 months ago

agolo-alan-hogue commented 11 months ago

Hello,

I'd like to create a benchmark dataset for Elevant that includes coref mentions. I assume the coref mentions need to be labeled as such somehow so that coref can be turned on and off during evaluation. However, looking at the JSON schema, I do not see a field for that in the ground truth labels. I do see that for predictions, "coreference" is stored in the "recognized_by" field, but I don't see anything similar for the ground truth labels. How should I handle coref ground truth mentions in a benchmark dataset?

agolo-alan-hogue commented 11 months ago

I think I found where this is happening. So, I guess Elevant is attempting to infer what is and is not a coref mention from the form of the mention. If it recognizes the mention string as a pronoun, it's considered coref. If the mention starts with certain determiners, it is considered coref. I'm seeing this in src/evaluation/mention_type.py. Am I reading this correctly?
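
If so, the check I have in mind would be roughly this (my own simplified sketch based on reading the code, not the actual implementation in src/evaluation/mention_type.py):

```python
# Simplified sketch of the heuristic as I understand it (not the actual code
# in src/evaluation/mention_type.py): a mention counts as coref if it is a
# pronoun or starts with certain determiners.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them", "his", "hers", "its", "their"}
DETERMINERS = ("the ", "this ", "that ", "these ", "those ")

def looks_like_coref(mention: str) -> bool:
    lowered = mention.lower()
    return lowered in PRONOUNS or lowered.startswith(DETERMINERS)

print(looks_like_coref("its"))      # True  -> pronominal coref
print(looks_like_coref("The cat"))  # True  -> nominal coref (determiner prefix)
```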

The problem here is that our coref model doesn't typically include determiners in the mention, and therefore our gold annotations don't include them either. Nominal coref mentions look like "The [cat] wants [its] food." rather than "[The cat]". So, if I'm not mistaken, this means that in our new benchmark dataset none of the nominal coref mentions will be recognized as such, which will throw off our numbers.

More generally, I think this heuristic might not be perfect. For one thing, leaving out determiners is fairly common in certain annotation styles, and the heuristic is also specific to English.

Is there an easy way around this? Or could we just add an optional field that indicates that a mention is coref?

Thanks!

flackbash commented 10 months ago

Cool timing: I actually added support for this two weeks ago (a171e124c627f10f15d8defbf71d7ba23d3f5ff6), because I absolutely agree with you that inferring whether something is a coreference from the mention text alone is not completely reliable.

Right now, adding this property is only supported for the internal jsonl format used by ELEVANT (denoted as "ours" when using the -bformat argument of the add_benchmark.py script). But it should be pretty straightforward to make this an optional property in our simple-jsonl format as well. Would that help you?

Also, sorry for the late response, I was sick the entire last week.

agolo-alan-hogue commented 10 months ago

Yes, that would be very helpful as I am already using simple-jsonl for the benchmark dataset, thanks!

flackbash commented 10 months ago

You can now set the property "coref" for each of your labels in the simple-jsonl format.

Note that if you don't set this property, whether a mention is a coreference will still be inferred from the mention text. Only if you explicitly set it to true or false for a label will that value be used instead of the inference.
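
A line in your benchmark file could then look something like the sketch below (written out via Python here; the field names apart from "coref" are just placeholders, keep whatever fields your simple-jsonl benchmark already uses):

```python
import json

# Sketch of a benchmark article with the new optional "coref" property.
# Field names other than "coref" are placeholders; keep whatever fields
# your simple-jsonl benchmark already uses.
article = {
    "text": "The cat wants its food.",
    "labels": [
        # Determiner-less nominal mention, explicitly marked as coref,
        # so the text-based inference is skipped for this label.
        {"start_char": 4, "end_char": 7, "entity_reference": "Q146", "coref": True},
        # Pronominal mention with "coref" left out: for this label the
        # coref status is still inferred from the mention text.
        {"start_char": 14, "end_char": 17, "entity_reference": "Q146"},
    ],
}

with open("my_benchmark.jsonl", "a", encoding="utf-8") as benchmark_file:
    benchmark_file.write(json.dumps(article) + "\n")
```

You could also set "coref" to false explicitly for labels that should never be counted as coreference, e.g. to override the determiner-based inference.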

Note also that for evaluation cases without a corresponding ground truth label (i.e. a prediction that is a false positive and does not match any ground truth span), it is still inferred from the mention text whether the false positive is a coreference FP or an entity linking FP.

Let me know if this solves your issue or if you experience any problems with that.