UniversalDependencies / UD_English-GUM

SplitAnte with a single antecedent #82

Closed martinpopel closed 3 months ago

martinpopel commented 6 months ago

GUM_essay_sexlife-30 in the dev branch contains a line with SplitAnte=130<140. Running validate.py --lang en --level 2 --coref GUM_essay_sexlife.conllu shows:

[L6 Coref only-one-split-antecedent] SplitAnte statement '130<140' must specify at least two antecedents for entity '140'.

BTW: If you switch from GRP to eid (as we've discussed several times), you can benefit from the coreference validation.

amir-zeldes commented 6 months ago

Oh, yes, I see the problem and why our validator didn't pick it up - one of the two antecedents was accidentally also coref-ed to the anaphor, so in reality it was 130<130,140 which is of course impossible. I'll remove the redundant coref and that should fix it.
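For illustration, here is a minimal sketch of the two conditions discussed above: an anaphor needs at least two split antecedents, and an entity cannot be its own antecedent. It assumes the SplitAnte value is a comma-separated list of antecedent<anaphor pairs, as in the 130<140 example. The function name check_split_ante is hypothetical and is not part of validate.py or of the GUM validator.

```python
from collections import defaultdict


def check_split_ante(value):
    """Return a list of problems in a SplitAnte value such as '130<140'.

    Hypothetical helper for illustration only; assumes comma-separated
    'antecedent<anaphor' pairs.
    """
    antecedents = defaultdict(set)  # anaphor id -> set of antecedent ids
    problems = []
    for pair in value.split(","):
        ante, sep, anaphor = pair.partition("<")
        if not sep or not ante or not anaphor:
            problems.append(f"malformed pair '{pair}'")
            continue
        if ante == anaphor:
            # An entity listed as its own antecedent is not a real antecedent.
            problems.append(f"entity '{anaphor}' cannot be its own antecedent")
            continue
        antecedents[anaphor].add(ante)
    for anaphor, antes in antecedents.items():
        if len(antes) < 2:
            problems.append(f"entity '{anaphor}' has only one split antecedent")
    return problems


if __name__ == "__main__":
    print(check_split_ante("130<140"))          # the case reported in this issue
    print(check_split_ante("130<140,135<140"))  # two distinct antecedents
```

Under these assumptions, a check that counts a self-reference as an antecedent would see two antecedents and pass, which matches the explanation of why the problem went unnoticed.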

> BTW: If you switch from GRP to eid (as we've discussed several times), you can benefit from the coreference validation.

OK, remind me again what was the issue? Was it that you need unique EIDs per document but not so for GRP? If so then I remember, that was a deal-breaker because it prevents guaranteed validity on concatenation of documents. Why is it not possible to validate data that uses GRP?