gotutiyan closed this issue 8 months ago.
Heya! That's a good question. It looks like a case of ERRANT not merging something I would have hoped it would merge.
Specifically, it seems the human annotator wanted to add `, or something ,` (note the insertion of 2 commas) into the reference to make it make sense with the reference to a place later in the sentence. I would have hoped ERRANT would group this together as a single multi-word insertion edit, e.g.:
A 7 7|||M:OTHER|||, or something,|||REQUIRED|||-NONE-|||0
... but it instead chooses to split it into 4 separate insertion edits at the same place.
A 7 7|||M:PUNCT|||,|||REQUIRED|||-NONE-|||0
A 7 7|||M:CONJ|||or|||REQUIRED|||-NONE-|||0
A 7 7|||M:NOUN|||something|||REQUIRED|||-NONE-|||0
A 7 7|||M:PUNCT|||,|||REQUIRED|||-NONE-|||0
Since the hypothesis correctly inserted a comma after `somebody`, however, it looks as though it matches both the comma at the start of the phrase and the comma at the end of the phrase. The fix would thus be to make sure the reference edits are not split into several smaller edits, but sadly, I know I can't do that without negatively affecting other edit alignments.
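To make the collision concrete, here is a minimal sketch (not ERRANT's own code) that parses the four reference edit lines quoted above and counts how many share the same span and correction string. The helper name `edit_key` is hypothetical and used only for illustration.

```python
from collections import Counter

# The four reference edit lines quoted above (M2 format, fields separated by "|||").
ref_lines = [
    "A 7 7|||M:PUNCT|||,|||REQUIRED|||-NONE-|||0",
    "A 7 7|||M:CONJ|||or|||REQUIRED|||-NONE-|||0",
    "A 7 7|||M:NOUN|||something|||REQUIRED|||-NONE-|||0",
    "A 7 7|||M:PUNCT|||,|||REQUIRED|||-NONE-|||0",
]


def edit_key(line):
    """Return (start, end, correction) for one M2 edit line (hypothetical helper)."""
    fields = line[2:].split("|||")  # drop the leading "A "
    start, end = map(int, fields[0].split())
    return (start, end, fields[2])


# Count how often each (start, end, correction) key occurs in the reference.
key_counts = Counter(edit_key(line) for line in ref_lines)
duplicates = {key: n for key, n in key_counts.items() if n > 1}
print(duplicates)
```

Both comma insertions reduce to the same key, `(7, 7, ',')`, so a single comma in the hypothesis at that position lines up with two reference edits.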
I agree that it doesn't make sense for the number of TPs to exceed the number of edits, but this looks like a one-in-a-million kind of edge case to me. `errant.Annotator.annotate()` has actually output the correct number of edits from the reference, and if the hypothesis matched the reference, then it should be rewarded with 2 TPs for matching both commas.
In short - there's not a lot I can do about it other than say you found a needle in a haystack!
Thank you for your reply. Thanks to your kind explanation, I now understand that this behavior is reasonable.
> this looks like a one-in-a-million kind of edge case
I agree. We would almost never encounter such a case :joy:
Thanks again!
Hi, I have a question about duplicate corrections.

`errant_parallel` sometimes makes duplicate corrections, e.g.

(The above is line 612 of JFLEG-dev. The reference is the first annotation.)

In the above case, `errant_compare` shows

However, `hyp.m2` has only three corrections, so TP=4 is strange.

The reason for this is the duplicate corrections in the reference.
Actually, `ref.m2` has two lines of `A 7 7|||M:PUNCT|||,|||REQUIRED|||-NONE-|||0`. (I don't know why such duplication appears.)

During `errant_compare`, `coder_dict[coder][(7, 7, ',')]` has multiple values: `['M:PUNCT', 'M:PUNCT']`. This adds two points to the evaluation score because `ref_edits[h_edit]` has two values (in here).

Is this expected? Personally, I do not think it is desirable for the number of TPs to exceed the number of edits in a hypothesis. Possible solutions would be to:
- prevent `errant.Annotator.annotate()` from outputting duplicate corrections.
- make sure the `coder_dict` variable in `errant.commands.compare_m2.py` only has a single value (now it is a list).

Thank you for your development of ERRANT! (This is an aside, but I am developing an API-based errant_compare and noticed this problem because my results did not match the official results.)
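The counting behavior described above can be illustrated with a small sketch. This is not the code in `compare_m2.py`; it only assumes, as described in this post, that reference edits are stored in a dict mapping `(start, end, correction)` to a list of error types. The hypothesis contents and the function names `count_tp_per_type` and `count_tp_per_edit` are hypothetical.

```python
from collections import defaultdict

# Reference edits keyed by (start, end, correction). Each value is a list of
# error types, so a repeated edit shows up as a repeated list entry.
ref_edits = defaultdict(list)
for key, etype in [
    ((7, 7, ","), "M:PUNCT"),
    ((7, 7, "or"), "M:CONJ"),
    ((7, 7, "something"), "M:NOUN"),
    ((7, 7, ","), "M:PUNCT"),
]:
    ref_edits[key].append(etype)

# Hypothetical hypothesis containing a single comma insertion at the same span.
hyp_edits = {(7, 7, ","): ["M:PUNCT"]}


def count_tp_per_type(hyp, ref):
    """Count one TP per error-type entry of each matched reference edit."""
    return sum(len(ref[key]) for key in hyp if key in ref)


def count_tp_per_edit(hyp, ref):
    """Count at most one TP per matched hypothesis edit."""
    return sum(1 for key in hyp if key in ref)


print(count_tp_per_type(hyp_edits, ref_edits))  # 2
print(count_tp_per_edit(hyp_edits, ref_edits))  # 1
```

With list values, one hypothesis edit earns two TPs. Counting per matched key caps it at one, which corresponds roughly to the second proposed solution, though false-negative counting would need a matching adjustment.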