Closed npecheux closed 9 years ago
Hi there,
thanks for the feedback; Appraise originally did not allow import of identical translation outputs -- these are a by-product of the WMT13 data import.
I'm aware that this is a huge annoyance and will fix it soon.
Cheers, Chris
On Wed, Jun 5, 2013 at 12:28 PM, npecheux notifications@github.com wrote:
Would it be possible to filter out or group equal hypotheses? There is no interest in ranking equal outputs, and in fact I lose some time, when shown 5 hypotheses, finding the differences between two of them (many times they are equal; sometimes they differ by only one character). It would be easier if we knew all outputs to be different.
Reply to this email directly or view it on GitHub: https://github.com/cfedermann/Appraise/issues/31
Hi Chris. It's not really annoying, but it might be an improvement, e.g. for next year. It is, however, not so easy to deal with, except when all 5 hypotheses are equal, or by just indicating with some kind of symbol on the left when two hypotheses are entirely the same. To save some annotation time, it could be interesting to automatically give the same rank to all equal hypotheses, taking care not to mess up the evaluation statistics. Even more useful would be to highlight the parts of sentences that are equal (e.g. often the first part up to the comma is the same in 4/5 sentences), but this seems a bit challenging, with possible drawbacks, and might bias the evaluation.
Thanks for taking this issue into consideration so quickly!
Nico
PS: A really annoying thing, though Appraise cannot deal with it, is that some translations are the same except for some detail that is impossible to see (might be a wider space or something like this?). I'm not sure about it. To see whether hypotheses are the same, I check whether they are aligned (e.g. the line break happens at the same word). Sometimes they are not, but when I check character by character I don't see any difference. Could this be an Appraise presentation bug?
The latter issue you describe should be related to "non-printable" Unicode characters, i.e. wider spaces, etc.
This is now tracked in #45. Closing this issue.
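For reference, here is a rough sketch of how hypotheses that differ only by invisible characters could be detected and grouped before they are shown to an annotator. This is not Appraise code: the function names `normalize` and `group_identical` are made up for illustration, and the choice of NFKC normalization is an assumption. NFKC also folds ligatures and full-width forms, which may be more aggressive than an evaluation wants.

```python
import unicodedata
from collections import defaultdict
from typing import List


def normalize(text: str) -> str:
    """Return a comparison key that ignores invisible differences (hypothetical helper)."""
    # NFKC folds compatibility characters, e.g. NO-BREAK SPACE (U+00A0)
    # and EM SPACE (U+2003) both become a plain space.
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf), e.g. ZERO WIDTH SPACE (U+200B).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Collapse runs of whitespace into a single space and trim both ends.
    return " ".join(text.split())


def group_identical(hypotheses: List[str]) -> List[List[int]]:
    """Group hypothesis indices whose normalized text is the same."""
    groups = defaultdict(list)
    for index, hypothesis in enumerate(hypotheses):
        groups[normalize(hypothesis)].append(index)
    # Groups come back in order of first appearance.
    return list(groups.values())


if __name__ == "__main__":
    outputs = [
        "The cat sat.",
        "The\u00a0cat sat.",       # no-break space instead of a space
        "The cat\u200b sat.",      # zero width space after "cat"
        "A cat sat.",
    ]
    print(group_identical(outputs))
```

Each group with more than one index could then be shown once, or marked with a symbol, as suggested earlier in the thread. How the shared rank is recorded so that the evaluation statistics stay correct is left open here.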