cfedermann / Appraise

Appraise evaluation system for manual evaluation of machine translation output
http://www.appraise.cf/
BSD 3-Clause "New" or "Revised" License

Computing clusters with systems with equal output #55

Open tuetschek opened 7 years ago

tuetschek commented 7 years ago

How should the system ranking clusters be computed when systems often produce the same output and are therefore merged in the results CSV file? Is the scripts/compute_ranking_clusters.perl script the correct tool for this?

The script does not seem to handle merged systems in the results CSV file: an entry such as sysA+sysB is treated as a separate, new system. I have fixed this in this commit in my fork. Was that the correct thing to do, or is there a better way of getting the ranking clusters?

(Without this fix, the clustering script gets stuck in an infinite loop on my data, which consists of several variants of the same NLG system that often produce identical outputs.)
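To make the intended behaviour concrete, here is a minimal Python sketch of the idea, not the actual commit and not the Perl script itself. It assumes that a merged entry joins system names with "+" (as in sysA+sysB), that each judgement can be read as a mapping from label to rank, and that a lower rank is better. The function names `expand_merged` and `pairwise_outcomes` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def expand_merged(ranking, sep="+"):
    """Split merged labels such as 'sysA+sysB' into their member systems.

    `ranking` maps a label to its rank (assumption: lower is better).
    Every member of a merged label receives the rank of that label.
    """
    expanded = {}
    for label, rank in ranking.items():
        for system in label.split(sep):
            expanded[system] = rank
    return expanded


def pairwise_outcomes(ranking, sep="+"):
    """Return (system_a, system_b, relation) for every pair of systems.

    The relation is '<' if system_a is ranked better, '>' if worse,
    and '=' if both share a rank, e.g. because they were merged.
    """
    expanded = expand_merged(ranking, sep)
    outcomes = []
    for a, b in combinations(sorted(expanded), 2):
        if expanded[a] < expanded[b]:
            relation = "<"
        elif expanded[a] > expanded[b]:
            relation = ">"
        else:
            relation = "="
        outcomes.append((a, b, relation))
    return outcomes


if __name__ == "__main__":
    # One judgement in which sysA and sysB produced identical output.
    for outcome in pairwise_outcomes({"sysA+sysB": 1, "sysC": 2}):
        print(outcome)
```

With this expansion, sysA and sysB each keep their own identity and are recorded as tied with each other, instead of a third system named sysA+sysB appearing in the comparisons.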