Created the compare_translations.py script and added it to silnlp/common.
The script takes two required arguments and one optional argument:
dir-path - the path to the directory that contains the translations and where the comparison output will be stored
file-paths - the paths to the translation outputs to compare; exactly 2 file paths are required
scorers - optional argument that contains the set of scorers to use for comparison, default is:
{"bleu", "chrf3", "chrf3+", "chrf3++", "spbleu", "wer", "ter"}
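For reviewers, here is a minimal sketch of how the interface described above could be expressed with argparse. It is not the code in this PR. Whether dir-path and file-paths are positional arguments or flags, and the names build_parser and DEFAULT_SCORERS, are assumptions made for illustration.

```python
import argparse

# Default scorer set as listed in the description above.
DEFAULT_SCORERS = {"bleu", "chrf3", "chrf3+", "chrf3++", "spbleu", "wer", "ter"}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three documented arguments."""
    parser = argparse.ArgumentParser(description="Compare two translation outputs.")
    # Assumption: dir-path is positional; the real script may use a flag instead.
    parser.add_argument(
        "dir_path",
        metavar="dir-path",
        help="Directory containing the translations; the comparison output is stored here.",
    )
    # nargs=2 enforces "exactly 2 file paths".
    parser.add_argument(
        "file_paths",
        metavar="file-paths",
        nargs=2,
        help="The two translation output files to compare.",
    )
    # Optional scorer list; falls back to the documented default set.
    parser.add_argument(
        "--scorers",
        nargs="*",
        default=sorted(DEFAULT_SCORERS),
        help="Scorers to use for the comparison.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.dir_path, args.file_paths, args.scorers)
```

With this layout, an invocation would pass the directory first, then the two files, then any scorers after --scorers.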
The script writes its output to comparison_scores.txt in dir-path. The file lists the names of the compared files along with each scorer used and its result.
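As a rough illustration of the output step, the sketch below writes a file of that name into dir-path. The function name write_comparison_scores, its parameters, and the exact line layout of the file are assumptions, not the format the script actually produces.

```python
from pathlib import Path
from typing import Dict, Sequence


def write_comparison_scores(
    dir_path: str, file_names: Sequence[str], scores: Dict[str, float]
) -> Path:
    """Hypothetical helper: write the compared filenames and one line per scorer."""
    out_path = Path(dir_path) / "comparison_scores.txt"
    with out_path.open("w", encoding="utf-8") as out:
        # Assumed layout: a header line naming the files, then "scorer: value" lines.
        out.write("Files compared: " + ", ".join(file_names) + "\n")
        for scorer in sorted(scores):
            out.write(f"{scorer}: {scores[scorer]:.2f}\n")
    return out_path
```

Sorting the scorer names keeps the file stable between runs, which makes two comparison files easy to diff.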
I did not add an S3 bucket connection yet, but this script can be extended later on to add that functionality if needed.