mjpost / sacrebleu

Reference BLEU implementation that auto-downloads test sets and reports a version string to facilitate cross-lab comparisons
Apache License 2.0

Remove the flores101 related tokenizer logging message when the selected tokenizer is flores200 #219

Open hadyelsahar opened 2 years ago

hadyelsahar commented 2 years ago

This line logs the message below, which may lead users to think that the flores101 tokenizer is active even when the flores200 tokenizer is selected. It would be better to trigger the warning only when the spm or flores101 tokenizer is selected: https://github.com/mjpost/sacrebleu/blob/734669caa3ab20261843355e1c3aac5a218a6655/sacrebleu/tokenizers/tokenizer_spm.py#L37

Logged message:

2022-10-24 12:27:01 | WARNING | sacrebleu | Tokenizer 'spm' has been changed to 'flores101', and may be removed in the future.
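
A minimal sketch of the requested behavior, to make the proposal concrete. It does not use or reproduce sacrebleu's code: the logger name, the `WARN_FOR` set, and the `select_tokenizer` function are all hypothetical names chosen for illustration, and the set of names that trigger the warning simply follows the suggestion above.

```python
import logging

# Hypothetical logger name; this sketch does not touch sacrebleu's logger.
logger = logging.getLogger("sacrebleu_sketch")

# Tokenizer names that should trigger the warning, per the suggestion
# in this issue. This set is an assumption, not sacrebleu's behavior.
WARN_FOR = frozenset({"spm", "flores101"})


def select_tokenizer(name: str) -> str:
    """Return the selected tokenizer name, warning only for names in WARN_FOR.

    Hypothetical helper: it only illustrates gating the log message on the
    selected name, so that 'flores200' produces no flores101 warning.
    """
    if name in WARN_FOR:
        logger.warning(
            "Tokenizer 'spm' has been changed to 'flores101', "
            "and may be removed in the future."
        )
    return name
```

With this gating, selecting `flores200` would log nothing, while selecting `spm` or `flores101` would still log the message. Whether `flores101` itself should warn is a judgment call left to the maintainers.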