🐛 Bug

Currently, the library does not support sentences that contain newline characters (i.e. '\n'): instead of scoring each such sentence as a whole, it splits it into sub-sentences and computes scores for those. This is caused by how the input sentences are read (e.g. see here for the scorer code). A better approach would be to read the files as binary and then decode the individual lines. I would be happy to contribute a small PR if you feel this might be useful to other users.
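The proposed change could be sketched as below. The helper names are hypothetical, and it assumes the current reader relies on Python's string-level line splitting (e.g. `splitlines()`), which also breaks on characters such as U+0085 (NEL) or U+2028, not just on '\n':

```python
def read_lines_text(path):
    # Hypothetical current behaviour: str.splitlines() treats U+0085,
    # U+2028, etc. as line boundaries, so one input line may become
    # several "sentences".
    with open(path, encoding="utf-8") as fh:
        return fh.read().splitlines()


def read_lines_binary(path):
    # Proposed sketch: read raw bytes, split strictly on b"\n", and
    # decode each line afterwards, so only true newlines separate
    # sentences. (Empty segments are dropped here for simplicity.)
    with open(path, "rb") as fh:
        return [line.decode("utf-8") for line in fh.read().split(b"\n") if line]
```

With a file containing `"Hello\x85world\n"`, the first helper yields two lines while the second yields one, matching the `wc -l` count.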
@ricardorei
To Reproduce
Simply execute COMET on the input files, either via scoring or comparison.
Expected behaviour
If I have a file that consists of 1000 lines (i.e., as reported by wc -l output_it/src.txt), I would expect exactly 1000 sentence-level scores.
Environment
OS: Ubuntu 20.04.5 LTS (Focal Fossa)
Python 3.8.16 via Conda