How to calculate the overall BLEURT score?

Hi,

When I run the code on my own dataset, I get back a scores file with one score per sample. Am I supposed to take the average of all the generated scores to get an overall score?

My dataset consists of a reference file and a generated-summary file, each with one line per sample. I am doing data-to-text generation: the reference file contains the expected output summaries, and the generated-summary file contains the actual output summaries.
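
For concreteness, this is roughly the averaging I have in mind. It is only a sketch: it assumes the scores file is plain text with one floating-point score per line, and the function name `average_scores` and the file name `scores.txt` are placeholders of mine, not anything taken from BLEURT itself. Whether the arithmetic mean is the right way to aggregate is exactly what I am asking.

```python
from pathlib import Path
from statistics import mean


def average_scores(scores_path):
    """Return the arithmetic mean of a text file holding one score per line.

    Assumption: each non-blank line parses as a float. Blank lines are skipped.
    """
    lines = Path(scores_path).read_text().splitlines()
    scores = [float(line) for line in lines if line.strip()]
    if not scores:
        raise ValueError(f"no scores found in {scores_path}")
    return mean(scores)


# Hypothetical usage, with "scores.txt" standing in for the real scores file:
#   overall = average_scores("scores.txt")
#   print(f"mean score over all samples: {overall:.4f}")
```

This would give a single number for the whole dataset, equal to the sum of the per-sample scores divided by the number of samples.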