Hi,
I wanted to clarify the following information. On the checkpoints page here, you mention that:
"Currently, the following six BLEURT checkpoints are available, fine-tuned on WMT Metrics ratings data from 2015 to 2018. They vary on two aspects: the size of the model, and the size of the input."
Let's say I am using the BLEURT-Base, 512 (max #tokens) checkpoint, and both my generated text and my reference text are longer than 512 tokens. When computing the BLEURT score, will the library automatically truncate both texts to fit the limit and then score the truncated versions? Or do I need to shorten the generated text and the reference text manually before calling the scoring function?
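To make the second option concrete, here is a rough sketch of the manual truncation I have in mind. It is only an illustration under my own assumptions: `truncate_pair` is a hypothetical helper (not part of BLEURT), it counts whitespace-separated words rather than the model's subword tokens, and the 3 reserved positions for special tokens are a guess based on the usual BERT input layout.

```python
def truncate_pair(reference, candidate, max_tokens=512, reserved=3):
    """Trim two texts so their combined length fits a shared token budget.

    Hypothetical helper, not part of BLEURT. Whitespace words stand in for
    real subword tokens, and `reserved` (assumed: 3 special tokens) is a guess.
    """
    budget = max_tokens - reserved
    if budget < 0:
        raise ValueError("max_tokens must be at least `reserved`")

    ref_tokens = reference.split()
    cand_tokens = candidate.split()

    # Drop tokens from the end of whichever sequence is currently longer
    # (the candidate on ties) until the pair fits the budget.
    while len(ref_tokens) + len(cand_tokens) > budget:
        if len(ref_tokens) > len(cand_tokens):
            ref_tokens.pop()
        else:
            cand_tokens.pop()

    return " ".join(ref_tokens), " ".join(cand_tokens)


# Example: both texts exceed the limit on their own.
reference = "ref " * 600
candidate = "gen " * 700
short_ref, short_cand = truncate_pair(reference, candidate)
print(len(short_ref.split()), len(short_cand.split()))
```

Since the real limit applies to subword tokens, a word-level cut like this would only approximate it, and I would probably need a smaller budget to be safe. That is part of why I am asking whether the library already handles this.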
Many thanks in advance, Ruslan