neulab / BARTScore

BARTScore: Evaluating Generated Text as Text Generation
Apache License 2.0

Question about D2T datasets #24

Closed: taku-ito closed this issue 2 years ago

taku-ito commented 2 years ago

Hi, why do the D2T datasets (BAGEL, SFHOT, SFRES) contain multiple identical reference sentences in ref_summs?

yyy-Apple commented 2 years ago

For a generation task, several different outputs can typically satisfy the requirement. Providing multiple references therefore lets us assess the quality of the generated text more comprehensively.

taku-ito commented 2 years ago

Sorry, my question was not clear.

My question is about why the same reference appears multiple times, as in the following example.

# BAGEL, id:0
'ref_summs': ['I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .',
   'I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'I am sorry but there are no venues near X in the city centre .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .',
   'There are no places you are looking for near X in the centre of town .']}

yyy-Apple commented 2 years ago

The original dataset can be found here: https://github.com/jeknov/EMNLP_17_submission. We take all the references for each MR (meaning representation). Since we take the max over references when combining the multi-reference results, duplicate references do not affect the final result.
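
To illustrate the point about max aggregation, here is a minimal sketch. It does not use BARTScore itself: `toy_score` is a hypothetical stand-in metric (token overlap) and `aggregate` is an assumed helper name, both invented for this example. It only shows that taking the max over references gives the same value with or without duplicates, whereas a mean could shift if the duplicates were unevenly distributed.

```python
from typing import Callable, List


def toy_score(hypothesis: str, reference: str) -> float:
    """Hypothetical stand-in metric (NOT BARTScore):
    fraction of reference tokens that also occur in the hypothesis."""
    hyp_tokens = set(hypothesis.lower().split())
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    return sum(tok in hyp_tokens for tok in ref_tokens) / len(ref_tokens)


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def aggregate(hypothesis: str, references: List[str],
              agg: Callable[[List[float]], float]) -> float:
    """Score the hypothesis against every reference, then combine with `agg`."""
    scores = [toy_score(hypothesis, ref) for ref in references]
    return agg(scores)


ref_a = "I am sorry but there are no venues near X in the city centre ."
ref_b = "There are no places you are looking for near X in the centre of town ."
hyp = "Sorry , there are no venues near X in the city centre ."

unique_refs = [ref_a, ref_b]
# Same counts as BAGEL id:0 in this thread (9 copies of each sentence).
duplicated_refs = [ref_a] * 9 + [ref_b] * 9
# Unevenly duplicated list, made up here to show how a mean would react.
skewed_refs = [ref_a] * 3 + [ref_b]

max_unique = aggregate(hyp, unique_refs, max)
max_duplicated = aggregate(hyp, duplicated_refs, max)
max_skewed = aggregate(hyp, skewed_refs, max)

mean_unique = aggregate(hyp, unique_refs, mean)
mean_skewed = aggregate(hyp, skewed_refs, mean)

print("max :", max_unique, max_duplicated, max_skewed)
print("mean:", mean_unique, mean_skewed)
```

The three max values are identical because the set of distinct scores never changes; only the mean depends on how often each reference is repeated.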

taku-ito commented 2 years ago

Thank you!