neulab / BARTScore

BARTScore: Evaluating Generated Text as Text Generation
Apache License 2.0

WMT 2019 - Can you provide more details about the split and dataset you used? #27

Open PastelBelem8 opened 2 years ago

PastelBelem8 commented 2 years ago

Hi (:

Great work on this paper! I was wondering if you could share some additional details about the dataset you used for MT. If I understand correctly, you used the WMT-19 DARR dataset, but would you mind confirming whether this was the validation set or the test set? Also, were you able to get actual scores for each of the sentences?

Thanks in advance :)