Liyan06 / AggreFact

Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors (ACL 2023)

Integrating human faithfulness ratings from HELM paper. #4

Open UntotaufUrlaub opened 1 year ago

UntotaufUrlaub commented 1 year ago

The HELM paper team just shared a dataset of document-summary faithfulness ratings in this issue. The ratings are binary and were crowdsourced. The rated documents come from CNN and XSum, and the summaries are either references or generated by recent models (GPT-3, etc.). I think this could be integrated into AggreFact to make the benchmark even bigger and better.

I would be interested in discussing whether this is a good fit for integration into AggreFact, and what to consider when doing so.

Liyan06 commented 1 year ago

Happy to discuss this!