AkihikoWatanabe / paper_notes

Paper notes, added to from time to time
https://AkihikoWatanabe.github.io/paper_notes

Why We Need New Evaluation Metrics for NLG, EMNLP'17 #989

Open AkihikoWatanabe opened 1 year ago

AkihikoWatanabe commented 1 year ago

https://aclanthology.org/D17-1238/

AkihikoWatanabe commented 1 year ago

The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we motivate the need for novel, system- and data-independent automatic evaluation methods: We investigate a wide range of metrics, including state-of-the-art word-based and novel grammar-based ones, and demonstrate that they only weakly reflect human judgements of system outputs as generated by data-driven, end-to-end NLG. We also show that metric performance is data- and system-specific. Nevertheless, our results also suggest that automatic metrics perform reliably at system-level and can support system development by finding cases where a system performs poorly.

Translation (by gpt-3.5-turbo)

AkihikoWatanabe commented 1 year ago

A study pointing out that existing NLG metrics do not correlate very highly with human judgements.
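
To make "correlation with human judgements" concrete, here is a minimal sketch of one common way to measure it: a rank correlation (Spearman's ρ) between per-output metric scores and human ratings. The function names and the toy numbers are hypothetical and purely illustrative; they are not taken from the paper, and the paper's exact evaluation setup may differ.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: one automatic metric score and one human rating
# per system output (made up for illustration only).
metric_scores = [0.12, 0.35, 0.40, 0.58, 0.71]
human_ratings = [3, 2, 5, 4, 6]

rho = spearman(metric_scores, human_ratings)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 would mean the metric orders outputs the same way humans do; the note above says that, for existing NLG metrics, this kind of agreement is not very high.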