AkihikoWatanabe commented 1 year ago

https://aclanthology.org/2021.findings-emnlp.395/

AkihikoWatanabe commented 1 year ago

In this paper, we propose QACE, a new metric based on Question Answering for Caption Evaluation, which evaluates image captioning using Question Generation (QG) and Question Answering (QA) systems. QACE generates questions on the evaluated caption and checks its content by asking the questions on either the reference caption or the source image. We first develop QACE_Ref, which compares the answers of the evaluated caption to those of its reference, and report results competitive with state-of-the-art metrics. To go further, we propose QACE_Img, which asks the questions directly on the image instead of the reference. A Visual-QA system is necessary for QACE_Img. Unfortunately, standard VQA models are framed as classification among only a few thousand categories. Instead, we propose Visual-T5, an abstractive VQA system. The resulting metric, QACE_Img, is multi-modal, reference-less, and explainable. Our experiments show that QACE_Img compares favorably with other reference-less metrics.

Translation (by gpt-3.5-turbo)

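In this paper, we propose QACE, a question-answering metric for evaluating image captions, based on Question Generation (QG) and Question Answering (QA) systems. QACE generates questions from the caption under evaluation and checks its content by asking those questions of either the reference caption or the source image. We first develop QACE_Ref, which compares the answers from the evaluated caption with those from the reference caption, and report results competitive with state-of-the-art metrics. We further propose QACE_Img, which asks the questions directly of the image itself rather than of the reference. QACE_Img requires a Visual-QA system. However, standard VQA models are in practice framed as classification over a few thousand categories. Instead, we propose Visual-T5, an abstractive VQA system. The resulting metric, QACE_Img, is multimodal, reference-free, and explainable. In our experiments, QACE_Img compares favorably with other reference-free metrics.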
Summary (by gpt-3.5-turbo)
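This work proposes QACE, a question-answering metric for image caption evaluation based on Question Generation (QG) and Question Answering (QA) systems. QACE generates questions from the caption under evaluation and checks its content by asking those questions of either the reference caption or the source image. The authors develop a metric called QACE_Ref and report results competitive with state-of-the-art metrics. They further propose QACE_Img, which asks the questions directly of the image itself rather than of the reference. QACE_Img requires a Visual-QA system, for which they propose Visual-T5, an abstractive VQA system. QACE_Img is a multimodal, reference-free, explainable metric. In experiments, QACE_Img compares favorably with other reference-free metrics.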

AkihikoWatanabe commented 11 months ago

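Proposes a QG/QA-based approach for evaluating image captioning. Questions generated from the candidate caption are answered using the source image and the reference; the caption is then scored by comparing those answers with the answers derived from the candidate itself.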
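
The evaluation loop described here can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: `toy_qg`, `toy_qa`, `token_f1`, and `qace_like_score` are hypothetical names, the cloze-style question generator and position-based answerer are toy stand-ins for real QG/QA models, and token-level F1 is an assumed way of comparing answers. For a QACE_Img-style setup, the `answer` callable would instead wrap a VQA model that takes an image as its context.

```python
from collections import Counter
from typing import Callable, List


def token_f1(pred: str, gold: str) -> float:
    """Token-overlap F1 between two answers (assumed comparison function)."""
    p, g = pred.lower().split(), gold.lower().split()
    if not p or not g:
        return float(p == g)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)


def toy_qg(caption: str) -> List[str]:
    """Toy question generator: one cloze question per word (hypothetical)."""
    words = caption.split()
    return [" ".join(words[:i] + ["___"] + words[i + 1:]) for i in range(len(words))]


def toy_qa(question: str, context: str) -> str:
    """Toy answerer: returns the context word at the masked position (hypothetical)."""
    position = question.split().index("___")
    context_words = context.split()
    return context_words[position] if position < len(context_words) else ""


def qace_like_score(
    candidate: str,
    context: str,
    generate_questions: Callable[[str], List[str]],
    answer: Callable[[str, str], str],
) -> float:
    """Average agreement between answers from the context and from the candidate."""
    questions = generate_questions(candidate)
    if not questions:
        return 0.0
    scores = [
        token_f1(answer(q, context), answer(q, candidate))
        for q in questions
    ]
    return sum(scores) / len(scores)


# Questions come from the candidate; answers are drawn from the reference
# and from the candidate, then compared.
score = qace_like_score("a dog runs", "a cat runs", toy_qg, toy_qa)
```

Because the questions are generated from the candidate, a low per-question score points at the specific part of the caption that the reference (or image) does not support, which is where the explainability of this kind of metric comes from.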

QACE: Asking Questions to Evaluate an Image Caption, Lee+, EMNLP'21 #961
