❓ Questions and Help

Hi, is there something like a confidence score or other metric that I can use to evaluate how confident the model is in the summary it generates? Specifically, I am using it for question answering, and some questions may not have an answer in the given text. Is there a way to detect when there is no sufficiently clear answer in the source text, e.g. when the model makes a less confident prediction? Thanks.

I doubt a confidence score would be a reliable metric for this setting. Detecting unanswerable questions is a non-trivial task in itself, and I doubt CTRLsum can do it well off the shelf. In the QA literature, people have had to use annotated unanswerable examples to explicitly train models to identify unanswerable questions, e.g. SQuAD v2.
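That said, if you just want a rough signal to rank or flag generations, you could look at the length-normalized log-probability of the generated sequence. Below is a minimal sketch assuming CTRLsum is loaded through a Hugging Face transformers port; the checkpoint name and the `question => source` prompt format are assumptions here, so adjust them to match how you actually load and prompt the model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed checkpoint name -- swap in whichever CTRLsum port/checkpoint you use.
MODEL_NAME = "hyunwoongko/ctrlsum-cnndm"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
model.eval()

def answer_with_score(question: str, source: str, max_new_tokens: int = 64):
    """Generate an answer plus a rough, length-normalized log-probability score."""
    # Assumed prompt format: the question is used as the control prompt,
    # prepended to the source document.
    text = f"{question} => {source}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            num_beams=1,                     # greedy keeps the score math simple
            output_scores=True,
            return_dict_in_generate=True,
        )

    # Log-probability of each token the model actually generated.
    token_logprobs = model.compute_transition_scores(
        out.sequences, out.scores, normalize_logits=True
    )[0]
    avg_logprob = token_logprobs.mean().item()  # length-normalized score

    answer = tokenizer.decode(out.sequences[0], skip_special_tokens=True)
    return answer, avg_logprob

answer, score = answer_with_score(
    "Who wrote the report?",
    "The annual report was written by the finance team and released in March.",
)
print(answer, score)
```

Thresholding `avg_logprob` can flag some shaky generations, but as noted above it is not a reliable unanswerability detector on its own; for that you would likely need to fine-tune with annotated unanswerable examples along the lines of SQuAD v2.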
I see, thanks for the info.

aliencaocao closed this issue 2 years ago.