potsawee / selfcheckgpt

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
MIT License

Contradiction Threshold for NLI Approach #17

Closed ktangri closed 1 year ago

ktangri commented 1 year ago

Hi @potsawee, great paper and thanks for the easy-to-use library! One quick question: for the NLI results you show in this repo, what probability threshold are you using to determine factual vs. non-factual?

Thanks!

potsawee commented 1 year ago

Hi @ktangri, thanks for checking out the repository!

In the reported experiments, we use AUC-PR for the sentence-level detection task and correlation for the passage-level detection task. Neither evaluation requires a threshold to be set.
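To illustrate why a ranking-based metric needs no threshold, here is a minimal sketch of average precision, one common estimate of the area under the precision-recall curve. It is not the evaluation code behind the reported numbers: the function name `average_precision`, the convention that label 1 means non-factual, and the toy data are all assumptions made for illustration, and tied scores are not treated specially.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k that hold a positive.

    labels: 0/1 ints (assumed convention: 1 = non-factual sentence).
    scores: higher means "more likely positive", e.g. P(contradict).
    Only the ordering of the scores matters, so no threshold is involved.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank sentences from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


# Toy data, invented for illustration only.
toy_labels = [1, 0, 1, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.2]
print(average_precision(toy_labels, toy_scores))
```

Because the result depends only on how the sentences are ordered, any monotonic rescaling of the scores leaves it unchanged.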

If you want an optimal threshold (e.g., for a deployment), you could tune it on a development set by optimising some metric of your choice. If we optimise the F1-score on the annotated dataset from my work, the thresholds are:

| Target | Threshold |
|---|---|
| Non-Factual | P(contradict) = 0.5397 |
| Factual | P(entail) = 1 - P(contradict) = 0.2948 |
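The kind of threshold search described above could look roughly like the sketch below. It is not the code that produced the numbers in this thread: the helper names `f1_at_threshold` and `best_f1_threshold`, the rule "predict positive when score >= threshold", and the toy development set are assumptions for illustration.

```python
def f1_at_threshold(labels, scores, threshold):
    """F1 of the positive class when predicting positive iff score >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(labels, scores):
    """Try every observed score as a cut-off and return (threshold, f1) with the best F1.

    On ties, the lowest such threshold is returned.
    """
    candidates = sorted(set(scores))
    return max(
        ((t, f1_at_threshold(labels, scores, t)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy development set, invented for illustration only.
# Assumed convention: label 1 = non-factual, score = P(contradict).
dev_labels = [1, 1, 0, 1, 0, 0]
dev_scores = [0.92, 0.71, 0.64, 0.58, 0.31, 0.12]
threshold, f1 = best_f1_threshold(dev_labels, dev_scores)
print(threshold, f1)
```

To tune for the Factual target instead, the same search could be run with the labels flipped and 1 - P(contradict) as the score. A threshold found this way is specific to the development set and the metric it was tuned on.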

Note that in the implementation, only the logits of "entail" and "contradict" classes are used, so $P(\text{entail}) + P(\text{contradict}) = 1.0$.
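As a rough illustration of why the two probabilities sum to one, the sketch below normalises over just two logits. It assumes a softmax over the retained "entail" and "contradict" logits; the function name `two_class_probs` and the example logit values are hypothetical and not taken from the library.

```python
import math


def two_class_probs(entail_logit, contradict_logit):
    """Softmax over only the entail and contradict logits (assumed normalisation).

    Returns (p_entail, p_contradict), which sum to 1 by construction.
    """
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(entail_logit, contradict_logit)
    e = math.exp(entail_logit - m)
    c = math.exp(contradict_logit - m)
    return e / (e + c), c / (e + c)


# Hypothetical logits, for illustration only.
p_entail, p_contradict = two_class_probs(1.2, -0.3)
print(p_entail, p_contradict, p_entail + p_contradict)
```

Under this assumption, a cut-off on P(contradict) is equivalent to the complementary cut-off on P(entail).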

Hope this answers your question.

ktangri commented 1 year ago

Ahh I see, thank you for the clarification! I arrived at a similar threshold for a different benchmark, which is a nice confirmation.