potsawee / selfcheckgpt

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
MIT License
442 stars 54 forks

about passage-level human annotations #21

Closed 141forever closed 6 months ago

141forever commented 9 months ago

Could you share the passage-level human annotation scores, i.e., the ones used to compute the Pearson and Spearman correlations?

potsawee commented 9 months ago

Hi @141forever

To obtain the passage-level score, we simply took the average of sentence-level scores, e.g., using np.mean(). We didn't perform weighting or anything extra.

potsawee commented 9 months ago

To obtain the sentence-level scores from the labels, you can use the following mapping: {"major_inaccurate": 1.0, "minor_inaccurate": 0.5, "accurate": 0.0}
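
For readers who want to reproduce this, here is a minimal sketch of the two steps described above: map each sentence label to a score, then take the unweighted mean per passage. The function name `passage_score` and the example labels are hypothetical and only for illustration; the mapping is the one quoted in this thread. The sketch uses the standard library's `statistics.fmean`, which should give the same result as `np.mean()` for a plain list of scores.

```python
from statistics import fmean

# Mapping quoted in this thread: a higher score means a less accurate sentence.
LABEL_TO_SCORE = {"major_inaccurate": 1.0, "minor_inaccurate": 0.5, "accurate": 0.0}


def passage_score(sentence_labels):
    """Return the unweighted mean of sentence-level scores for one passage.

    `passage_score` is a hypothetical helper name, not part of the selfcheckgpt package.
    """
    if not sentence_labels:
        raise ValueError("passage has no annotated sentences")
    return fmean(LABEL_TO_SCORE[label] for label in sentence_labels)


# Made-up labels for a single passage, for illustration only.
example_labels = ["accurate", "minor_inaccurate", "major_inaccurate", "accurate"]
example_score = passage_score(example_labels)
print(example_score)
```

Applying `passage_score` to the label list of every passage should yield the per-passage human scores, which can then be correlated with a method's passage-level scores.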

141forever commented 9 months ago

thanks a lot! best wishes!