shmsw25 / FActScore

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
https://arxiv.org/abs/2305.14251
MIT License
238 stars 32 forks

Regarding human evaluation #29

Closed · pat-jj closed this 8 months ago

pat-jj commented 8 months ago

Hi, thanks for your great work! To reproduce your error rate results, would it be possible for you to release the human evaluation results? If not, could you tell us which 500 entities you used for the experiments?

shmsw25 commented 8 months ago

Hi @pat-jj, thanks for your interest in our work. The human evaluation results are already included in the data we released. You can download it by installing `factscore` and running `python -m factscore.download_data`, or, if you prefer, you can download it directly from our Google Drive link. Once downloaded, check the folder named `labeled`; it contains the human evaluation data.
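For reference, the steps above look roughly like the following (a minimal sketch; the `.cache/factscore` data directory is an assumption based on the package's defaults, so check the README for the authoritative paths and options):

```bash
# Install the package, then fetch the released data; the "labeled" folder
# in the downloaded data contains the human evaluation annotations.
pip install --upgrade factscore
python -m factscore.download_data

# Assuming the default data directory is .cache/factscore (an assumption;
# verify against the README), the annotations should appear under:
ls .cache/factscore/labeled
```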

pat-jj commented 8 months ago

I see, will check it out. Thanks a lot!