shmsw25 / FActScore

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"
https://arxiv.org/abs/2305.14251
MIT License
238 stars 32 forks

Regarding human evaluation #29

Closed · pat-jj closed this 8 months ago

pat-jj commented 8 months ago

Hi, thanks for your great work! To reproduce your error rate results, would it be possible for you to release the human evaluation results? If not, could you tell us which 500 entities you used for the experiments?

shmsw25 commented 8 months ago

Hi @pat-jj, thanks for your interest in our work. The human evaluation results are already included in the data we released. You can download it by installing `factscore` and running `python -m factscore.download_data`, or, if you prefer, you can download it directly from our Google Drive link. Once downloaded, check the folder named `labeled`; it contains the human evaluation data.
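For reference, the steps above look roughly like the following (a minimal sketch; the `.cache/factscore` data directory is an assumption based on the package's defaults, so check the README for the authoritative paths and options):

```bash
# Install the package, then fetch the released data; the "labeled" folder
# in the downloaded data contains the human evaluation annotations.
pip install --upgrade factscore
python -m factscore.download_data

# Assuming the default data directory is .cache/factscore (an assumption;
# verify against the README), the annotations should appear under:
ls .cache/factscore/labeled
```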

pat-jj commented 8 months ago

I see, will check it out. Thanks a lot!