hkust-nlp / felm

Github repository for "FELM: Benchmarking Factuality Evaluation of Large Language Models" (NeurIPS 2023)

Q about dataset #3

Open 141forever opened 10 months ago

141forever commented 10 months ago

Hi, I am very interested in your work; it has helped me a lot. Have you annotated the error types of the segments in each example? I saw the error types in the paper and in the "Typical Data Point" section of https://github.com/hkust-nlp/felm, but I have not found them in https://huggingface.co/datasets/hkust-nlp/felm/blob/main/all.jsonl.

shiqichen17 commented 10 months ago

Hi, it depends on the domain of the data sample. In the Math and Reasoning domains, errors are classified as "reasoning error" by default. For the remaining three domains, error types are explicitly annotated in the jsonl file. Thanks.
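
For anyone reading the file programmatically, the rule described above could be sketched roughly as follows. This is only an illustration, not code from the FELM repository: the field names `domain` and `type`, the domain strings `math` and `reasoning`, and the helper names are all assumptions, so check them against the actual keys in `all.jsonl` before relying on them.

```python
import json

# Assumed (hypothetical) key names and domain values; verify against all.jsonl.
DOMAIN_KEY = "domain"
TYPE_KEY = "type"
DEFAULT_REASONING_DOMAINS = {"math", "reasoning"}


def load_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def resolve_error_types(record, domain_key=DOMAIN_KEY, type_key=TYPE_KEY):
    """Return the error-type labels that apply to a record's erroneous segments.

    Math and reasoning samples fall back to "reasoning error"; other domains
    use whatever is explicitly annotated under `type_key` (possibly nothing).
    """
    domain = str(record.get(domain_key, "")).lower()
    if domain in DEFAULT_REASONING_DOMAINS:
        return ["reasoning error"]
    annotated = record.get(type_key)
    if annotated is None:
        return []
    if isinstance(annotated, str):
        return [annotated]
    return list(annotated)
```

With a local copy of the file, something like `[resolve_error_types(r) for r in load_jsonl("all.jsonl")]` would then give one list of labels per sample, provided the assumed key names match the real schema.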