matt-seb-ho / WikiWhy

WikiWhy is a new benchmark for evaluating LLMs' ability to explain cause-effect relationships. It is a QA dataset containing 9000+ "why" question-answer-rationale triplets.

MIT License

42 stars, 1 fork

Human Evaluation Data #6

Open inimah opened 1 year ago

inimah commented 1 year ago

Hi, thanks for sharing your great work!

Would you also publicly release the human evaluation outcomes?

Best,

matt-seb-ho commented 1 year ago

Hi, thanks for reaching out! We are currently working on version 1.2 of the dataset and are considering releasing the human evaluations at the same time. Thank you for your patience.