Instruction-Tuning-with-GPT-4 / GPT-4-LLM

Instruction Tuning with GPT-4
https://instruction-tuning-with-gpt-4.github.io/
Apache License 2.0

About Human evaluation #9

Closed by hbin0701 1 year ago

hbin0701 commented 1 year ago

Dear Authors, Thanks for the great work!

I have a few questions about the details of the human evaluation. The paper says it was conducted on Amazon Mechanical Turk, "consider[ing] 252 user-oriented instructions for evaluation."

I am curious about how many people participated in this evaluation - and whether each participant was required to answer all 252 instructions for evaluation.

Thank you in advance!

pengbaolin commented 1 year ago

Q: I am curious about how many people participated in this evaluation
A: The exact number of participants is not clear, but roughly 40 Master-level Turkers took part.

Q: and whether each participant was required to answer all 252 instructions for evaluation.
A: No.

We conducted several rounds of human evaluation, and the results are pretty consistent.