Closed XitaoLi closed 1 year ago
More details: the null values also appear when the key "improper" is false.
{
"question": "who played cruella de vil in 101 dalmatians",
"golden_answer": "Glenn Close",
"answer_fid": "Glenn Close",
"judge_fid": true,
"answer_gpt35": "Cruella de Vil was played by Glenn Close in the 1996 live-action version of 101 Dalmatians.",
"judge_gpt35": true,
"answer_chatgpt": "Cruella De Vil was played by Glenn Close in the 1996 live-action film \"101 Dalmatians.\"",
"judge_chatgpt": true,
"answer_gpt4": null,
"judge_gpt4": null,
"answer_newbing": "In the 1996 live-action adaptation of \u201c101 Dalmatians\u201d, the character of Cruella de Vil was portrayed by actressGlenn Close1.",
"judge_newbing": true,
"improper": false
},
We did not annotate all the data for NQ-GPT4 because of GPT-4 API access restrictions. Only 500 samples were annotated, as shown in Table 1 of the paper.
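For anyone hitting the same nulls, here is a minimal sketch of how the unannotated entries might be skipped when loading the data. It assumes the dataset is a JSON list of records shaped like the sample in this thread; the helper name `annotated_records` and the file name `nq.json` are hypothetical and not part of the released code.

```python
import json


def annotated_records(records, model):
    """Return only the records whose judge label for `model` is not null.

    `model` is the suffix used in the keys, e.g. "gpt4" for "judge_gpt4".
    """
    key = f"judge_{model}"
    return [r for r in records if r.get(key) is not None]


# Hypothetical loading step; replace "nq.json" with the real file path.
# with open("nq.json", encoding="utf-8") as f:
#     records = json.load(f)

# Small in-memory stand-in with the same key layout as the sample record.
sample = [
    {"judge_gpt35": True, "judge_gpt4": None, "improper": False},
    {"judge_gpt35": False, "judge_gpt4": True, "improper": False},
]

gpt4_subset = annotated_records(sample, "gpt4")
print(f"{len(gpt4_subset)} of {len(sample)} records have a GPT-4 label")
```

Filtering on the judge key, rather than on "improper", matches the observation above that null labels also occur when "improper" is false.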
As the title describes, when I load the dataset and try to reproduce the results from your paper, I find that many of the labels are "null".
Is this a mistake?