Issues with the comparision_data.json data

Hi, thanks a lot for sharing the data. I have one question about comparision_data.json. It is said that in this dataset completion_a is ranked higher than completion_b; however, when I looked through the data, I found that in many cases completion_b is actually better. For example:

{
  "user_input": "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nEvaluate this sentence for spelling and grammar mistakes\n\n### Input:\nHe finnished his meal and left the resturant\n\n### Response:",
  "completion_a": "He finished his meal and left the restaurant.",
  "completion_b": "There are two spelling errors in the sentence. The corrected sentence should be: \"He finished his meal and left the restaurant.\""
}
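A minimal sketch for spot-checking pairs like the one above, in case it helps with reproducing this. It assumes the file is a JSON array of objects with the three keys shown in the example (`user_input`, `completion_a`, `completion_b`); the top-level array layout and the helper names `load_pairs`, `sample_pairs`, and `format_pair` are my own assumptions, not something taken from the repository.

```python
import json
import random


def load_pairs(path):
    """Load comparison records from disk.

    Assumption: the file holds a JSON array of objects, each with
    'user_input', 'completion_a', and 'completion_b' keys.
    """
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def sample_pairs(records, k=5, seed=0):
    """Return up to k randomly chosen records, reproducibly for a fixed seed."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def format_pair(record):
    """Render one record so both completions can be read side by side."""
    return (
        f"PROMPT:\n{record['user_input']}\n\n"
        f"completion_a (said to be ranked higher):\n{record['completion_a']}\n\n"
        f"completion_b:\n{record['completion_b']}\n"
    )


# Hypothetical usage, assuming the file sits in the working directory:
#   records = load_pairs("comparision_data.json")
#   for rec in sample_pairs(records, k=5):
#       print(format_pair(rec))
#       print("-" * 60)
```

Reading a handful of random pairs this way gives a rough sense of how often completion_b looks like the stronger answer, though it is only a manual check and not a measurement.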

Could you help verify which of completion_a and completion_b is supposed to be the better one? Thanks!

Instruction-Tuning-with-GPT-4 / GPT-4-LLM
