EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

RACE dataset error? #835

Open · RanchiZhao opened this issue 1 year ago

RanchiZhao commented 1 year ago

I am not sure, but is there an error in this function in your lm_eval/tasks/race.py?

        text = "Article: " + doc["article"] + "\n\n"
        for problem in doc["problems"][:-1]:
            if problem["question"][-6:] == "  _  .":
                text += (
                    problem["question"][-5:] + self.get_answer_option(problem) + "\n"
                )
            else:
                question = "Question: " + problem["question"] + "\n"
                answer = "Answer: " + self.get_answer_option(problem) + "\n"
                text += question + answer
        text += self.last_problem(doc)["question"]
        return text
This line:

        text += (problem["question"][-5:] + self.get_answer_option(problem) + "\n")

should be

        text += (problem["question"][:-5] + self.get_answer_option(problem) + "\n")

to keep the whole question text.
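To illustrate the bug with a hypothetical cloze question (the strings below are made up, not from the dataset): `[-5:]` keeps only the tail of the "  _  ." suffix and throws away the entire question body, while `[:-5]` keeps the body.

```python
# Hypothetical RACE-style cloze question ending in the 6-character suffix "  _  ."
question = "The capital of France is  _  ."
answer = "Paris"

# Current code: [-5:] keeps only the last 5 characters, dropping the question body.
print(repr(question[-5:] + answer))   # ' _  .Paris'

# Suggested fix: [:-5] drops the suffix and keeps the question text.
print(repr(question[:-5] + answer))   # 'The capital of France is Paris'
```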

RanchiZhao commented 1 year ago

maybe [:-6] is better

StellaAthena commented 11 months ago

I think text += problem["question"][:-5] is correct. Thanks for catching this! Could you open a PR with the change? In the PR, please add some examples showing how it improves the formatting.
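To make the [:-5] vs. [:-6] choice concrete (the question string here is a hypothetical example): the matched suffix "  _  ." is 6 characters long, so [:-5] leaves one space as a separator before the answer, while [:-6] joins the question and answer with no space at all.

```python
question = "The capital of France is  _  ."  # ends with the 6-char suffix "  _  ."
answer = "Paris"

print(repr(question[:-5] + answer))  # 'The capital of France is Paris'  (keeps one space)
print(repr(question[:-6] + answer))  # 'The capital of France isParis'   (no separator)
```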

RanchiZhao commented 11 months ago

oh yeah, i will do this soon

StellaAthena commented 11 months ago

@RanchiZhao Hey, just wanted to check in and find out if this was on the horizon still.

liewziqin commented 2 months ago

Hi, I can help to open a PR for this change

StellaAthena commented 2 months ago

> Hi, I can help to open a PR for this change

I want to confirm with @haileyschoelkopf that this is still desirable, but AFAIK that would be very helpful.