mathllm / MATH-V

MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.
https://mathllm.github.io/mathvision/
MIT License

Pattern match failed to capture `(A) A` #3

Closed huiyeruzhou closed 4 months ago

huiyeruzhou commented 5 months ago

Hi! I've run into a problem while checking the evaluation policy. Generating the answer without a trailing newline, such as `(A) something`, is a common pattern; however, it will be interpreted as `something` by the evaluation code.

Example (the extracted answer is expected to be `c` instead of `c60`): [image]

Experiment: in lines 26~27 we have:

                if model_answer.endswith(f" {c}.") or model_answer.endswith(f" ({c}).") or model_answer.startswith(f"{c}\n") or model_answer.startswith(f"({c})\n") or model_answer.startswith(f"({c}) {c}\n"):
                    model_answer = c

By changing it to the following (removing the `\n` from the pattern):

                if response.endswith(f" {c}.") or response.endswith(f" ({c}).") or response.startswith(f"{c}") or response.startswith(f"({c})"):
                    model_answer = c

the pass rate for llava-7b doubled (7.24% -> 16.12%).
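The difference between the two patterns can be reproduced in isolation. Below is a minimal sketch (the helper names `extract_strict` and `extract_relaxed` are hypothetical; the real evaluation script loops over choices differently) showing that the strict pattern, which requires a newline after the choice letter, fails to capture `(A) A fullerene`, while the relaxed pattern captures it:

```python
def extract_strict(model_answer: str, choices: str = "ABCDE") -> str:
    """Original pattern: a newline must follow the choice letter."""
    for c in choices:
        if (model_answer.endswith(f" {c}.")
                or model_answer.endswith(f" ({c}).")
                or model_answer.startswith(f"{c}\n")
                or model_answer.startswith(f"({c})\n")
                or model_answer.startswith(f"({c}) {c}\n")):
            return c
    return model_answer  # no match: answer passes through unchanged


def extract_relaxed(model_answer: str, choices: str = "ABCDE") -> str:
    """Relaxed pattern: no newline required after the choice letter."""
    for c in choices:
        if (model_answer.endswith(f" {c}.")
                or model_answer.endswith(f" ({c}).")
                or model_answer.startswith(f"{c}")
                or model_answer.startswith(f"({c})")):
            return c
    return model_answer


print(extract_strict("(A) A fullerene"))   # -> "(A) A fullerene" (not captured)
print(extract_relaxed("(A) A fullerene"))  # -> "A"
print(extract_strict("(A)\nA fullerene"))  # -> "A" (newline form is captured)
```

Note that the relaxed `startswith(f"{c}")` check introduces its own risk: a free-form answer like "Because ..." would be mis-extracted as `B`, which may be why the original pattern anchored on a newline.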

Question: The pattern seems misleading when extracting a multiple-choice answer. Why should we expect a newline between the choice letter and its content, or at the end?

Thanks for any help!

mathvision-cuhk commented 4 months ago

Hello, thank you for your attention.

Different prompts may influence the output of the model and the corresponding answer extraction. However, the variation in the results in our article is not as significant as you reported: the gap is within 2%. Since the evaluation code is the same for all models, the assessment remains fair.

Regarding the extraction issue, MATH-Vision has now been integrated into open-compass/VLMEvalKit, which uses language models to extract answers, thereby avoiding the aforementioned problems. I hope this information is helpful to you.