eabase closed 4 months ago
Thank you for bringing this to our attention! The example you mentioned is not a good case, and we will remove it. However, it will still be counted as a correct answer based on our extraction logic.
@NipElement
"However, it will still be counted as a correct answer based on our extraction logic."
Sorry, I don't understand.
Why would you "count" something that is wrong, as being correct?
Hi all, this is a parsing error in our answer-extraction code. It introduces an error of about 1% for some models.
To avoid such issues, people who are interested in evaluating their own models can extract the response themselves and submit the extracted answer directly, instead of the raw response.
In this case, such parsing errors will not be introduced.
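For anyone who wants to do the extraction on their side, here is a minimal sketch of what that could look like. It is not the benchmark's own extraction code. The function name `extract_choice`, the option labels A-J, and the "answer is (X)" / "Answer: X" phrasing are all assumptions made for illustration; adjust them to your model's output format.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not the benchmark's real logic):
# the model states its final choice as "answer is (X)" or "Answer: X",
# with options labelled A-J. The option letter itself is matched
# case-sensitively so that "the answer is a mix" is not read as "A".
_ANSWER_RE = re.compile(r"(?i:answer\s*(?:is|:))\s*\(?([A-J])\)?(?![A-Za-z])")


def extract_choice(response: str) -> Optional[str]:
    """Return the last explicitly stated option letter, or None."""
    matches = _ANSWER_RE.findall(response)
    # Take the last match so that a self-correction overrides an earlier guess.
    return matches[-1] if matches else None


if __name__ == "__main__":
    print(extract_choice("Step by step ... so the answer is (C)."))
    print(extract_choice("The answer is a mix of several factors."))
```

Returning `None` when no explicit answer is found, instead of falling back to a guess, keeps unparseable responses visible so they can be reviewed by hand before submitting.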
Looking through the Correct Examples I came across the genetics example, and it is wrong.
It would be interesting to know how you guys are vetting what is considered to be correct.