Open xlxwalex opened 2 months ago
Hi,

regarding 1), we pushed new code; the outputs are now written to the `dumps` and `results` directories. Would be great if you could confirm that it also works on your side. Regarding 2), we will keep this issue open and update you as soon as we have finished the integration.
Thank you for the prompt reply. The new code correctly outputs the result CSV file.

Regarding the second point, I have a question: I noticed that `evaluate.py` selects only the first ten result files to display (`result_files[:10]` on Line 16). If I need to obtain the results for each category by sending you an email, would it be sufficient to simply remove the `[:10]`, export the result CSV, and then send it to you?
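For clarity, a minimal sketch of the change I mean; the surrounding code is only my guess at how `evaluate.py` gathers the files, the dropped slice is the point:

```python
from pathlib import Path

# My guess at how evaluate.py gathers the result files; paths may differ.
result_files = sorted(Path("results").glob("*.json"))

# Before: for result_file in result_files[:10]:  # only the first ten
# After: no slice, so the results for every category are exported.
for result_file in result_files:
    print(result_file.name)
```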
Good to hear! We removed the mentioned `[:10]`. Yes, just send us the CSV file; we can evaluate it on our side in the next days.
Thank you very much. After I finish testing, I will send the results to you.
Hello, I have two questions regarding `evaluate.py`:

1) I noticed that `investigate.py` calls `evaluate.py` when gathering data at the end, which results in an error. This seems to be due to an extra hyphen in the `replace` call on Line 26. Modifying that call should correctly export the results to a CSV table; an illustration of the kind of fix I mean is below.
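A hypothetical illustration of the fix (the file name and the pattern are made up; the stray hyphen in the first argument of `replace` is the point):

```python
result_file = "model-results.json"

# Broken: the extra hyphen means the pattern never matches,
# so the filename is left unchanged and no CSV is written.
broken = result_file.replace("-results.json-", ".csv")
print(broken)  # model-results.json

# Fixed: without the stray hyphen the pattern matches
# and the CSV filename is produced as intended.
fixed = result_file.replace("-results.json", ".csv")
print(fixed)  # model.csv
```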
2) In the Leaderboard, each model should have a score for each task category, but `evaluate.py` does not appear to support this. Could you please advise on how I can obtain a model's scores for Overall, Discourse, Morphology, Reasoning, Semantics, and Syntax? (I sketch what I mean in the P.S. below.)

I am eagerly awaiting your reply.
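P.S. In case it helps to clarify what I am after, a rough sketch of the per-category aggregation I have in mind, assuming the exported CSV had `category` and `score` columns (both names are my assumption; I do not know the actual schema):

```python
import pandas as pd

# Load the exported result CSV; "results.csv" is a placeholder name.
df = pd.read_csv("results.csv")

# Mean score per task category (e.g. Discourse, Morphology, Reasoning,
# Semantics, Syntax), plus an Overall mean across all rows.
per_category = df.groupby("category")["score"].mean()
overall = df["score"].mean()

print(per_category)
print("Overall:", overall)
```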