Open ncoop57 opened 2 years ago
I'm interested in this! Are there any directions on how to run the HumanEval evaluation? I looked through the "evaluation" directory briefly but it wasn't immediately obvious what to do.
No, not yet. I'm actually creating it right now, as there are a few weird dependencies. I'll post here once I have the README up. It would be awesome to get some help from you, @neubig, on running it, as I'm seeing some weird behavior when running the completions!
Great, I'll try to take a look once there's a bit more doc.
I just uploaded a README in the evaluation folder that lists the steps. I haven't had a chance to test it, but it should hopefully be enough for you to start evaluating a model.
I can reproduce the setup based on the README.
We need to update the HumanEval results due to a bug that was originally in our evaluation code and was fixed in PR #62.