How to reproduce the HumanEval performance in this repo?
I tried to evaluate using the evaluation branch, but it seems to be very different from the default branch. Is it possible to add a full evaluation process to the default branch?
Thanks a lot.