abacaj/code-eval — Run evaluation on LLMs using human-eval benchmark
MIT License · 379 stars · 36 forks
#17 (Open): where the evaluate_functional_correctness
Opened by invade-art, 3 months ago