abacaj / code-eval

Run evaluation on LLMs using human-eval benchmark
MIT License
362 stars 34 forks

Any update on these metrics? #14

Open zhimin-z opened 8 months ago

zhimin-z commented 8 months ago

[image]