abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
MIT License · 359 stars · 34 forks
Issues
#17 where the evaluate_functional_correctness (invade-art, opened 5 days ago, 0 comments)
#16 No GPU Found (qxpBlog, closed 3 months ago, 0 comments)
#15 Where can I get the performance result after evaluating llama2-7b? (qxpBlog, closed 5 months ago, 0 comments)
#14 Any update on these metrics? (zhimin-z, opened 7 months ago, 0 comments)
#13 Is llama2-7B-chat weaker than llama2-7B? (sunyuhan19981208, opened 8 months ago, 1 comment)
#12 Update eval_llama.py (acrastt, closed 10 months ago, 0 comments)
#11 Any plans on running evals for codellama? (ErikBjare, opened 10 months ago, 2 comments)
#10 Performance of llama-2 (junzhang-zj, closed 8 months ago, 8 comments)
#9 Support CodeGeeX2 (exceedzhang, opened 11 months ago, 2 comments)
#8 add extra models (abacaj, closed 11 months ago, 0 comments)
#2 fix command for evaluate_functional_correctness (tmm1, closed 1 year ago, 1 comment)