abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
MIT License · 359 stars · 34 forks
Issues
#17 where the evaluate_functional_correctness (invade-art, opened 5 days ago, 0 comments)
#16 No GPU Found (qxpBlog, closed 3 months ago, 0 comments)
#15 Where can I get the performance result after evaluating llama2-7b? (qxpBlog, closed 5 months ago, 0 comments)
#14 Any update on these metrics? (zhimin-z, opened 7 months ago, 0 comments)
#13 Is llama2-7B-chat weaker than llama2-7B? (sunyuhan19981208, opened 8 months ago, 1 comment)
#12 Update eval_llama.py (acrastt, closed 10 months ago, 0 comments)
#11 Any plans on running evals for codellama? (ErikBjare, opened 10 months ago, 2 comments)
#10 Performance of llama-2 (junzhang-zj, closed 8 months ago, 8 comments)
#9 Support CodeGeeX2 (exceedzhang, opened 11 months ago, 2 comments)
#8 add extra models (abacaj, closed 11 months ago, 0 comments)
#2 fix command for evaluate_functional_correctness (tmm1, closed 1 year ago, 1 comment)