abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
MIT License · 379 stars · 36 forks
Issues
#17 · where the evaluate_functional_correctness · invade-art · opened 4 months ago · 0 comments
#16 · No GPU Found · qxpBlog · closed 7 months ago · 0 comments
#15 · where can i get the result of perfrmance after evaluating the llama2-7b · qxpBlog · closed 9 months ago · 0 comments
#14 · Any update on these metrics? · zhimin-z · opened 11 months ago · 0 comments
#13 · Is llama2-7B-chat weaker thank llama2-7B? · sunyuhan19981208 · opened 12 months ago · 1 comment
#12 · Update eval_llama.py · acrastt · closed 1 year ago · 0 comments
#11 · Any plans on running evals for codellama? · ErikBjare · opened 1 year ago · 2 comments
#10 · Performance of llama-2 · junzhang-zj · closed 1 year ago · 8 comments
#9 · Support CodeGeeX2 · exceedzhang · opened 1 year ago · 2 comments
#8 · add extra models · abacaj · closed 1 year ago · 0 comments
#2 · fix command for evaluate_functional_correctness · tmm1 · closed 1 year ago · 1 comment
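
Two of the issues above (#17 and #2) concern how to invoke evaluate_functional_correctness on generated samples. A minimal sketch of that workflow, assuming the upstream openai/human-eval package is installed; the generate_one_completion placeholder is hypothetical and stands in for the repo's actual model inference (e.g. eval_llama.py), not code-eval's own API:

```python
# Sketch only: assumes `pip install human-eval` (openai/human-eval);
# generate_one_completion is a hypothetical stand-in for real model inference.
from human_eval.data import read_problems, write_jsonl
from human_eval.evaluation import evaluate_functional_correctness


def generate_one_completion(prompt: str) -> str:
    # Placeholder: a real run would return model-generated code for the prompt.
    return "    pass\n"


problems = read_problems()  # the 164 HumanEval tasks keyed by task_id
samples = [
    {"task_id": task_id, "completion": generate_one_completion(problem["prompt"])}
    for task_id, problem in problems.items()
]
write_jsonl("samples.jsonl", samples)

# Executes each completion against the benchmark tests and reports pass@k;
# equivalent to running the CLI: evaluate_functional_correctness samples.jsonl
results = evaluate_functional_correctness("samples.jsonl", k=[1])
print(results)  # e.g. {"pass@1": ...}
```

With only one completion per task, only pass@1 is meaningful; generating multiple samples per task is needed before asking for pass@10 or pass@100.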