bigcode-project / bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0 · 697 stars · 176 forks

Issues (sorted by newest)
#249 · Fix unnecessary repeated overwrite · nielstron · opened 3 days ago · 0 comments
#248 · Updating SparseML loading to SparseAutoModel · abhinavnmagic · closed 1 week ago · 0 comments
#247 · Fix: Leaderboard submission Documentation · anil-gurbuz · closed 1 week ago · 0 comments
#246 · MBPP Llama3-8B-Instruct lower pass@1 score expected · YangZhou08 · opened 2 weeks ago · 1 comment
#245 · Using custom prompts and postprocessing · anil-gurbuz · opened 2 weeks ago · 1 comment
#244 · Adding support for transformers>=4.40.2 to avoid crash with mbpp · meher-m · closed 1 week ago · 2 comments
#243 · Value error regarding to "--max_length_generation" · aladinggit · closed 3 weeks ago · 1 comment
#242 · Following submission documentation fails to save generated model outputs for our(Jdoodle) private HF model · anil-gurbuz · closed 1 week ago · 3 comments
#241 · fix: rust timeout exception · seobeomjin · closed 1 week ago · 1 comment
#240 · Some questions about APPS · virt9 · closed 2 weeks ago · 1 comment
#239 · Changes · meher-m · closed 1 month ago · 0 comments
#238 · Add a new dataset Mercury · Elfsong · closed 1 month ago · 2 comments
#237 · How can I pass num_beams? · Sunt-ing · closed 3 weeks ago · 2 comments
#236 · Fix MBPP bug with transformers 4.38+ · edgan8 · closed 1 week ago · 6 comments
#235 · Run 70b evaluation. · icoderzqliu · closed 1 month ago · 0 comments
#234 · API-based evaluation support (humanevalpack_openai.py is too old) · s-natsubori · opened 1 month ago · 0 comments
#233 · The result of llama2-13b-chat(pass@1 18) is worse than paper(pass@1 37) · moyi-qwq · closed 1 month ago · 2 comments
#232 · Add package name and enforce python version in `setup.py` · shehrozek-cerebras · closed 1 month ago · 4 comments
#231 · Make `bigcode_eval` pip installable · shehrozek-cerebras · closed 1 week ago · 1 comment
#230 · If I want to add my own designed prompts before each question, how should I modify the code · ALLISWELL8 · opened 2 months ago · 1 comment
#229 · Add StudentEval from LLM4Code 2024 · arjunguha · closed 2 months ago · 0 comments
#228 · The results of Llama3-8b pass@1 is worse than report · shuaiwang2022 · opened 2 months ago · 6 comments
#227 · ignore --use_auth_token if model doesn't require it · Vipitis · opened 2 months ago · 0 comments
#226 · [FR] include "config" data in generations_only · Vipitis · opened 2 months ago · 0 comments
#225 · fix: Multiple-E dataset fix go_test.go path for test execution · hitesh-1997 · opened 2 months ago · 3 comments
#224 · Multiple-E Go test file name suffix does not contain _test.go · hitesh-1997 · opened 2 months ago · 0 comments
#223 · refactor(evalplus): maintain mbpp+ v0.2.0 · ganler · closed 2 months ago · 0 comments
#222 · Add llama3 instruction prompts · TechxGenus · opened 2 months ago · 0 comments
#221 · Support for vLLM · noforit · opened 2 months ago · 0 comments
#220 · The results of codellama-7b-hf pass@1 is worse than paper · PeiqinSun · closed 2 months ago · 2 comments
#219 · Add instruct models prompts · loubnabnl · closed 2 months ago · 0 comments
#218 · run the MBPP in the HumanEval data format · virt9 · closed 1 month ago · 0 comments
#217 · Leaderboard README improvements · nikita1503 · opened 2 months ago · 0 comments
#216 · remove pad tokens added by the accelerator.pad_across_processes · IQ17 · opened 2 months ago · 0 comments
#215 · Please add flag to log score for each sample (akin to Eleuther's LM Evaluation Harness) · RylanSchaeffer · opened 2 months ago · 3 comments
#214 · Finetune starcoderbase-1b · SummCoder · opened 2 months ago · 0 comments
#213 · MultiPL-E generations step is hung · Santhoshkumar-p · opened 3 months ago · 0 comments
#212 · Ensure generations get saved in generation_only mode · Vipitis · opened 3 months ago · 0 comments
#211 · Check pass/fail count for humaneval · toptechie156 · closed 2 months ago · 1 comment
#210 · How can I achieve HF-style multi-GPU data parallelism when evaluating with vLLM? · noforit · closed 3 months ago · 0 comments
#209 · why change n_copies from 1 to 2? · Reeleon · opened 3 months ago · 0 comments
#208 · Add prompt · Muennighoff · closed 3 months ago · 0 comments
#207 · max_length_generation parameter · icoderzqliu · closed 1 week ago · 4 comments
#206 · fix apps evaluate error: local variable 'level' referenced before assignment · koking0 · opened 3 months ago · 0 comments
#205 · How to use my local model files to run the program? I cant download the model online · virt9 · closed 3 months ago · 4 comments
#204 · Update README.md · AnitaLiu98 · opened 4 months ago · 4 comments
#203 · Support for HumanEval-Infilling Benchmark · Hambaobao · closed 3 months ago · 1 comment
#202 · [Urgent Issue] Cannot run HumanEval benchmarking on CodeLlama model · cosmo3769 · closed 3 months ago · 6 comments
#201 · Add issue prompt · Muennighoff · closed 4 months ago · 0 comments
#200 · DS-F-ENG (ignore the name) evaluation task · sfc-gh-ajedrosz · closed 4 months ago · 0 comments