bigcode-project/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0
744 stars
193 forks
issues
Program repair
#64
keyboardAnt
opened
1 year ago
3
update humaneval postprocessing
#63
loubnabnl
closed
1 year ago
1
Rename path arguments
#62
loubnabnl
closed
1 year ago
0
remove model from accelerate prepare and add precision argument
#61
loubnabnl
closed
1 year ago
0
Add file name support for our method.
#60
SivilTaram
closed
1 year ago
0
[WIS] Program repair
#59
keyboardAnt
closed
1 year ago
0
Would be better to save generations on the fly
#58
Muennighoff
opened
1 year ago
0
chat bug fixer for humaneval-x-bugs
#57
mitya52
closed
1 year ago
3
[WIP] Add the BugRepair Task evaluation
#56
keyboardAnt
closed
1 year ago
0
new bullet point
#55
ArmelRandy
closed
1 year ago
0
Fix number of copies per task when nsamples is not proportional to batch size
#54
loubnabnl
closed
1 year ago
0
Update main.py
#53
SivilTaram
closed
1 year ago
0
How to evaluate the model memory efficiently?
#52
Godofnothing
closed
1 year ago
6
Error Running Odex Integration Code
#51
murthyrudra
closed
1 year ago
2
Fix stop words
#50
Muennighoff
closed
1 year ago
1
Variable max_length_generation
#49
Muennighoff
opened
1 year ago
0
Query around n_samples argument
#48
murthyrudra
closed
1 year ago
7
Commit / Edit / Diff models & their evaluation
#47
Muennighoff
closed
1 year ago
1
HumanEval post-processing
#46
RaymondLi0
closed
1 year ago
2
add odex and mconala datasets
#45
zorazrw
closed
8 months ago
1
Integrate MultiPL-E
#44
loubnabnl
closed
1 year ago
4
Change bool arguments format
#43
loubnabnl
closed
1 year ago
0
Add setup file
#42
loubnabnl
closed
1 year ago
1
update readme with trust_remote_code arg
#41
loubnabnl
closed
1 year ago
0
add trust_remote_code and use_auth_token args
#40
loubnabnl
closed
1 year ago
0
[WIP; Input Requested] Integrated DS1000 task
#39
benlipkin
closed
1 year ago
6
Support Kotlin in MultiPL-E
#38
arjunguha
closed
1 year ago
0
Add GSM8k - Math Reasoning Dataset to the evaluation tasks
#37
infinitylogesh
closed
1 year ago
7
Add batch evaluation support when batch_size > 1
#36
infinitylogesh
opened
1 year ago
8
Add Reasoning tasks to the evaluation
#35
infinitylogesh
closed
9 months ago
2
Per-PL perplexity vs. pass@k rate
#34
arjunguha
closed
1 year ago
1
Execution-based FIM evaluation
#33
arjunguha
closed
1 year ago
0
fix typos
#32
manandey
closed
1 year ago
0
Preliminary cite
#31
Muennighoff
closed
1 year ago
1
Add metric name to outfile
#30
loubnabnl
closed
1 year ago
0
Port BLEU computation
#29
Muennighoff
closed
1 year ago
0
add template for new tasks
#28
loubnabnl
closed
1 year ago
0
[Minor] Missing task template
#27
JeanKaddour
closed
1 year ago
1
[Minor] Conflicting dependencies in requirements.txt
#26
JeanKaddour
closed
1 year ago
2
fix dependency
#25
manandey
closed
1 year ago
0
Show metric in outfile
#24
Muennighoff
closed
1 year ago
2
support for batch size > 1 for single problem generations (n_samples=1)
#23
Muennighoff
opened
1 year ago
5
Update requirements.txt
#22
Muennighoff
closed
1 year ago
0
Add revision kwarg
#21
Muennighoff
closed
1 year ago
0
[WIP] Add CodeXGLUE-text-to-text benchmark for documentation translation
#20
infinitylogesh
closed
1 year ago
6
Refactor code to separate tasks
#19
loubnabnl
closed
1 year ago
3
Separate generation and evaluation + add CI
#15
loubnabnl
closed
1 year ago
0
main() crashes with --allow-code-execution=True
#14
ocramz
closed
1 year ago
3
Add CodeXGLUE-code-refinement (few-shot) setting
#13
manandey
opened
1 year ago
5
MultiPL-E Integration
#12
loubnabnl
closed
1 year ago
4