bigcode-project/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0
702 stars, 180 forks
issues
make HumanEvalSynthesize error more informative
#151
loubnabnl
closed
8 months ago
0
Phi 1.5 evaluation problem
#150
Anindyadeep
closed
8 months ago
2
Update README.md
#149
loubnabnl
closed
8 months ago
0
Is it possible to run the harness against API hosted models?
#148
pnewhook
opened
8 months ago
3
Multi card operation large model
#147
ALLISWELL8
closed
1 week ago
2
code
#146
Lzzzzx
closed
7 months ago
1
solved
#145
ALLISWELL8
closed
8 months ago
0
fixed the name of instruct-humaneval
#144
Keytoyze
closed
9 months ago
0
Add --save_references_path into args
#143
iCSawyer
closed
6 months ago
2
The results are different from those of the codellama paper
#142
ALLISWELL8
closed
1 week ago
6
pip install -e . Error
#141
jihyeleekr
closed
9 months ago
0
args.batch_size vs args.num_samples
#140
awasthiabhijeet
closed
9 months ago
2
code problem
#139
ALLISWELL8
closed
7 months ago
1
Cannot Reproduce SantaCoder pass@1 on HumanEval
#138
ZhangzihanGit
closed
9 months ago
2
code,How to Test Your Private Dataset
#137
ALLISWELL8
closed
7 months ago
1
Evaluation of the task humanevalexplaindescribe
#136
wang-weishi
closed
9 months ago
5
allow auto for max_memory_per_gpu
#135
loubnabnl
closed
10 months ago
0
StarCoderCommit Prompt
#134
awasthiabhijeet
closed
10 months ago
2
Add WizardCoder models (that are CodeLLama based) evaluation
#133
loubnabnl
closed
9 months ago
2
fix typo in MultiPL-E bash script
#132
ChiYeungLaw
closed
10 months ago
0
'HumanEval' object has no attribute 'dataset'
#131
dongguanting
closed
4 months ago
5
add codellama prompt to humanevalsynthesize
#130
loubnabnl
closed
10 months ago
2
ModuleNotFoundError: No module named 'lm_eval.ds'
#129
dhingratul
closed
10 months ago
2
print precision in correct setting
#128
loubnabnl
closed
10 months ago
0
Recode: perturbed human-eval
#127
RaymondLi0
closed
10 months ago
1
Fix str concat
#126
Muennighoff
closed
10 months ago
0
Update stop tokens for mbpp and humaneval
#125
VikParuchuri
closed
10 months ago
0
fix post-processing of mbpp
#124
loubnabnl
closed
11 months ago
0
update leaderboard guide
#123
loubnabnl
closed
11 months ago
0
add guide for reproducing leaderboard results and submitting new ones
#122
loubnabnl
closed
11 months ago
0
bug in post processing for mbpp task
#121
weiliang-zeng
closed
11 months ago
3
Add HumanEvalPack, QuixBugs, Python Bugs, Parity
#120
Muennighoff
closed
10 months ago
1
Different stop words between humaneval and instruct humaneval
#119
GinaJihyeonLee
closed
11 months ago
1
skipdecode for starcoder (harness)
#118
danielkorat
closed
11 months ago
0
installation issue with pip install git+https://github.com/bigcode-project/bigcode-evaluation-harness.git
#117
changwangss
closed
11 months ago
2
improve install
#116
changwangss
closed
11 months ago
1
Adding additional optional args for decoding flags and AutoModel kwargs to support models like ReplitLM
#115
madhavatreplit
opened
12 months ago
0
Support for unmerged peft adapters
#114
cassanof
closed
11 months ago
1
Update MultiPL-E docker image
#113
loubnabnl
opened
1 year ago
0
Add missing MultiPL-E imports
#112
loubnabnl
closed
1 year ago
0
Doc update - `max_length_generation` requirement for GSM tasks
#111
infinitylogesh
closed
1 year ago
0
Reproduction of HELM results
#110
tangzhy
closed
1 year ago
1
fixed a couple of typos
#109
didier-durand
closed
1 year ago
1
When I evaluated the dataset APPS, I got the error RuntimeError: stack.size() >= frames.back().function->n_inputs INTERNAL ASSERT FAILED
#108
Luowaterbi
closed
1 year ago
0
Update and rename openai.py to openai_generation.py
#107
terryyz
closed
11 months ago
0
MBPP eval extremely slow for CodeGen2 and Replit-Code
#106
AadSah
closed
11 months ago
4
error: list index out of range, when testing in multi-gpu?
#105
wwngh1233
closed
1 week ago
6
Support Seq2SeqLM model class (to facilitate the CodeT5+ models)
#104
keyboardAnt
opened
1 year ago
0
how to use --instruction_tokens?
#103
wwngh1233
closed
11 months ago
1
Support `Salesforce/codet5p-220m` and other `T5ForConditionalGeneration` models
#102
keyboardAnt
closed
1 week ago
1