issues
bigcode-project/bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0
825 stars
219 forks
issues
Newest
Make `main.py` compatible with OpenAI compatible APIs
#189
hmellor
opened
10 months ago
5
Why the result of using multiple evaluation is 0
#188
shuaiwang2022
closed
5 months ago
1
Add humaneval+ evaluation task
#187
ganler
closed
10 months ago
2
Supporting HumanEval+ dataset
#186
ganler
closed
10 months ago
1
update warning failed dataset loading
#185
loubnabnl
closed
10 months ago
0
support left padding and prefix post-processing for models like chatglm
#184
loubnabnl
closed
10 months ago
0
Question regarding stop_at_stop_tokens method
#183
phqtuyen
closed
5 months ago
1
errors when ran THUDM/chatglm3-6b on this project
#182
kimheeyori
closed
10 months ago
2
set pyext==0.5 to avoid module 'inspect' has no attribute 'getargspec'
#181
phuonglvh
closed
10 months ago
2
add <file_sep> to stop tokens
#180
loubnabnl
closed
10 months ago
0
Endpoints Integration to evaluate closed source Models.
#179
Anindyadeep
closed
5 months ago
3
Pass trust_remote_code to load_dataset
#178
Vipitis
closed
10 months ago
1
Fix typo in README.md
#177
ab-10
opened
10 months ago
0
Use trust_remote_code for dataset load
#176
Vipitis
closed
10 months ago
3
Why doesn't mbpp use few-shot?
#175
shuaiwang2022
closed
10 months ago
1
Human_eval .yaml
#174
inigopm
closed
10 months ago
1
[WIP] Shadereval tasks
#173
Vipitis
opened
11 months ago
0
Try to reproduce the leaderboard result of codellama-instruct-34B for Rust
#172
teamplayer3
closed
11 months ago
6
Update stop words
#171
loubnabnl
closed
10 months ago
0
fix starcoder fim constructor
#170
maxmatical
closed
11 months ago
0
Evaluation speed with multi-gpu
#169
Cheungki
opened
12 months ago
2
OOM when running Multi-gpu inference using A100 40GB
#168
phqtuyen
closed
5 months ago
1
Reproduction Issues of Code Llama
#167
VoiceBeer
closed
12 months ago
3
QOL changes for generations
#166
maxmatical
closed
10 months ago
0
Inconsistencies in results in Mistral, CodeLlama and some strange behavior.
#165
Anindyadeep
closed
1 year ago
11
SantaCoder FIM task
#164
maxmatical
closed
1 year ago
1
Anyway to access to the public leaderboard?
#163
zhimin-z
closed
1 year ago
1
exact commandline to reproduce leaderboard
#162
Derekglk
opened
1 year ago
2
A common interface for APIs and Models.
#161
Anindyadeep
opened
1 year ago
9
Add support for Ollama, Palm, Claude-2, Cohere, Replicate, Llama2 CodeLlama (100+LLMs) [LiteLLM]
#160
ishaan-jaff
opened
1 year ago
3
Score discrepancy with humaneval
#159
zhksh
closed
1 year ago
2
Evaluation of instruct model
#158
phqtuyen
closed
1 year ago
8
Dockerfile-multiple no longer fetches pip dependencies needlessly
#157
RemcoSchrijver
opened
1 year ago
0
Move code_eval metric from Evaluate to BigCode-eval-harness
#156
thomwolf
closed
1 year ago
0
revert datasets upgrade
#155
loubnabnl
closed
1 year ago
0
pin fsspec version to fix CI
#154
loubnabnl
closed
1 year ago
0
Humanevalpack-c++ issue
#153
Gene-su-GGN
closed
5 months ago
1
rename lm_eval => bigcode_eval
#152
thomwolf
closed
1 year ago
2
make HumanEvalSynthesize error more informative
#151
loubnabnl
closed
1 year ago
0
Phi 1.5 evaluation problem
#150
Anindyadeep
closed
1 year ago
2
Update README.md
#149
loubnabnl
closed
1 year ago
0
Is it possible to run the harness against API hosted models?
#148
pnewhook
opened
1 year ago
3
Multi card operation large model
#147
ALLISWELL8
closed
5 months ago
2
code
#146
Lzzzzx
closed
1 year ago
1
solved
#145
ALLISWELL8
closed
1 year ago
0
fixed the name of instruct-humaneval
#144
Keytoyze
closed
1 year ago
0
Add --save_references_path into args
#143
iCSawyer
closed
11 months ago
2
The results are different from those of the codellama paper
#142
ALLISWELL8
closed
5 months ago
6
pip install -e . Error
#141
jihyeleekr
closed
1 year ago
0
args.batch_size vs args.num_samples
#140
awasthiabhijeet
closed
1 year ago
2