bigcode-project / bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0 · 825 stars · 219 forks
Issues
#238 · Add a new dataset Mercury · Elfsong · closed 5 months ago · 2 comments
#237 · How can I pass num_beams? · Sunt-ing · closed 5 months ago · 2 comments
#236 · Fix MBPP bug with transformers 4.38+ · edgan8 · closed 5 months ago · 6 comments
#235 · Run 70b evaluation. · icoderzqliu · closed 6 months ago · 2 comments
#234 · API-based evaluation support (humanevalpack_openai.py is too old) · s-natsubori · opened 6 months ago · 0 comments
#233 · The result of llama2-13b-chat(pass@1 18) is worse than paper(pass@1 37) · moyi-qwq · closed 6 months ago · 2 comments
#232 · Add package name and enforce python version in `setup.py` · shehrozek-cerebras · closed 6 months ago · 4 comments
#231 · Make `bigcode_eval` pip installable · shehrozek-cerebras · closed 5 months ago · 1 comment
#230 · If I want to add my own designed prompts before each question, how should I modify the code · ALLISWELL8 · opened 7 months ago · 1 comment
#229 · Add StudentEval from LLM4Code 2024 · arjunguha · closed 7 months ago · 0 comments
#228 · The results of Llama3-8b pass@1 is worse than report · shuaiwang2022 · opened 7 months ago · 6 comments
#227 · ignore --use_auth_token if model doesn't require it · Vipitis · opened 7 months ago · 0 comments
#226 · [FR] include "config" data in generations_only · Vipitis · opened 7 months ago · 0 comments
#225 · fix: Multiple-E dataset fix go_test.go path for test execution · hitesh-1997 · opened 7 months ago · 3 comments
#224 · Multiple-E Go test file name suffix does not contain _test.go · hitesh-1997 · opened 7 months ago · 0 comments
#223 · refactor(evalplus): maintain mbpp+ v0.2.0 · ganler · closed 7 months ago · 0 comments
#222 · Add llama3 instruction prompts · TechxGenus · opened 7 months ago · 0 comments
#221 · Support for vLLM · noforit · opened 7 months ago · 0 comments
#220 · The results of codellama-7b-hf pass@1 is worse than paper · PeiqinSun · closed 7 months ago · 3 comments
#219 · Add instruct models prompts · loubnabnl · closed 7 months ago · 0 comments
#218 · run the MBPP in the HumanEval data format · virt9 · closed 5 months ago · 0 comments
#217 · Leaderboard README improvements · nikita1503 · opened 7 months ago · 0 comments
#216 · remove pad tokens added by the accelerator.pad_across_processes · IQ17 · opened 7 months ago · 0 comments
#215 · Please add flag to log score for each sample (akin to Eleuther's LM Evaluation Harness) · RylanSchaeffer · opened 7 months ago · 3 comments
#214 · Finetune starcoderbase-1b · SummCoder · opened 7 months ago · 0 comments
#213 · MultiPL-E generations step is hung · Santhoshkumar-p · opened 7 months ago · 0 comments
#212 · Ensure generations get saved in generation_only mode · Vipitis · opened 7 months ago · 0 comments
#211 · Check pass/fail count for humaneval · toptechie156 · closed 7 months ago · 1 comment
#210 · When evaluating with vLLM, how can I achieve multi-GPU data parallelism similar to HF? · noforit · closed 8 months ago · 0 comments
#209 · why change n_copies from 1 to 2? · Reeleon · opened 8 months ago · 0 comments
#208 · Add prompt · Muennighoff · closed 8 months ago · 0 comments
#207 · max_length_generation parameter · icoderzqliu · closed 5 months ago · 4 comments
#206 · fix apps evaluate error: local variable 'level' referenced before assignment · koking0 · opened 8 months ago · 0 comments
#205 · How to use my local model files to run the program? I cant download the model online · virt9 · closed 8 months ago · 4 comments
#204 · Update README.md · AnitaLiu98 · opened 8 months ago · 4 comments
#203 · Support for HumanEval-Infilling Benchmark · Hambaobao · closed 8 months ago · 1 comment
#202 · [Urgent Issue] Cannot run HumanEval benchmarking on CodeLlama model · cosmo3769 · closed 8 months ago · 6 comments
#201 · Add issue prompt · Muennighoff · closed 8 months ago · 0 comments
#200 · DS-F-ENG (ignore the name) evaluation task · sfc-gh-ajedrosz · closed 9 months ago · 0 comments
#199 · Add prompts · Muennighoff · closed 9 months ago · 0 comments
#198 · Support for StudentEval Dataset (Again) · guanqun-yang · opened 9 months ago · 1 comment
#197 · AATK process_results is missing · adiprasad · opened 9 months ago · 4 comments
#196 · Fix loading PAL-GSM few-shot examples · sxjscience · opened 9 months ago · 0 comments
#195 · To evaluate Github copilot? · liw8hz · opened 9 months ago · 0 comments
#194 · add support for codellama-70b prompt · loubnabnl · closed 9 months ago · 0 comments
#193 · [FEATURE REQUEST] Support HumanEval+ tests for MultiPL-E · Randl · opened 10 months ago · 0 comments
#192 · Potentially extra slow inference when using LoRA adapter · sadaisystems · opened 10 months ago · 1 comment
#191 · Out-of-memory of multi-gpu evaluation · yifan-bao · closed 5 months ago · 1 comment
#190 · Add mbpp+ evaluation task · ganler · closed 9 months ago · 0 comments
#189 · Make `main.py` compatible with OpenAI compatible APIs · hmellor · opened 10 months ago · 5 comments