huggingface/lighteval
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
MIT License
467 stars
54 forks
issues
With chat templates, instructions shouldn't be prepended to system prompt
#110
Whadup
closed
3 months ago
5
[IFEVAL] Stopping criteria fails for models with ChatML special tokens
#109
lewtun
closed
3 months ago
1
Fix extended tasks usage in README
#108
lewtun
closed
3 months ago
4
Deploying evaluation for finetuned model as AWS SM pipeline step
#107
Avistian
closed
3 months ago
2
Add optimum model
#106
echarlaix
opened
3 months ago
0
StarCoder2 3B SFT models give CUDA OOM on IFEval
#105
lewtun
closed
3 months ago
3
Adding TinyBench
#104
clefourrier
closed
3 months ago
2
Fixes input length management for generative evals
#103
clefourrier
closed
3 months ago
0
Add comment about installing langdetect for running ifeval example
#102
dvsrepo
closed
3 months ago
1
Supports extended tasks
#101
clefourrier
closed
3 months ago
0
Collate items in GenerativeTaskDataset by similar EOS token
#100
clefourrier
opened
3 months ago
0
Push details to hub does not work
#99
NathanHB
closed
3 months ago
1
Fix push details to hub
#98
NathanHB
closed
3 months ago
2
To remember for version upgrades
#97
clefourrier
opened
3 months ago
0
Add BBH
#96
clefourrier
closed
3 months ago
0
Adding support for Arabic benchmarks : AlGhafa benchmarking suite
#95
alielfilali01
closed
3 months ago
13
Make it clearer in the README that the leaderboard uses the harness
#94
clefourrier
closed
3 months ago
0
Adding support for Arabic benchmarks : AlGhafa benchmarking suite
#93
alielfilali01
closed
3 months ago
0
Fix parallel data processing bug
#92
clefourrier
closed
3 months ago
2
Problem with multiple tasks from the same dataset
#91
clefourrier
closed
3 months ago
1
bump git python
#90
NathanHB
closed
4 months ago
0
add license header to src files
#89
NathanHB
closed
3 months ago
3
Add MT-Bench
#88
NathanHB
closed
3 months ago
0
Add the license to all files headers
#87
clefourrier
closed
3 months ago
0
Create LICENSE
#86
clefourrier
closed
4 months ago
1
Change the eos condition for GSM8K
#85
clefourrier
closed
3 months ago
6
Update huggingface-hub for compatibility with datasets 2.18
#84
clefourrier
closed
4 months ago
2
Sets a max length for the MATH task
#83
clefourrier
closed
4 months ago
0
Anomalously small values `gemma-2b-it` on GSM8K
#82
lewtun
closed
3 months ago
4
Tidy up dependency groups
#81
lewtun
closed
4 months ago
0
Large memory usage on MATH
#80
lewtun
closed
4 months ago
3
Add AGIEval
#79
lewtun
closed
3 months ago
2
Fixing rolling loglikelihood management
#78
clefourrier
closed
3 months ago
0
Bump lighteval to dev version 0.3.0
#77
NathanHB
closed
3 months ago
0
Fix unset generation size
#76
clefourrier
closed
4 months ago
0
Add mt-bench
#75
NathanHB
closed
3 months ago
2
Relax sentencepiece version
#74
lewtun
closed
4 months ago
6
Cannot evaluate models on MATH: `TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'`
#73
lewtun
closed
4 months ago
3
Relax lower bound on `transformers` dependency?
#72
lewtun
closed
4 months ago
2
Update ruff
#71
clefourrier
closed
4 months ago
1
Align GPQA zero-shot / few-shot prompts with paper?
#70
lewtun
opened
4 months ago
3
Need to reupload TruthfulQA
#69
clefourrier
opened
4 months ago
1
Anomalously high scores on GPQA
#68
lewtun
closed
4 months ago
4
Fix #66
#67
clefourrier
closed
4 months ago
3
Cannot evaluate chat model on TruthfulQA (`TypeError: can only concatenate str (not "list") to str`)
#66
lewtun
closed
4 months ago
0
Just adding the custom metrics system
#65
clefourrier
closed
4 months ago
1
Fixes wikitext prompts + some patches on tg models
#64
clefourrier
closed
4 months ago
1
Add HumanEval and HumanEval+
#63
lewtun
opened
4 months ago
1
Enable majority voting for GSM8k / MATH
#62
lewtun
closed
2 months ago
2
Add single `mmlu` config for `lighteval` suite
#61
lewtun
opened
4 months ago
1