bigscience-workshop / lm-evaluation-harness

A framework for few-shot evaluation of autoregressive language models.

MIT License

101 stars 30 forks

issues

AssertionError

#161 lpc-eol closed 11 months ago
2
How can I use BigBIO (Biomedical Dataset Library) in this repository?

#160 Davidwhw closed 11 months ago
0
Single reference target

#159 lpq29743 closed 10 months ago
0
Number of fewshot examples

#158 lpq29743 closed 10 months ago
0
Rouge score

#157 Muennighoff opened 1 year ago
1
lm_eval.list_model_apis() not found

#156 robertLiuLinFeng opened 1 year ago
1
Python 3.8 Support

#155 vrunm closed 1 year ago
0
translation evaluation error

#154 laozhanghahaha opened 1 year ago
0
add xnli task

#153 hatimbr closed 1 year ago
2
Update `actions/checkout` to fix `pre-commit` install errors

#152 jon-tow closed 1 year ago
0
Fast few shots generation

#151 ncassereau closed 1 year ago
0
Add Amazon Reviews tasks and load_from_disk function

#150 hatimbr closed 1 year ago
0
Seq2Seq: Special tokens are also added to targets for LL computation

#149 samsontmr opened 1 year ago
2
max_length not set correctly

#148 hatimbr opened 1 year ago
1
Unknown issue with loading object

#147 pku-yao-cheng closed 1 year ago
1
cache not storing predictions

#146 rbawden opened 1 year ago
0
Corrected bug in Flores fewshot tasks and added flores task

#145 rbawden closed 1 year ago
1
Re-format CLI help docs for readability

#144 jon-tow closed 2 years ago
0
[feature] Add task constructor arg support to CLI

#143 jon-tow closed 2 years ago
0
How is this evaluation done?

#142 a-cavalcanti closed 2 years ago
1
Add all available doc splits to `flores_101`

#141 jon-tow closed 2 years ago
0
[fix] Skip `doc_id` in few-shot prompt comparison

#140 jon-tow closed 2 years ago
0
Added special few-shot examples for DiaBLa and Flores

#139 rbawden closed 2 years ago
2
[fix] Add `stdout` handler to evaluator logger

#138 jon-tow closed 2 years ago
0
Update deprecated `collections.MutableMapping`

#137 jon-tow closed 2 years ago
0
Remove `text_target_separator` whitespace from Seq2SeqLM labels

#136 jon-tow closed 2 years ago
0
Space prepended for Seq2Seq

#135 Muennighoff closed 2 years ago
1
Add xnli

#134 gentaiscool opened 2 years ago
9
Add revision to AutoConfig initialization

#133 samsontmr closed 2 years ago
0
BigBio

#132 StellaAthena closed 1 year ago
3
Loosen `add_special_tokens` arg assertion

#131 jon-tow closed 2 years ago
0
Update model documentation and docstring style

#130 jon-tow closed 2 years ago
0
Add `AutoConfig` and `AutoTokenizer` class attributes

#129 jon-tow closed 2 years ago
0
Make `seed` configurable

#128 jon-tow closed 2 years ago
4
update setup.py with new default bigbio branch

#127 galtay closed 2 years ago
2
Add an `add_special_tokens` property to the `HuggingFaceAutoLM` base class

#126 jon-tow closed 2 years ago
0
Add arg name separators to output paths

#125 jon-tow closed 2 years ago
0
Fix stopping criteria to avoid early termination of generation

#124 samsontmr closed 2 years ago
0
Add `device_map_option` to `accelerate` args

#123 jon-tow closed 2 years ago
0
Lazy import `pytest` in utils module

#122 jon-tow closed 2 years ago
0
Hard-code `seed`s to reduce non-deterministic behavior

#121 jon-tow closed 2 years ago
0
WIP: Add Multieurlex

#120 Muennighoff opened 2 years ago
0
different score ranges are confusing

#119 Muennighoff opened 2 years ago
2
Revert option for minimum generation length

#118 Muennighoff closed 2 years ago
2
Update `revision` to handle subfolder specification

#117 jon-tow closed 2 years ago
0
Remove `AutoSeq2SeqLM` prints

#116 jon-tow closed 2 years ago
0
Add min gen len

#115 Muennighoff closed 2 years ago
1
Bloom tested dataset does not exist in this repo

#114 switiz closed 2 years ago
1
Truncate seq2seq to `max_length`

#113 jon-tow closed 2 years ago
0
Flip `use_cache` default to store true

#112 jon-tow closed 2 years ago
0