OpenLMLab / LEval
[ACL'24] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark
GNU General Public License v3.0 · 312 stars · 13 forks
Issues
| Title | # | Author | State | Last activity | Comments |
|---|---|---|---|---|---|
| Question about the leaderboard | #16 | cizhenshi | closed | 6 days ago | 3 |
| Understanding evals | #15 | karansaxena | closed | 3 weeks ago | 1 |
| failed reproduce llama3-8b result | #14 | chunniunai220ml | open | 3 weeks ago | 1 |
| How to Reproduce Results on Llama3-8b? | #13 | Ocean-627 | closed | 1 month ago | 3 |
| Where is the open-ened tasks datasets used for GPT-4/GPT-3.5 evalutation | #12 | altctrl00 | closed | 2 months ago | 2 |
| Problems with the sci_fi evaluation | #11 | sheryc | closed | 5 months ago | 2 |
| Except GSM100, other datasets are evaluated in 0-shot? | #10 | zhimin-z | closed | 7 months ago | 1 |
| Is GSM100 evaluated using 8-shot or 16-shot? | #9 | zhimin-z | closed | 7 months ago | 1 |
| It seems to be some mistakes on evalution tasks | #8 | coo00ookie | closed | 7 months ago | 5 |
| topic_retrieval_longchat task eval change the pre to True or False?? | #7 | DavideHe | closed | 9 months ago | 2 |
| Question on llm_eval | #6 | dudesummer | closed | 8 months ago | 2 |
| Update README.md | #5 | tonysy | closed | 10 months ago | 3 |
| Validation / test split | #4 | howard50b | closed | 11 months ago | 2 |
| questions on table 2 | #3 | freshbirdDD | closed | 9 months ago | 3 |
| Update README.md | #2 | eltociear | closed | 11 months ago | 0 |
| There are some minor mistakes in the dataset | #1 | wtangdev | closed | 11 months ago | 1 |