hsiehjackson / RULER
This repo contains the source code for RULER: What's the Real Context Size of Your Long-Context Language Models?
Apache License 2.0 · 319 stars · 17 forks
Issues (sorted newest first)
| #   | Title                                                                                    | Author          | Status | When         | Comments |
|-----|------------------------------------------------------------------------------------------|-----------------|--------|--------------|----------|
| #34 | 128K sequence length means 131072 or 128000                                              | syp1997         | open   | 3 days ago   | 1        |
| #33 | Qwen2 and DeepSeek-V2 results?                                                           | hijkzzz         | open   | 6 days ago   | 1        |
| #32 | Add SGLang backend                                                                       | Ying1123        | closed | 2 days ago   | 0        |
| #31 | Base vs Chat prompt question.                                                            | karansaxena     | closed | 2 weeks ago  | 3        |
| #30 | Prediction format during evals                                                           | karansaxena     | closed | 3 weeks ago  | 5        |
| #29 | pre_sample in qa code                                                                    | vkaul11         | closed | 2 weeks ago  | 1        |
| #28 | request for evaluating GLM4-9B-chat(-1M)                                                 | yucc-leon       | closed | 2 weeks ago  | 2        |
| #27 | questions about ICL code for variable tracking                                           | vkaul11         | open   | 3 weeks ago  | 1        |
| #26 | Is there any issue in extending context length to 1 million using your script            | vkaul11         | open   | 3 weeks ago  | 1        |
| #25 | What is the need for is_icl parameter?                                                   | vkaul11         | open   | 3 weeks ago  | 1        |
| #24 | lost in the middle problem                                                               | vkaul11         | open   | 3 weeks ago  | 1        |
| #23 | how do you take care of the presence of 'and' in the output in the evaluation            | vkaul11         | open   | 4 weeks ago  | 1        |
| #22 | prediction evaluation statistics                                                         | vkaul11         | open   | 4 weeks ago  | 4        |
| #21 | Why do you need to separate the last batch of the output                                 | vkaul11         | open   | 1 month ago  | 1        |
| #20 | Add answer_predfix to prevent model from refusing to answer typo?                        | vkaul11         | open   | 1 month ago  | 2        |
| #19 | what was the reason to use nltk in NIAK task here                                        | vkaul11         | closed | 1 month ago  | 3        |
| #18 | dataset argument for qa.py not specified                                                 | vkaul11         | closed | 1 month ago  | 2        |
| #17 | Yuzhe                                                                                    | zyzzzz-123      | closed | 1 month ago  | 0        |
| #16 | Question about files nouns.list and verbs.list                                           | vkaul11         | closed | 1 month ago  | 0        |
| #15 | Why do you use partial match max metric for QA                                           | vkaul11         | closed | 1 month ago  | 1        |
| #14 | How to test models with larger context length than 128K ?                                | yaswanth-iitkgp | open   | 1 month ago  | 10       |
| #13 | Tempate for Yi?                                                                          | liyucheng09     | closed | 1 month ago  | 2        |
| #12 | gpt-4o results?                                                                          | the21st         | open   | 1 month ago  | 1        |
| #11 | No Generated Output and JSON Serialization Error when calling llm directly in VLLMClient | yaswanth-iitkgp | open   | 1 month ago  | 2        |
| #10 | Raw scores?                                                                              | WesleyYue       | open   | 1 month ago  | 2        |
| #9  | Do we have any ideia how many tokens is used to run the full benchmark in a model?       | daniellefranca96 | closed | 1 month ago | 1        |
| #8  | Why is multi_key_2 and 3 with only 1 key?                                                | jzhang38        | closed | 1 month ago  | 1        |
| #7  | Time taken on 8 A100?                                                                    | jzhang38        | closed | 1 month ago  | 2        |
| #6  | Llama 3 rope theta                                                                       | ganler          | closed | 1 month ago  | 4        |
| #5  | niah.py hang with hf models                                                              | hijkzzz         | closed | 2 months ago | 4        |
| #4  | How to evaluate the performance of RWKV or Jamba?                                        | hijkzzz         | closed | 2 months ago | 0        |
| #3  | Score is always 0.0, and it takes so long to prepare the dataset                         | YJHMITWEB       | closed | 2 months ago | 1        |
| #2  | Show Gemini Pro results                                                                  | s-macke         | closed | 2 months ago | 2        |
| #1  | When will the codes be release                                                           | Mooler0410      | closed | 2 months ago | 3        |