nvtransfer RULER issues

nvtransfer / RULER

This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?

Apache License 2.0

646 stars 43 forks

issues

Why do you need to separate the last batch of the output

#21 vkaul11 closed 3 months ago
1
Add answer_predfix to prevent model from refusing to answer typo?

#20 vkaul11 closed 3 months ago
2
What was the reason to use nltk in the NIAH task here?

#19 vkaul11 closed 4 months ago
3
dataset argument for qa.py not specified

#18 vkaul11 closed 4 months ago
2
Yuzhe

#17 zyzzzz-123 closed 4 months ago
0
Question about files nouns.list and verbs.list

#16 vkaul11 closed 4 months ago
0
Why do you use partial match max metric for QA

#15 vkaul11 closed 4 months ago
1
How to test models with larger context length than 128K?

#14 yaswanth-iitkgp opened 5 months ago
10
Template for Yi?

#13 liyucheng09 closed 5 months ago
2
gpt-4o results?

#12 the21st opened 5 months ago
3
No Generated Output and JSON Serialization Error when calling llm directly in VLLMClient

#11 yaswanth-iitkgp closed 3 months ago
2
Raw scores?

#10 WesleyYue opened 5 months ago
3
Do we have any idea how many tokens are used to run the full benchmark on a model?

#9 daniellefranca96 closed 5 months ago
1
Why is multi_key_2 and 3 with only 1 key?

#8 jzhang38 closed 5 months ago
1
Time taken on 8 A100?

#7 jzhang38 closed 5 months ago
3
Llama 3 rope theta

#6 ganler closed 5 months ago
4
niah.py hang with hf models

#5 hijkzzz closed 5 months ago
4
How to evaluate the performance of RWKV or Jamba?

#4 hijkzzz closed 5 months ago
1
Score is always 0.0, and it takes so long to prepare the dataset

#3 YJHMITWEB closed 5 months ago
1
Show Gemini Pro results

#2 s-macke closed 6 months ago
2
When will the code be released?

#1 Mooler0410 closed 6 months ago
3