OpenBMB / InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
MIT License · 244 stars · 19 forks
Issues
#24 Add support for local/more APIs (rmusser01, closed 4 days ago, 1 comment)
#23 Fix kv retrieval score (Wangmerlyn, closed 2 weeks ago, 0 comments)
#22 Fix code_debug task score computing (Wangmerlyn, closed 3 weeks ago, 0 comments)
#21 Mismatch for longbook_qa_eng (xuandif-cmu, opened 3 weeks ago, 1 comment)
#20 Fix task naming error of En.Sum (Wangmerlyn, closed 3 weeks ago, 0 comments)
#19 Error in loading from Huggingface (BenHamm, opened 1 month ago, 3 comments)
#18 Bug in computing scores for longdialogue_qa_eng (Xianchao-Wu, opened 1 month ago, 1 comment)
#17 GPT-4o (karansaxena, opened 2 months ago, 1 comment)
#16 Bug in Math.Calc (hansjohn, closed 3 months ago, 1 comment)
#15 Generating Math and Code sample (kai-wen-yang, closed 3 months ago, 2 comments)
#14 How to evaluate the performance of RWKV or Jamba? (hijkzzz, closed 4 months ago, 0 comments)
#13 Can I customize the dataset length? For example, test 32k, 64k, and 200k respectively (hijkzzz, closed 5 months ago, 1 comment)
#12 Why were some data in longbook_qa_eng modified? (FranxYao, closed 5 months ago, 1 comment)
#11 name 'ROUGE_SCORER' is not defined (ustccyf, closed 8 months ago, 1 comment)
#10 Inconsistency between in-context examples and test cases on mathcalc (philipwangOvO, closed 8 months ago, 7 comments)
#9 Is all the data manually annotated, or is some of it model-generated? (Patrick-Ni, closed 8 months ago, 1 comment)
#8 About the data source (guanzhchen, closed 9 months ago, 2 comments)
#7 Error when computing test scores (iMountTai, closed 9 months ago, 5 comments)
#6 Model-supported length vs. test length (iMountTai, closed 9 months ago, 7 comments)
#5 Inference time of YaRN-Mistral-7B (ccclyu, closed 9 months ago, 1 comment)
#4 Did you use gpt-4-32k for the evaluation? (z379035389, closed 9 months ago, 1 comment)
#3 Yi-200K? (yhyu13, closed 9 months ago, 1 comment)
#2 Fix bug (tuantuanzhang, closed 9 months ago, 0 comments)
#1 KeyError when running `eval_yarn_mistral.py` on PassKey (siqi13579, closed 9 months ago, 1 comment)