THUDM/LongBench
[ACL 2024] LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
MIT License
633 stars
45 forks
issues
Position of key information in long documents
#79
ruoyuxie
opened
4 days ago
0
Long context dataset
#78
nzw0301
closed
5 days ago
0
Why there is no need special token for chatglm3 when counting the tokens?
#77
condy0919
closed
4 weeks ago
2
Fix Grammar Error in NarrativeQA Prompt
#76
sidjha1
opened
1 month ago
0
inference with kv cache
#75
mohammadh-cerebras
closed
1 month ago
0
Can model-serving frameworks that expose an OpenAI-compatible API, such as ollama or xinference, be tested?
#74
jiusi9
opened
1 month ago
1
No initialization for the process group
#73
Mugariya
opened
1 month ago
1
Some questions on the processed dataset in LongBench
#72
jiqimaoke
closed
1 month ago
1
How to evaluate on llama3-8b-instruct?
#71
txchen-USTC
opened
1 month ago
0
Suggestions for improving the validity of testing on the dataset
#70
wsn555
opened
2 months ago
7
Code for evaluation with GPT-3.5?
#69
RuskinManku
opened
2 months ago
3
Load dataset from hf failed
#68
murphypei
opened
2 months ago
4
The "anwser" for some examples in "qasper.jsonl" is strange
#67
Zcchill
opened
3 months ago
6
Llama2-7B-chat-4k test results differ
#66
PengWenChen
closed
3 months ago
2
Loading local datasets with split='test'
#65
yichen0104
opened
4 months ago
1
Chinese Examples in MultiFieldQA-en
#64
wendywangwwt
opened
5 months ago
1
Is the avg length in the dataset measured in words/characters or in tokens?
#63
deepindeed2022
closed
4 months ago
1
Table reproduce
#62
hzw20200301
closed
5 months ago
0
`Llama2-7B-chat-4k` on `PassageRetrieval-zh` gets `10.12`
#61
fuqichen1998
opened
6 months ago
5
Include data on which passage contains answer
#60
danielmisrael
opened
7 months ago
1
chatglm3-6b-32k's Chinese test results are far below the benchmark in the README
#59
Strivin0311
closed
7 months ago
5
RuntimeError when running pred.py for Vicuna-v1.5-7B-16k
#58
fuqichen1998
closed
7 months ago
2
How is the Spearman correlation computed?
#57
randomtutu
opened
7 months ago
1
CUDA error??????
#56
xvolcano02
closed
7 months ago
2
Llama2-7B-chat-4k test results differ
#55
slatter666
closed
7 months ago
3
Any Implementation of Mistral-7B?
#54
leeyeehoo
opened
7 months ago
1
AttributeError: 'str' object has no attribute 'to'
#53
vincent507cpu
closed
7 months ago
1
Error: TypeError: Couldn't cast array of type list<item: string> to null
#52
xxcoco763
opened
7 months ago
1
Update retrieval/
#51
FaustLyu
closed
8 months ago
0
Disable grad to avoid OOM
#50
acherstyx
closed
8 months ago
0
Testing 13B models such as Baichuan on 1×A100 (80G) causes OOM
#49
lvjianxin
opened
8 months ago
0
Evaluate on long context (32k,64k etc..) on 30B/70B large models
#48
CaesarWWK
opened
9 months ago
5
How to evaluate GPT-3.5 or GPT-4
#47
jing-my
closed
7 months ago
3
The three length-extrapolation methods produce exactly the same answer?
#46
IT-five
closed
10 months ago
0
OOM
#45
IT-five
closed
9 months ago
3
Inference fails on a single A100
#44
Huwei-deeplearning
closed
9 months ago
3
A single A100 40G cannot run llama2-7b-chat-4k (OOM), but can run chatglm2-6b-32k
#43
fishiu
closed
10 months ago
4
how to apply to baichuan?
#42
IT-five
closed
7 months ago
1
On the soundness of the evaluation
#41
rayleoyoung
closed
10 months ago
2
Kimi-Chat testing
#40
kunpeng199494
closed
7 months ago
1
Update support chatglm3
#39
JackKuo666
closed
10 months ago
1
About the models being tested
#38
pengcheng-yan
closed
10 months ago
2
Cannot reproduce the repo's dureader results with chatglm3-6b-32k
#37
siqi13579
closed
10 months ago
4
Bug in the classification_score scoring code
#36
zhangleiedu
closed
10 months ago
1
Typo in pred.py
#35
ignorejjj
closed
10 months ago
1
Add support for Ollama, Palm, Claude-2, Cohere, Replicate, Llama2 CodeLlama (100+LLMs) [LiteLLM]
#34
ishaan-jaff
closed
9 months ago
2
Add dataset file(retrieval)
#33
FaustLyu
closed
11 months ago
0
KeyError: 'retrieved'
#32
liujingcs
closed
9 months ago
3
I don't believe chatglm3 achieves these results without benchmark data having been fed in during fine-tuning →_→
#31
hxs91
closed
11 months ago
0
Is it necessary to add build_prompt to the tokenizer of chatglm3-6b-32k in pred.py?
#30
MrYxJ
closed
11 months ago
3