thunlp/InfLLM
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
MIT License · 306 stars · 29 forks
Issues
#55 · Error when reproducing mistral results · opened 2 weeks ago by yuanyehome · 2 comments
#54 · Block-Level Memory Units representative tokens · closed 3 weeks ago by FranciscoPark · 0 comments
#53 · Can this algorithm support training? · opened 1 month ago by chaiyixuan · 0 comments
#52 · Question about the code in context_manager.py · opened 2 months ago by YL-9 · 3 comments
#51 · Llama 3.1 Load Fail · opened 2 months ago by DefinitlyEvil · 4 comments
#50 · Implementation of Streaming-llm · closed 3 weeks ago by Lu-kuan-lpk · 1 comment
#49 · How to use infLlm as a RAG alternative · opened 3 months ago by liormessinger · 0 comments
#48 · Clarification Needed: Why is topk=2 Set for Greedy Decoding? · opened 3 months ago by Becomebright · 0 comments
#47 · About the llama3 and mistral config settings · opened 3 months ago by ehuaa · 0 comments
#46 · running with infllm-12k.yaml meets errors · opened 4 months ago by JulietLJY · 1 comment
#45 · Implementation details of the Representative Score computation and Memory Lookup? · closed 4 months ago by Becomebright · 4 comments
#44 · What are the requirements for integrating a new model? · opened 5 months ago by klxqlehua · 2 comments
#43 · ZERO Score when using Origin settings · opened 5 months ago by mazeyang · 0 comments
#42 · Support CohereForAI/c4ai-command-r-v01 · closed 5 months ago by flaviusburca · 0 comments
#41 · Can it be integrated with vLLM? · opened 6 months ago by zhangxii · 0 comments
#40 · `Position Emb` and `Chunk size` · closed 6 months ago by liyucheng09 · 0 comments
#39 · About the LongBench evaluation · closed 6 months ago by tiaotiaosong · 1 comment
#38 · Qwen1.5-72B-chat-AWQ with longbench and infinibench benchmark OOM with A100 80G · opened 6 months ago by ehuaa · 0 comments
#37 · ValueError: Only supports llama, mistral and qwen2 models. · opened 7 months ago by thistleknot · 0 comments
#36 · IndexErrors when attempting to run triton flashattention · opened 7 months ago by adnanoomerjee · 1 comment
#35 · longbench · closed 7 months ago by Michelleable · 0 comments
#34 · Is multi-batch inference supported? · closed 7 months ago by ChuanhongLi · 5 comments
#33 · Poor results in practical tests · opened 7 months ago by sdw12138 · 4 comments
#32 · Is a quantized version of Qwen1.5-7B supported? · closed 6 months ago by huliangbing · 11 comments
#31 · Needle In A Haystack test? · opened 7 months ago by geekboood · 1 comment
#30 · How to use w transformers? · opened 7 months ago by thistleknot · 4 comments
#29 · Code licence · closed 7 months ago by kristaller486 · 2 comments
#28 · Multi-GPU support question · closed 7 months ago by ChuanhongLi · 2 comments
#27 · How can I debug which token positions are selected for concatenation at each step? · closed 7 months ago by yinochaos · 1 comment
#26 · Can saving and loading of the Faiss vector store be added? · closed 7 months ago by Minami-su · 3 comments
#25 · Add Faiss, perhead and better compatibility. · closed 7 months ago by guyan364 · 0 comments
#24 · Questions about the code implementation · closed 7 months ago by xjwhy · 3 comments
#23 · OOM issue · closed 7 months ago by microhu · 1 comment
#22 · Implementation of `Stream` and `Infinite`? · closed 7 months ago by liyucheng09 · 1 comment
#21 · Qwen1.5-7B-Chat CUDA error: out of memory · closed 7 months ago by yinochaos · 6 comments
#20 · Qwen1.5-7B-Chat · closed 7 months ago by ChuanhongLi · 10 comments
#19 · cuda error · closed 7 months ago by Michelleable · 7 comments
#18 · The code is updated from time to time; could the changes be listed in the README? · closed 7 months ago by cat-sun · 1 comment
#17 · Have you tried performing lookup only in certain layers, to reduce the amount of cached KV? · closed 7 months ago by MrJiangZhongZheng · 2 comments
#16 · Qwen1.5-72B-Chat-GPTQ-Int4 · closed 7 months ago by ChuanhongLi · 2 comments
#15 · Can all KV caches be stored in a Faiss vector database? · closed 7 months ago by Minami-su · 12 comments
#14 · Will memory usage be very large? · closed 8 months ago by MrJiangZhongZheng · 3 comments
#13 · Optim memory usage, fastchat integration and multiprocessing benchmark · closed 8 months ago by guyan364 · 0 comments
#12 · GPU memory usage at benchmark · closed 8 months ago by Minami-su · 3 comments
#11 · GPU memory usage question · closed 8 months ago by cat-sun · 1 comment
#10 · Refactor, add description and qwen support · closed 8 months ago by guyan364 · 0 comments
#9 · Request for parameter comment · closed 8 months ago by sunying2018 · 1 comment
#8 · Can multi-GPU inference be supported? · closed 8 months ago by cat-sun · 1 comment
#7 · OutOfResources: out of resource: shared memory, Required: 151680, Hardware limit: 101376. · closed 8 months ago by Minami-su · 2 comments
#6 · More generate parameters · closed 8 months ago by Minami-su · 0 comments