thunlp/InfLLM
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
MIT License · 306 stars · 29 forks
Issues
#55 · Error when reproducing mistral results · opened 2 weeks ago by yuanyehome · 2 comments
#54 · Block-Level Memory Units representative tokens · closed 3 weeks ago by FranciscoPark · 0 comments
#53 · Can this algorithm support training? · opened 1 month ago by chaiyixuan · 0 comments
#52 · Question about the code in context_manager.py · opened 2 months ago by YL-9 · 3 comments
#51 · Llama 3.1 Load Fail · opened 2 months ago by DefinitlyEvil · 4 comments
#50 · Implementation of Streaming-llm · closed 3 weeks ago by Lu-kuan-lpk · 1 comment
#49 · How to use infLlm as a RAG alternative · opened 3 months ago by liormessinger · 0 comments
#48 · Clarification Needed: Why is topk=2 Set for Greedy Decoding? · opened 3 months ago by Becomebright · 0 comments
#47 · About the llama3 and mistral config settings · opened 3 months ago by ehuaa · 0 comments
#46 · running with infllm-12k.yaml meets errors · opened 4 months ago by JulietLJY · 1 comment
#45 · Implementation details of the Representative Score computation and Memory Lookup? · closed 4 months ago by Becomebright · 4 comments
#44 · What are the requirements for integrating a new model? · opened 5 months ago by klxqlehua · 2 comments
#43 · ZERO Score when using Origin settings · opened 5 months ago by mazeyang · 0 comments
#42 · Support CohereForAI/c4ai-command-r-v01 · closed 5 months ago by flaviusburca · 0 comments
#41 · Can it be integrated with vLLM? · opened 6 months ago by zhangxii · 0 comments
#40 · `Position Emb` and `Chunk size` · closed 6 months ago by liyucheng09 · 0 comments
#39 · About the LongBench evaluation · closed 6 months ago by tiaotiaosong · 1 comment
#38 · Qwen1.5-72B-chat-AWQ with longbench and infinibench benchmark OOM with A100 80G · opened 6 months ago by ehuaa · 0 comments
#37 · ValueError: Only supports llama, mistral and qwen2 models. · opened 7 months ago by thistleknot · 0 comments
#36 · IndexErrors when attempting to run triton flashattention · opened 7 months ago by adnanoomerjee · 1 comment
#35 · longbench · closed 7 months ago by Michelleable · 0 comments
#34 · Is multi-batch inference supported? · closed 7 months ago by ChuanhongLi · 5 comments
#33 · Poor results in practical tests · opened 7 months ago by sdw12138 · 4 comments
#32 · Is a quantized version of Qwen1.5-7B supported? · closed 6 months ago by huliangbing · 11 comments
#31 · Needle In A Haystack test? · opened 7 months ago by geekboood · 1 comment
#30 · How to use w transformers? · opened 7 months ago by thistleknot · 4 comments
#29 · Code licence · closed 7 months ago by kristaller486 · 2 comments
#28 · Multi-GPU support question · closed 7 months ago by ChuanhongLi · 2 comments
#27 · How can I debug which token positions are selected for concatenation at each step? · closed 7 months ago by yinochaos · 1 comment
#26 · Can saving and loading of the Faiss vector store be added? · closed 7 months ago by Minami-su · 3 comments
#25 · Add Faiss, perhead and better compatibility. · closed 7 months ago by guyan364 · 0 comments
#24 · Questions about the code implementation · closed 7 months ago by xjwhy · 3 comments
#23 · OOM issue · closed 7 months ago by microhu · 1 comment
#22 · Implementation of `Stream` and `Infinite`? · closed 7 months ago by liyucheng09 · 1 comment
#21 · Qwen1.5-7B-Chat CUDA error: out of memory · closed 7 months ago by yinochaos · 6 comments
#20 · Qwen1.5-7B-Chat · closed 7 months ago by ChuanhongLi · 10 comments
#19 · cuda error · closed 7 months ago by Michelleable · 7 comments
#18 · The code is updated from time to time; could the changes be listed in the README? · closed 7 months ago by cat-sun · 1 comment
#17 · Have you tried performing lookup only in certain layers, to reduce the amount of cached KV? · closed 7 months ago by MrJiangZhongZheng · 2 comments
#16 · Qwen1.5-72B-Chat-GPTQ-Int4 · closed 7 months ago by ChuanhongLi · 2 comments
#15 · Can all KV caches be stored in a Faiss vector database? · closed 7 months ago by Minami-su · 12 comments
#14 · Will memory usage be very large? · closed 8 months ago by MrJiangZhongZheng · 3 comments
#13 · Optim memory usage, fastchat integration and multiprocessing benchmark · closed 8 months ago by guyan364 · 0 comments
#12 · GPU memory usage at benchmark · closed 8 months ago by Minami-su · 3 comments
#11 · GPU memory usage question · closed 8 months ago by cat-sun · 1 comment
#10 · Refactor, add description and qwen support · closed 8 months ago by guyan364 · 0 comments
#9 · Request for parameter comment · closed 8 months ago by sunying2018 · 1 comment
#8 · Can multi-GPU inference be supported? · closed 8 months ago by cat-sun · 1 comment
#7 · OutOfResources: out of resource: shared memory, Required: 151680, Hardware limit: 101376. · closed 8 months ago by Minami-su · 2 comments
#6 · More generate parameters · closed 8 months ago by Minami-su · 0 comments