mit-han-lab/streaming-llm
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
https://arxiv.org/abs/2309.17453
MIT License
6.59k stars
361 forks
issues
Questions about "Run Streaming Llama Chatbot"
#36
ChuanhongLi
closed
11 months ago
3
Can support to codellama34b?
#35
willshion
closed
12 months ago
1
Can support to Qwen14B?
#34
ChenTao98
closed
11 months ago
1
Confused with four attention mechanism and their performance mentioned by paper
#33
weizhenhuan
closed
12 months ago
5
The k_seq_dim and v_seq_dim in StartRecentKVCache look related to the type of model
#32
wangxiaochun520
opened
12 months ago
2
Model paths randomly set
#31
HyperUpscale
closed
12 months ago
1
Tested it and saw no speedup. What's going on?
#30
xxm1668
closed
12 months ago
3
can support to Baichuan2?
#29
luzhongqiu
opened
1 year ago
0
Is there a ChatGPT-style API for calling it?
#28
xxm1668
closed
12 months ago
1
How to generate longer token streams?
#27
GenTxt
opened
1 year ago
3
b979594a04f1bbefe1ff21eb8affacef2a186d25
#26
ghost
closed
12 months ago
0
Strim
#25
ghost
closed
12 months ago
0
Comparison with SWA in Mistral
#24
casper-hansen
opened
1 year ago
12
output
#23
21pl
closed
1 year ago
0
wrong
#22
QingChengLineOne
closed
12 months ago
3
add suport codellama
#21
willshion
closed
12 months ago
1
Streaming example: Move input_ids to model device rather than "cuda"
#20
tomaarsen
closed
1 year ago
1
hi
#19
Kompiuter89
closed
12 months ago
0
Metal Support
#18
jordo1138
closed
1 year ago
7
I keep getting a 403 forbidden
#17
odfhgodhfighdf
closed
1 year ago
0
Update mt_bench.jsonl
#16
t562
closed
1 year ago
0
[Feature Request] Release StreamEval dataset and evaluation code in OpenCompass
#15
vansin
opened
1 year ago
2
TypeError: llama_pos_shift_attention_forward() got an unexpected keyword argument 'padding_mask'
#14
MartinKratochvilProgramy
closed
1 year ago
4
Have you run any passkey retrieval tests on streaming-llm?
#13
RonanKMcGovern
opened
1 year ago
2
Questions on "streaming-llm" Paper
#12
llsj14
closed
1 year ago
2
'CUDA_VISIBLE_DEVICES' is not recognized as an internal or external command, operable program or batch file.
#11
IntrovertsBedroom
closed
1 year ago
1
Convert demo video from MOV to MP4
#10
cosmojg
closed
1 year ago
0
The video included in the README does not play in Firefox
#9
cosmojg
closed
1 year ago
0
Google Colab installation
#8
narita63755930
closed
11 months ago
10
window_size attention pretrain
#7
wawpaopao
closed
12 months ago
3
Support for MPT
#6
DungNasSa10
closed
1 year ago
1
Python Module as a drop-in replacement for `transformers` using Attention Sinks
#5
tomaarsen
opened
1 year ago
1
Added requirements.txt with pinned package versions
#4
KarimJedda
opened
1 year ago
1
add sentencepiece to README quickstart
#3
r2d4
closed
1 year ago
2
How do you feed long texts to a model?
#2
CorentinvdBdO
closed
1 year ago
3
For LLMs already trained with window attention and BOS token
#1
GeneZC
closed
12 months ago
6