zhuzilin / ring-flash-attention
Ring attention implementation with flash attention
MIT License · 571 stars · 45 forks
Issues
#55 · Perf on 8*H800 · XG-zheng · opened 1 week ago · 0 comments
#54 · verify causal masking · huseinzol05 · opened 3 weeks ago · 6 comments
#53 · Error when increasing the sequence length · ZetangForward · opened 3 weeks ago · 0 comments
#52 · Will llama3_flash_attn suffer from the imbalance issue, too? · GeneZC · closed 1 month ago · 2 comments
#51 · [Question] Is there a way to lift the restriction in the varlen version of ring attention that every subsequence must be divisible by the degree? · hhaAndroid · closed 1 month ago · 2 comments
#50 · Support async all_gather for llama3-style Context Parallel (CP) · shawlleyw · closed 1 month ago · 4 comments
#49 · Benchmark Question · ZhengxuYan · closed 1 month ago · 2 comments
#48 · Add llama3_flash_attn_varlen · zhuzilin · closed 1 month ago · 0 comments
#47 · support new flash_attn lse shape · zhuzilin · closed 1 month ago · 0 comments
#46 · Got error in ZigZagRingFlashAttnVarlenFunc · ThisisBillhe · opened 1 month ago · 4 comments
#45 · support more params and be compatible with higher flash_attn version · zhuzilin · closed 2 months ago · 0 comments
#44 · Error when running the code · lambda7xx · opened 3 months ago · 4 comments
#43 · What is this part of the ring attention code computing? · MarsMeng1994 · closed 3 months ago · 0 comments
#42 · Numerical errors in backward · grimulkan · opened 4 months ago · 4 comments
#41 · mask to zigzag attention · joey00072 · closed 4 months ago · 1 comment
#40 · Multi-node training speed issue · kakaxi-liu · opened 4 months ago · 3 comments
#39 · Bugs when using zigzag_ring_flash_attn: RuntimeError: Number of requests do not match number of collectives · WeixuanXiong · opened 4 months ago · 0 comments
#38 · Implementation principle of ring attention · lhcezx · opened 4 months ago · 12 comments
#37 · Multi-GPU qkv dimension issue · kakaxi-liu · closed 4 months ago · 2 comments
#36 · Does ring-attn not support dropout? · chinapanda · opened 5 months ago · 3 comments
#35 · [Feature Request] Support `window_size` · zhuzilin · opened 6 months ago · 3 comments
#34 · improve readability and potential numerical stability of `out` and `lse` in `_update_out_and_lse` by refactoring their computational expressions · Yuxin-CV · closed 6 months ago · 7 comments
#33 · remove unnecessary dtype casts in the `_update_out_and_lse` function · Yuxin-CV · closed 6 months ago · 6 comments
#32 · fix random seed in test files for reproducibility · Yuxin-CV · closed 6 months ago · 0 comments
#31 · Is it necessary to update the global maximum? · ljliu · closed 6 months ago · 2 comments
#30 · fix exp overflow when updating lse · microhu · closed 6 months ago · 2 comments
#29 · stripe_flash_attn_varlen_func · leo6022 · closed 6 months ago · 1 comment
#28 · How is ring attention applied during autoregressive decoding (when tokens are decoded one by one)? · dongzhiwen1218 · closed 6 months ago · 1 comment
#27 · [wip] Add stripe_flash_attn_varlen_* · andreaskoepf · closed 6 months ago · 1 comment
#26 · add deepspeed ulysses attention · feifeibear · closed 7 months ago · 0 comments
#25 · add flash-attn install in setup.py · feifeibear · opened 7 months ago · 1 comment
#24 · ring flash attention with BPT · JiaoPL · opened 7 months ago · 3 comments
#23 · large memory usage · LzhinFdu · opened 7 months ago · 5 comments
#22 · Precision issue · hxdtest · opened 7 months ago · 1 comment
#21 · flash attention version · hxdtest · opened 7 months ago · 11 comments
#20 · Question about updating lse · jaesuny · closed 7 months ago · 2 comments
#19 · Question about TP and the final aggregation of chunked operations · Jayce1kk · closed 7 months ago · 1 comment
#18 · test on 8*A800 · JiaoPL · closed 7 months ago · 3 comments
#17 · Ring attention performance on 4x A100 is not great · wangshankun · closed 8 months ago · 1 comment
#16 · Is there a ring_flash_attn_func test example? · zhajiahe · closed 6 months ago · 4 comments
#15 · Does it support causal attention? · foreverpiano · closed 8 months ago · 1 comment
#14 · Is there a FlashAttnVarlenFunc version? · Jayce1kk · closed 8 months ago · 5 comments
#13 · Tiny polishment · reyoung · closed 8 months ago · 0 comments
#12 · Add benchmark for stripe atten · reyoung · closed 8 months ago · 0 comments
#11 · Fix qkv packed v2 · reyoung · closed 8 months ago · 0 comments
#10 · Fix ring attn v2 bwd · reyoung · closed 8 months ago · 0 comments
#9 · Wait time instrumentation [not intended to be merged] · andreaskoepf · opened 8 months ago · 0 comments
#8 · Thanks for your great work, and here are my test results! · ChawDoe · closed 6 months ago · 8 comments
#7 · added requirements.txt · melvinebenezer · closed 8 months ago · 2 comments
#6 · Stripe Attn · reyoung · closed 8 months ago · 0 comments
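Several of the threads above (#34, #33, #31, #30, #20) revolve around how `_update_out_and_lse` merges the partial attention output of each ring step using the log-sum-exp (`lse`) statistics returned by flash attention. For orientation, here is a minimal PyTorch sketch of that merge; the function name and tensor roles follow those discussions, but the body and assumed shapes are an illustrative reconstruction, not a verbatim copy of the repository code.

```python
import torch
import torch.nn.functional as F

def update_out_and_lse(
    out: torch.Tensor,        # running output,      assumed (batch, seqlen, heads, dim), fp32
    lse: torch.Tensor,        # running log-sum-exp, assumed (batch, seqlen, heads, 1),   fp32
    block_out: torch.Tensor,  # output of the current ring step, same shape as out
    block_lse: torch.Tensor,  # log-sum-exp of the current ring step, same shape as lse
):
    """Merge one ring step into the running (out, lse) accumulators.

    The rescaling is expressed through sigmoid/logsigmoid of the lse
    difference, the numerically stable form discussed in #34 and #30:
    it never exponentiates a raw lse value, which could overflow.
    """
    block_out = block_out.to(torch.float32)
    block_lse = block_lse.to(torch.float32)

    # out_new = (exp(lse) * out + exp(block_lse) * block_out) / (exp(lse) + exp(block_lse)),
    # rewritten with w = sigmoid(block_lse - lse) = exp(block_lse) / (exp(lse) + exp(block_lse)):
    out = out - torch.sigmoid(block_lse - lse) * (out - block_out)

    # lse_new = log(exp(lse) + exp(block_lse)) = lse - logsigmoid(lse - block_lse)
    lse = lse - F.logsigmoid(lse - block_lse)
    return out, lse
```

The equivalence is quick to check: `sigmoid(block_lse - lse)` equals `exp(block_lse) / (exp(lse) + exp(block_lse))`, so the update is the standard online-softmax recombination of two partial attention results, one ring step at a time.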