# pytorch-labs/attention-gym

Helpful tools and examples for working with flex-attention.

BSD 3-Clause "New" or "Revised" License · 484 stars · 24 forks
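For orientation, here is a minimal sketch of the flex-attention API that this repo's tools and examples build on: a `mask_mod` is compiled into a `BlockMask` and passed to `flex_attention`. This is an illustrative sketch assuming PyTorch >= 2.5 on a CUDA device; the shapes and the causal mask are not code from this repo.

```python
# Minimal flex-attention sketch (assumes PyTorch >= 2.5 on a CUDA device;
# shapes and the causal mask_mod are illustrative, not taken from this repo).
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

B, H, S, D = 2, 4, 1024, 64  # batch, heads, sequence length, head dim
q, k, v = (torch.randn(B, H, S, D, device="cuda") for _ in range(3))

def causal(b, h, q_idx, kv_idx):
    # mask_mod returns True where a query position may attend to a key position
    return q_idx >= kv_idx

# B=None / H=None broadcast the same mask over batch and heads
block_mask = create_block_mask(causal, B=None, H=None, Q_LEN=S, KV_LEN=S)
out = flex_attention(q, k, v, block_mask=block_mask)
```

In practice `flex_attention` is usually wrapped in `torch.compile` for performance, which is the subject of several of the issues below (e.g. #35, #33, #28).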
## Issues (newest first)
| # | Title | Author | Status | Comments |
|---:|---|---|---|---:|
| #35 | Clarification on torch.compile behavior with flex_attention | kebijuelun | closed 2 months ago | 1 |
| #34 | How to avoid recomputing the mask | NonvolatileMemory | opened 2 months ago | 7 |
| #33 | Dynamic shape compilation support for flex attention with block mask | SamGalanakis | opened 3 months ago | 1 |
| #32 | Question about OOM on large sequences | foreverpiano | closed 2 months ago | 8 |
| #31 | Support varied input sequence lengths with a fixed block mask | tilmto | opened 3 months ago | 5 |
| #30 | Integration with Hugging Face Transformers | buaacyw | closed 2 months ago | 1 |
| #29 | It seems that `visualize_attention_scores` can only visualize either mask-mod-only or score-mod-only | XinDongol | opened 3 months ago | 2 |
| #28 | How to reuse the compiled kernel instead of recompiling flex_attention when the mask and query/key/value shapes are unchanged? | foreverpiano | closed 3 months ago | 4 |
| #27 | Why does flex-attention run faster than FA2 in attention-gym but much slower than FA2 in a real environment? | foreverpiano | closed 3 months ago | 2 |
| #26 | `error: 'tt.broadcast' op requires the same encoding for all operands and results` for local window attention | fteufel | opened 3 months ago | 15 |
| #25 | Does FlexAttention support torch.vmap? | MiladInk | opened 3 months ago | 3 |
| #24 | Use hatch-vcs for versioning | drisspg | closed 3 months ago | 0 |
| #23 | Update softcapping.py | Chillee | closed 3 months ago | 0 |
| #22 | [flex_attention] Softcap perf questions | meshtag | opened 3 months ago | 6 |
| #21 | Are V100 GPUs supported? | boren-ms | opened 3 months ago | 5 |
| #20 | Bias gradient support? | ardagoreci | opened 3 months ago | 11 |
| #19 | Writing to a globally scoped tensor from a score_mod function | jeffwillette | opened 3 months ago | 1 |
| #18 | Paged attention | kme2698 | opened 3 months ago | 1 |
| #17 | Thank you for the awesome work! I saw from the blog that paged attention can also be implemented with flex attention. | kme2698 | closed 3 months ago | 0 |
| #16 | NATTEN example | Birch-san | closed 3 months ago | 0 |
| #15 | CUDA OOM with sliding window attention | mishooax | closed 3 months ago | 6 |
| #14 | Shared memory out of resource | TechxGenus | opened 3 months ago | 2 |
| #13 | Fix a typo | sangyeon-k | closed 3 months ago | 0 |
| #12 | Add Dilated Sliding Window mask_mod | sangyeon-k | closed 2 days ago | 2 |
| #11 | 2D NATTEN typo? | zaptrem | closed 3 months ago | 1 |
| #10 | [question] Possible to implement Attention Steering? | GindaChen | opened 3 months ago | 3 |
| #9 | Compatibility with AMD GPUs | vinay-swamy | opened 3 months ago | 1 |
| #8 | Add Graphormer mod | stsouko | opened 3 months ago | 1 |
| #7 | Remove gh release | drisspg | closed 3 months ago | 0 |
| #6 | Add PyPI tag publish flow | drisspg | closed 3 months ago | 0 |
| #5 | Publish on PyPI? | Ryu1845 | closed 3 months ago | 2 |
| #4 | Fix link in README.md | zinccat | closed 3 months ago | 0 |
| #3 | Add Global + Sliding Window | drisspg | opened 3 months ago | 0 |
| #2 | Add Dilated Sliding Window mask_mod | drisspg | opened 3 months ago | 0 |
| #1 | Add Sandwich Score Mod | drisspg | opened 3 months ago | 0 |