ROCm/aotriton
Ahead of Time (AOT) Triton Math Library
MIT License
22 stars, 7 forks
issues
[Issues]: The Gap between AOT and JIT Triton on Flash Attention kernel
#34
jinsong-mao
opened
23 hours ago
0
Install .so if AOTRITON_NO_SHARED is OFF
#33
jithunnair-amd
closed
1 week ago
0
[Issue]: failed to run the tune_flash.py
#32
jinsong-mao
opened
1 week ago
13
Add varlen support to AOTriton's Flash Attention
#31
xinyazhang
closed
1 week ago
3
How to run benchmark tests[Issue]:
#30
jinsong-mao
closed
2 weeks ago
8
Refactor the build system
#29
xinyazhang
closed
3 weeks ago
1
[Issue]: Unable to build, Unknown CMake command "pybind11_add_module"
#28
RandUser123sa
opened
1 month ago
1
[Feature]: CDNA1 Support
#27
IMbackK
opened
1 month ago
0
Adding mutex.h for TE pytorch extension compilation
#26
wangye805
closed
1 month ago
0
the atten_bwd_dk_dv is bad in performance on mi300x
#25
jinsong-mao
closed
2 weeks ago
1
[mGPU] Run hipModuleLoadDataEx for each GPU device.
#24
xinyazhang
closed
1 month ago
0
Resolve cmake conflicts when adding aotriton into TE via add_subdirectory
#23
wangye805
closed
1 month ago
0
Add FP32 and Bias to fulfill the functionalities required by `torch.nn.attention.SDPBackend.EFFICIENT_ATTENTION`
#22
xinyazhang
closed
1 month ago
0
[Question] Autotune kernel based on `third_party/triton`
#21
xinji1
closed
2 months ago
2
<vector> is required regardless of AOTRITON_USE_ZSTD
#20
xinyazhang
closed
2 months ago
1
Add new triton kernel debug_fill_dropout_rng
#19
xinyazhang
closed
1 month ago
0
[Issue]: Pytorch fails to compile locally due to aotriton failing to build the hsaco objects
#18
Zakhrov
opened
2 months ago
4
[Feature]: Fix the mandatory boundary_check when loading bias tensor
#17
xinyazhang
opened
2 months ago
0
[Feature]: Memory Efficient Flash Attention for gfx1100 (7900xtx)
#16
supernovae
opened
2 months ago
9
[Feature]: C++ version `mk_aotensor`
#15
xinji1
closed
2 months ago
2
Add matrix bias to forward/backward kernel
#14
xinyazhang
closed
2 months ago
1
Switch Tuning database to SQLite3 for Incremental Tuning
#13
xinyazhang
closed
2 months ago
0
[Documentation]: Shall we modify the configurations in `v2python` for the other kernels?
#12
xinji1
opened
2 months ago
5
[Issue]: Release Tags
#11
trixirt
closed
3 months ago
2
Fix the performance regression introduced during support of irregular shapes.
#10
xinyazhang
closed
3 months ago
0
Update README.md
#9
groenenboomj
closed
3 months ago
1
Add strides to all input tensors
#8
xinyazhang
closed
3 months ago
0
Support irregular shapes for backward
#7
xinyazhang
closed
3 months ago
0
Irregular head dim rebase
#6
groenenboomj
closed
3 months ago
0
Add support for irregular Q and KV sequence lengths to the bwd kernels
#5
vgokhale
closed
4 months ago
0
Add support for irregular Q and KV seqlens to the bwd kernels
#4
vgokhale
closed
4 months ago
0
Get Ready for ORSB Scan
#3
xinyazhang
closed
4 months ago
0
V2 API
#2
xinyazhang
closed
4 months ago
0
Add a dockerfile for a base
#1
groenenboomj
closed
1 week ago
1