issues
ROCm/triton
Development repository for the Triton language and compiler
MIT License
86 stars, 27 forks
issues
Cherry pick upstream new passes
#390
zhanglx13
closed
9 months ago
1
Fix invalid conversion from f16 to f16
#389
zoranjovanovic-ns
opened
10 months ago
3
remove unnecessary arch names
#388
scxiao
closed
10 months ago
2
Move fa-transV to the new perf-kernels dir
#387
zhanglx13
closed
10 months ago
3
use hw for fp8 type conversion
#386
scxiao
closed
9 months ago
0
[FA] Add FA tutorial with transV
#385
zhanglx13
closed
10 months ago
0
[Test] Disable mma layout for amd hardware
#384
binarman
closed
10 months ago
0
[WIP] Use new PM to execute amdgpu optimization passes
#383
oplavsic
opened
10 months ago
1
Ifu231005
#382
jayfurmanek
closed
10 months ago
0
Apply kernarg optimization to Triton
#381
wangye805
closed
6 months ago
1
E2e variable update
#380
Cemberk
closed
10 months ago
0
flexible data types in flash attention
#379
scxiao
closed
10 months ago
0
[RemoveLayoutConversions] Remove PatternSharedInfo structure
#378
binarman
closed
10 months ago
0
[RemoveLayoutConversions] Fix reduce failed infer type error
#377
binarman
closed
10 months ago
3
Tweak matmul tutorial on MI2xx GPU
#376
zhanglx13
closed
10 months ago
2
flexible data types in flash attention
#375
scxiao
closed
10 months ago
0
[FRONTEND] Add input dtypes to autotuning key (#2534)
#374
zhanglx13
closed
10 months ago
2
Enable all data types gemm tutorial
#373
oplavsic
closed
8 months ago
1
[WIP] [RemoveLayoutConversions] Fix reduce failed infer type error
#372
binarman
opened
10 months ago
0
Is there a plan to support the buffer_load instruction?
#371
lzxdn
closed
10 months ago
1
set correct arch info for unit test
#370
scxiao
closed
10 months ago
0
Always promote int8 to int32 in commonShflSync
#369
jayfurmanek
closed
10 months ago
1
use different int8 mfma instructions on different GPUs.
#368
scxiao
closed
10 months ago
1
[PYTORCH] tl.reduce error (warp_size=2): couldn't allocate output register for constraint 'v'
#367
jataylo
closed
10 months ago
4
[GEMM] [Tuning] Parameterize mfma type
#366
zhanglx13
closed
11 months ago
0
Low performance of fmha with head_dim=128
#365
minminsun
closed
11 months ago
11
argmax tl.reduce gives `error: 'tt.reduce' op failed to infer returned types`
#364
jataylo
closed
10 months ago
9
Third Party Backend Merge
#363
micmelesse
closed
10 months ago
0
[GEMM] [Tuning] Add `waves_per_eu` to gemm tuning
#362
zhanglx13
closed
11 months ago
0
triton flash atten module generates NAN results
#361
seungrokj
closed
11 months ago
3
Overwrite initial ROCM_LIBRARIES setting
#360
jataylo
closed
11 months ago
0
bfloat16 casting issue in triton_mm - error: LLVM Translation failed for operation: builtin.unrealized_conversion_cast
#359
jataylo
closed
11 months ago
2
add gfx942 for matrix core support
#358
scxiao
closed
11 months ago
0
fp8 type support
#357
scxiao
closed
10 months ago
3
Improve FA fwd kernel with causal=True
#356
zhanglx13
closed
11 months ago
0
[MFMA] FP8 and BF8 support
#355
binarman
closed
10 months ago
5
PyTorch triton branch synchronisation
#354
jataylo
closed
11 months ago
0
Casting loads from bool tensors to tl.int1 causes hangs on ROCm
#353
jataylo
closed
3 months ago
2
[MFMA] Switch between MFMA types
#352
binarman
closed
11 months ago
1
Add licenses to AMD related files
#351
binarman
closed
11 months ago
0
[GEMM] Tuning script v2
#350
zhanglx13
closed
11 months ago
0
Remove redundant fp32->fp16 conversion in FA
#349
oplavsic
closed
11 months ago
0
fix ifu230908 gemm perf regression
#348
scxiao
closed
11 months ago
0
Ifu230908 2
#347
jayfurmanek
closed
11 months ago
0
Add OptimizeEpilogue pass.
#346
oplavsic
closed
10 months ago
1
[Stream] Fixed bug in stream-pipeline for FA
#345
sjw36
closed
11 months ago
0
[GEMM] [Tuning] Merge split_k and non split_k kernels in GEMM tuning script
#344
zhanglx13
closed
11 months ago
1
add tutorial group gemm example
#343
scxiao
closed
11 months ago
4
gemm tuning script to support gemm fp8/f16 mixed input
#342
scxiao
opened
11 months ago
3
[Stream] Fixed bug in stream-pipeliner exposed by FA
#341
sjw36
closed
11 months ago
1