ROCm
/
composable_kernel
Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
https://rocm.docs.amd.com/projects/composable_kernel/en/latest/
Other
251
stars
102
forks
issues
[CK_TILE] Lower fmha occupancy if we enable almost all the features
#1370
poyenc
opened
4 hours ago
0
Add instances for grouped conv fwd 3d with ConvScale for bf8@fp8->fp8
#1369
geyyer
opened
2 days ago
0
Fix the optional ckProfiler grouped_gemm arguments.
#1368
illsilin
closed
2 days ago
0
Bump rocm-docs-core from 1.4.0 to 1.4.1 in /docs/sphinx
#1367
dependabot[bot]
closed
3 days ago
0
adding mha as static lib
#1366
bghimireamd
opened
3 days ago
2
Adding a private docker for ROCm6.2 release candidate.
#1365
illsilin
closed
3 days ago
0
Update CMakeLists.txt
#1364
Ruturaj4
closed
3 days ago
0
Add structural sparsity xdlops
#1363
jakpiase
opened
3 days ago
0
Merging the gfx12 code into public repo.
#1362
illsilin
closed
4 days ago
0
add gemm_bias_add example
#1361
zjing14
opened
5 days ago
0
Universal streamk with atomics
#1360
hsadasiv
opened
6 days ago
1
[CK_TILE] Replace hipDeviceSynchronize() by hipStreamSynchronize(stream) calls
#1359
poyenc
closed
4 days ago
0
fix build issues on gfx1100 alone
#1358
junliume
opened
6 days ago
3
[-Werror,-Wunused-parameter] Build issue for gfx1100 alone
#1357
junliume
opened
6 days ago
0
[CK_TILE] wa prec, remove sgpr offset for inline asm
#1356
carlushuang
opened
1 week ago
0
[CK_TILE] Refactor codegen script, support generating multiple APIs for an example
#1355
poyenc
closed
4 days ago
1
Add ckProfiler support for forward 3D convolutions with OUT element-wise operations.
#1354
andriy-ca
opened
1 week ago
0
Initial fp8 update
#1353
geyyer
closed
6 days ago
0
Fix FA bwd alibi+causal NaN errors
#1352
danyao12
closed
1 week ago
0
Remove gfx900 and gfx906 from default target device to reduce package size
#1351
zjing14
closed
1 week ago
0
Fix in dropout lambda to avoid the compiling issue on some docker/compiler
#1350
qianfengz
closed
1 week ago
0
LDS prefetch pipeline support for FlashAttention
#1349
ramjana
opened
1 week ago
0
Adding Missed Activation Functions for Grouped 2D/3D Convolutions
#1348
ThruptiRajLakshmanaGowda
closed
1 week ago
0
Add read_first_lane function for int64
#1347
bartekxk
closed
1 week ago
1
WA for rocm-6.2+ s constraint for buffer resource
#1346
carlushuang
closed
1 week ago
1
Add functional support of AB group scale
#1345
zjing14
opened
1 week ago
1
Hacking ck_tile fmha Dropout facility
#1344
qianfengz
closed
1 week ago
0
[CK_TILE][FA] using pk f16_f32
#1343
carlushuang
closed
2 weeks ago
0
Fix cmake warnings
#1342
bartekxk
closed
1 week ago
4
Universal gemm splitk using reduce (with multi-d)
#1341
ltqin
opened
2 weeks ago
0
[CK Tile] Generic attention masking support for FMHA fwd and bwd
#1340
cameronshinn
opened
2 weeks ago
0
layernorm2d forward
#1339
rocking5566
closed
1 week ago
0
[CK_TILE] fmha forward split-kv + combine kernels
#1338
poyenc
closed
5 days ago
5
Fix to the using of static_for in amd_buffer_addressing.hpp
#1337
qianfengz
closed
2 weeks ago
0
Fix continuous dim selection in contraction
#1336
bartekxk
closed
1 week ago
0
Switch to universal gemm in grouped gemm tile loop
#1335
jakpiase
closed
1 week ago
0
cmake issue after #1286: ModuleNotFoundError: No module named 'dataclasses'
#1334
junliume
opened
2 weeks ago
4
Add custom type vector support
#1333
geyyer
opened
2 weeks ago
0
Support large tensors in grouped conv fwd
#1332
bartekxk
closed
2 weeks ago
0
Disabled LDS direct load inline asm
#1331
zjing14
closed
2 weeks ago
0
need to add -Wno-nvcc-compat
#1330
yxsamliu
closed
1 week ago
7
Remove External CI PR trigger
#1329
alexxu-amd
closed
3 days ago
0
Fix nhwgc f16 wmma instances
#1328
bartekxk
closed
2 weeks ago
0
Bump rocm-docs-core from 1.3.0 to 1.4.0 in /docs/sphinx
#1327
dependabot[bot]
closed
3 weeks ago
0
Add instances of grouped convolution 3d forward with a ConvScale element-wise op for bf8@bf8->fp8
#1326
andriy-ca
closed
1 week ago
0
Add instances for grouped conv fwd 3d with ConvScale for fp8@bf8->fp8
#1325
geyyer
closed
2 weeks ago
0
Bump rocm-docs-core from 1.2.1 to 1.3.0 in /docs/sphinx
#1324
dependabot[bot]
closed
3 weeks ago
0
Adding Missed Activation Functions for Grouped 2D/3D Convolutions
#1323
ThruptiRajLakshmanaGowda
closed
1 week ago
0
Bump rocm-docs-core from 1.2.0 to 1.2.1 in /docs/sphinx
#1322
dependabot[bot]
closed
3 weeks ago
0
Disable the hipTensor test in CI by default, only run once daily
#1321
illsilin
closed
3 weeks ago
0