ROCm/hipBLASLt
hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionality beyond that of a traditional BLAS library.
https://rocm.docs.amd.com/projects/hipBLASLt/en/latest/index.html
MIT License
40 stars, 56 forks
issues
Use a reversed ordered list instead of a reversed dict
#881
vin-huang
opened
13 minutes ago
0
Tune aquavanjaram942 SGEMM NN to get peak performance for CU80
#880
aferoz21
opened
2 days ago
2
Update all BBS Gridsbase
#879
Jinp800125
closed
2 days ago
0
Bump rocm-docs-core from 1.4.0 to 1.4.1 in /docs/sphinx
#878
dependabot[bot]
opened
2 days ago
0
Update gfx942 TN F8BS and F8HS yamls
#877
AndySu12
closed
2 days ago
0
Add MAC F32 Instruction to Support Navi
#876
wenchuanchen
opened
3 days ago
1
enable LSU + Int8
#875
nakajee
opened
3 days ago
3
Fix stride, ld overflow in ext API
#874
KKyang
opened
3 days ago
0
Add HIP_CHECK_EXEC to hipMemcpyAsync
#873
KKyang
closed
3 days ago
0
Port negative WGM from Tensile
#872
KKyang
opened
3 days ago
1
Tune aquavanjaram942 I8I8I TN memory bound GEMM sizes for CU80
#871
aferoz21
opened
3 days ago
5
Fix stride, ld overflow in ext API
#870
KKyang
closed
3 days ago
0
Reorder global load instructions of dtva custom kernel
#869
briannwu
closed
2 days ago
2
New rotate method
#868
KKyang
opened
5 days ago
0
use alpha, scaleA and scaleB to do case2 use case
#867
TonyYHsieh
opened
5 days ago
0
gfx942 BBS/HHS TN GridBased tuning
#866
nakajee
closed
1 day ago
3
Add allclose function to bench
#865
ssuyuanchang
closed
4 days ago
0
fix incorrect calcLdsNumBytes calculation
#864
aazz44ss
closed
4 days ago
6
remove scaleB rcp hack
#863
jichangjichang
opened
5 days ago
0
Fix of dedicated GridBased kernel index
#862
AndySu12
closed
6 days ago
0
Fix m_rotatingLargestUnitSize formula to match allocNewGPUInputs
#861
KKyang
closed
4 days ago
0
Add API hipblasLtIsDeviceSupported() to query if current device is supported
#860
jichangjichang
opened
1 week ago
5
gfx942 80cu BBS/HHS TN GridBased additional tuning
#859
nakajee
closed
5 days ago
1
gfx942 80cu Add custom kernel with triple buffer algorithm
#858
msujon-AMD
closed
4 days ago
0
Test: Set TENSILE_USE_LLVM OFF in tensilelite Client Executable
#857
samjwu
opened
1 week ago
0
[Issue]: Could not load /opt/rocm-6.1.3/lib/rocblas/library/TensileLibrary.dat
#856
unclemusclez
opened
1 week ago
0
Update sixe Msize of small and large grid
#855
Jinp800125
closed
1 week ago
0
[Issue]: BrokenPipeError: [Errno 32] Broken pipe
#854
unclemusclez
closed
1 week ago
1
Predicate arithmetic
#853
adityalj
opened
1 week ago
1
New mix example for matmul which supports the amax function
#852
TonyYHsieh
closed
1 week ago
0
Fix errors when calculating ldsPad and ldsNumBytes of sparse metadata.
#851
vin-huang
closed
1 week ago
1
Fix the condition of adding the missing sizes to nonRotatingSize
#850
KKyang
closed
1 week ago
0
update documentation for matmulIsAlgoSupported()
#849
jichangjichang
closed
1 week ago
0
Build with address-sanitizer option seems to fail with the combination of cc compiler and shared-libasan
#848
jdgh000
opened
1 week ago
0
fix divide-by-zero error
#847
aazz44ss
closed
1 week ago
0
Set inputs in GemmInputs to pointers to const
#846
KKyang
closed
1 week ago
0
Add DTVA custom kernel
#845
briannwu
closed
1 week ago
0
Bump urllib3 from 2.2.1 to 2.2.2 in /docs/sphinx
#844
dependabot[bot]
closed
1 week ago
0
Update optimal value
#843
aazz44ss
closed
1 week ago
0
gfx942 80cu BBS/HHS NN/NT/TN Equality tuning
#842
nakajee
closed
1 week ago
1
[Tensilelite] Enable I8/B/S for Sparse MM
#841
vin-huang
closed
1 week ago
1
Fix missing typeid symbol error when using clang++
#840
KKyang
opened
1 week ago
1
Don't Merge: Stable case2 inverse workgroup processing
#839
TonyYHsieh
opened
1 week ago
0
gtest fix for int8_matmul
#838
AndySu12
closed
1 week ago
2
Fix removing synchronizer from rotatingSize
#837
KKyang
closed
1 week ago
0
Fix: workspace size of preference descriptor is set as 0 unexpectedly
#836
jichangjichang
closed
5 days ago
0
gfx942 BBS/HHS TN GridBased tuning
#835
nakajee
closed
5 days ago
3
gfx942 80cu BBS/HHS TN GridBased more tuning
#834
nakajee
closed
1 week ago
1
Fix missing typeid symbol error when using clang++
#833
KKyang
closed
1 week ago
0
Inputs in `GemmInputs` should be pointers to `const`
#832
atamazov
closed
1 week ago
2