ROCm/Tensile
Stretching GPU performance for GEMMs and tensor contractions.
MIT License
218 stars
147 forks
issues
Replace YAML operations with C libyaml backend
#1937
bstefanuk
closed
4 months ago
1
Prediction model for optimal number of stream-k tiles to run
#1934
AlexBrownAMD
closed
4 months ago
2
Bump rocm-docs-core from 1.2.0 to 1.4.0 in /docs/sphinx
#1933
dependabot[bot]
closed
4 months ago
1
Add Python unit test coverage report
#1932
bstefanuk
closed
4 months ago
0
Refactor logic file discovery
#1931
bstefanuk
closed
4 months ago
0
Update verifyManifest function
#1930
bstefanuk
closed
4 months ago
0
Analytical grid size prediction model for Stream-K
#1929
AlexBrownAMD
closed
4 months ago
6
XCC-based workgroup remapping for stream-k kernels
#1928
AlexBrownAMD
closed
4 months ago
1
adds tPrint and reconciles printing options
#1927
ellosel
closed
4 months ago
2
Add architecture management functions to TensileCreateLibrary
#1926
bstefanuk
closed
4 months ago
6
Update RTD configs
#1925
samjwu
closed
4 months ago
4
Adds TensileCreateLibrary cli reference docs
#1924
ellosel
closed
4 months ago
0
Add profiling CI job
#1923
bstefanuk
closed
4 months ago
2
[Feature]: support for gfx1103
#1922
NeoChen1024
opened
5 months ago
4
Documentation prototype and ReadtheDocs CI
#1921
bstefanuk
closed
5 months ago
0
Move from hipcc to amdclang
#1920
ellosel
closed
4 months ago
0
Default value for long jump positive and negative
#1919
AlexBrownAMD
closed
5 months ago
0
Two-tile algorithm with SK after DP
#1918
AlexBrownAMD
closed
5 months ago
2
CMake cleanup to prevent redundant work in client builds
#1917
ellosel
closed
5 months ago
5
Cherry-pick RDNA1 fix into 6.1 release
#1916
GZGavinZhao
closed
6 months ago
5
Stream-k debug settings
#1915
AlexBrownAMD
closed
6 months ago
0
Update CHANGELOG.md for ROCm 6.2
#1914
babakpst
closed
6 months ago
0
ROCm 6.2 merge staging into develop
#1913
babakpst
closed
6 months ago
0
ROCm 6.2 merge staging into master
#1912
babakpst
closed
6 months ago
0
ROCm 6.2 merge Staging into develop
#1911
babakpst
closed
6 months ago
0
ROCm 6.2 merge staging into master
#1910
babakpst
closed
6 months ago
1
ROCm 6.2 merge staging into master
#1909
babakpst
closed
6 months ago
0
Update changelog for 4.41.0
#1908
babakpst
closed
6 months ago
0
[Feature]: Support for gfx1036
#1907
bitozoid
opened
6 months ago
0
Atomic 2-tile stream-k and tuning parameter clean-up
#1906
AlexBrownAMD
closed
6 months ago
0
Hotfix: Fix MasterSolutionLibrary indexing for multiple architecture build (#1888)
#1905
yenong-amd
closed
6 months ago
2
adding iteration count and rotating flag to the bench config
#1904
babakpst
closed
6 months ago
3
Env variable to test fixed grid size with SK kernels
#1903
AlexBrownAMD
closed
7 months ago
2
Hotfix: Fix WorkspaceCheck implementation when used in rocBLAS
#1902
nakajee
closed
7 months ago
0
Revert "more init code optimizations (#1890)"
#1901
nakajee
closed
7 months ago
2
New dynamic mode
#1900
AlexBrownAMD
closed
7 months ago
0
Fix WorkspaceCheck implementation when used in rocBLAS
#1899
AlexBrownAMD
closed
7 months ago
6
Ignore asm cap check for kernel arg preload for rocm6.0
#1898
nakajee
closed
7 months ago
2
Use fallback libraries for archs without optimized logic (v2)
#1897
GZGavinZhao
closed
7 months ago
6
Fix for workspaceCheck not working issue with GSU
#1896
nakajee
closed
7 months ago
2
Skip stream-k init kernel when possible
#1895
AlexBrownAMD
closed
7 months ago
0
Reject condition change for PreloadKernelArguments
#1894
nakajee
closed
8 months ago
1
Ignore asm cap check for kernel arg preload for old rocm
#1893
nakajee
closed
8 months ago
2
Add reject conditions for SourceKernel + PrefetchGlobalRead/LoopDoWhile
#1892
nakajee
closed
8 months ago
1
Use hipMemcpyAsync for validation
#1891
nakajee
closed
7 months ago
4
more init code optimizations
#1890
nakajee
closed
7 months ago
0
Remove OCL tests
#1889
AlexBrownAMD
closed
8 months ago
1
Fix MasterSolutionLibrary indexing for multiple architecture build
#1888
yenong-amd
closed
8 months ago
7
Disable HostLibraryTests
#1887
AlexBrownAMD
closed
8 months ago
0
support nt flag for global load and store for gfx94x
#1886
nakajee
closed
8 months ago
0