ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

Refactor kernel writing in TensileCreateLibrary #1960

Closed bstefanuk closed 1 month ago

bstefanuk commented 1 month ago

Objectives

This PR refactors the writeKernels function within TensileCreateLibrary into multiple functions that can be tested and profiled individually. The refactor makes performance easier to assess and helps verify that future changes maintain or improve current functionality without regressions.

Outcomes

  1. Code blocks in writeKernels are refactored into six new functions.
  2. Unit tests are implemented for each new function.
  3. Inline documentation is added for writeKernels and each new function.
  4. Type hints are added for the parameters and return type of each function.
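
As a rough illustration of the pattern these outcomes describe, here is a minimal sketch. It is not the actual Tensile code: the function names, parameters, and the ".s" extension are all hypothetical and chosen only to show a monolithic routine split into small, typed, documented steps behind a thin orchestrator.

```python
from typing import Dict, List, Tuple


def dedupe_kernel_names(names: List[str]) -> List[str]:
    """Return names with duplicates removed, preserving first-seen order.

    Hypothetical helper, not part of Tensile.
    """
    seen = set()
    unique = []
    for name in names:
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return unique


def build_source_filenames(names: List[str], extension: str = ".s") -> Dict[str, str]:
    """Map each kernel name to the file name its source would be written to.

    Hypothetical helper; the ".s" default is an assumption for illustration.
    """
    return {name: name + extension for name in names}


def write_kernels(names: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Thin orchestrator: each step is a small function that can be
    unit-tested and profiled on its own."""
    unique = dedupe_kernel_names(names)
    return unique, build_source_filenames(unique)
```

Because each step is a pure function with typed inputs and outputs, a unit test can call it directly with a small list of names, and a profiler such as cProfile reports time per step instead of one large entry for the whole routine.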

Testing

bstefanuk commented 1 month ago

PR closed because it was broken out into smaller components.