ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License
208 stars, 142 forks

Fix comments on scalarStaticDivideAndRemainder #1956

Closed: AlexBrownAMD closed this 1 month ago

AlexBrownAMD commented 1 month ago

Related to the previous change in PR #1928. scalarStaticDivideAndRemainder requires 2 temp registers; the previous change incorrectly updated the doc comments. This PR reverts those comments and adds an assert to ensure the temp registers are passed when needed.
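For illustration, here is a minimal Python sketch of the kind of guard described above. It is not Tensile's code: the function name `static_divide_and_remainder`, its parameters, and the `tmp` list standing in for the two temp registers are all hypothetical, and the sketch works on plain integers rather than emitting assembly. It also assumes that only non-power-of-two divisors need the temps (power-of-two divisors reduce to a shift and a mask), and it uses a generic magic-number multiply that is only valid while `dividend * divisor < 2**33`.

```python
def static_divide_and_remainder(dividend, divisor, tmp=None, do_remainder=True):
    """Hypothetical integer model of a static (compile-time divisor) divide.

    `tmp` stands in for two scratch registers; it is only required on the
    non-power-of-two path (an assumption of this sketch).
    """
    assert dividend >= 0 and divisor > 0, "sketch handles non-negative inputs only"

    # Power-of-two divisor: a shift and a mask, no scratch space needed.
    if divisor & (divisor - 1) == 0:
        shift = divisor.bit_length() - 1
        quotient = dividend >> shift
        remainder = dividend & (divisor - 1) if do_remainder else None
        return quotient, remainder

    # General divisor: the guard this sketch is about.
    assert tmp is not None and len(tmp) >= 2, \
        "non-power-of-two divisor requires 2 temp registers"

    shift = 33  # assumed shift amount for the magic-number multiply
    assert dividend * divisor < (1 << shift), "outside the range this sketch supports"
    magic = (1 << shift) // divisor + 1

    tmp[0] = dividend * magic          # wide product held in the first temp
    quotient = tmp[0] >> shift
    remainder = None
    if do_remainder:
        tmp[1] = quotient * divisor    # second temp holds quotient * divisor
        remainder = dividend - tmp[1]
    return quotient, remainder
```

Calling `static_divide_and_remainder(100, 8)` works without temps, while `static_divide_and_remainder(100, 7)` trips the assert unless a two-element `tmp` list is supplied. Failing early this way surfaces a missing temp at the call site instead of producing a silently wrong result later.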