ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License

Default value for long jump positive and negative #1919

Closed AlexBrownAMD closed 3 months ago

AlexBrownAMD commented 3 months ago

I recently changed the long jump function to accept an optional parameter for reusing temp registers, which reduces peak SGPR usage. The parameter defaults to None, in which case the function allocates temp registers internally. That change did not give the same parameter a default of None in the positive/negative versions of the function, which can cause build errors for some kernel combinations. This change adds the missing default to fix the build.
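As a rough illustration of the pattern described above, here is a minimal Python sketch. Every name in it (`SgprPool`, `long_branch`, `long_branch_positive`, `long_branch_negative`, `tmp_sgpr`) is hypothetical and chosen for illustration. The three-register temp count and the emitted strings are also assumptions, not Tensile's actual code. It shows why the wrapper variants need the same `None` default as the base function: without it, any caller that omits the argument fails with a `TypeError`.

```python
# Hypothetical sketch; names and register counts are assumptions, not Tensile's API.

class SgprPool:
    """Toy scalar-register allocator that tracks peak usage."""

    def __init__(self):
        self.next_free = 0
        self.peak = 0

    def checkout(self, count):
        # Hand out `count` consecutive registers and record the high-water mark.
        start = self.next_free
        self.next_free += count
        self.peak = max(self.peak, self.next_free)
        return start

    def checkin(self, start):
        # Stack-style release: everything from `start` upward becomes free again.
        self.next_free = start


def long_branch(pool, label, tmp_sgpr=None):
    """Emit a long jump, reusing caller-provided temps when given."""
    owns_tmp = tmp_sgpr is None
    if owns_tmp:
        # No temps supplied: allocate them internally (assumed to need 3).
        tmp_sgpr = pool.checkout(3)
    code = f"long_branch {label} using s[{tmp_sgpr}:{tmp_sgpr + 2}]"
    if owns_tmp:
        pool.checkin(tmp_sgpr)
    return code


# The wrappers must also default tmp_sgpr to None; otherwise callers that
# omit the argument raise TypeError when the kernel is generated.
def long_branch_positive(pool, label, tmp_sgpr=None):
    return long_branch(pool, label, tmp_sgpr) + " (forward)"


def long_branch_negative(pool, label, tmp_sgpr=None):
    return long_branch(pool, label, tmp_sgpr) + " (backward)"
```

In this sketch, calling `long_branch_positive(pool, "label_A")` allocates and releases temps internally, while passing `tmp_sgpr=10` reuses registers the caller already holds and leaves the pool's peak unchanged.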