ROCm / MIOpen

AMD's Machine Intelligence Library
https://rocm.docs.amd.com/projects/MIOpen/en/latest/

[Fix][TransformTensor] Ignore output buffer when BETA=0 #3184

Closed atamazov closed 3 months ago

atamazov commented 4 months ago

The primitive produces invalid results when BETA=0 and the output buffer contains junk (NaNs). This PR fixes the issue.
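For illustration only, here is a minimal host-side sketch of the failure mode. It assumes the primitive computes the usual blend `dst = alpha * src + beta * dst`. The function names `transform_naive` and `transform_guarded` are hypothetical and are not taken from the MIOpen sources or from this PR's diff. Under IEEE 754 arithmetic `0 * NaN` is NaN, so junk in the output buffer survives unless the read is skipped when BETA=0.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical reference code, not the MIOpen kernel.
// Both functions assume src.size() >= dst.size().

// Naive blend: always reads dst, so a NaN in dst propagates even when beta == 0.
inline void transform_naive(float alpha, const std::vector<float>& src,
                            float beta, std::vector<float>& dst)
{
    for(std::size_t i = 0; i < dst.size(); ++i)
        dst[i] = alpha * src[i] + beta * dst[i];
}

// Guarded blend: when beta == 0 the old contents of dst are never read.
inline void transform_guarded(float alpha, const std::vector<float>& src,
                              float beta, std::vector<float>& dst)
{
    for(std::size_t i = 0; i < dst.size(); ++i)
        dst[i] = (beta == 0.0f) ? alpha * src[i]
                                : alpha * src[i] + beta * dst[i];
}
```

Calling both with `alpha = 2`, `beta = 0` and a NaN-filled `dst` leaves NaNs in the naive result, while the guarded version writes `2 * src[i]`. Note that aggressive floating-point flags such as `-ffast-math` may change how the naive version behaves.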

By-products:

Related issue:


[Attribution] @junliume @JehandadKhan

atamazov commented 3 months ago

@CAHEK7 Maybe later (I do not have perf tests for this primitive on hand). This PR is about correctness, and I am quite sure that it doesn't lead to perf degradations (I hope you are sure too). Also, I suspect that most of the time on the GPU is spent on address/offset computations, so the expected perf gain is pretty small.