ROCm / MIOpen

AMD's Machine Intelligence Library
https://rocm.docs.amd.com/projects/MIOpen/en/latest/

Initial porting of Op3dTensorGeneric kernel from OCL to HIP #3324

Closed: novakovicdj closed this 1 month ago

novakovicdj commented 1 month ago

This PR is an initial rewrite of the Op3dTensorGeneric kernel from OCL (OpenCL) to HIP.

The graphs below compare the performance of the OCL and HIP versions across ~52,000 test cases.

[Performance graphs not reproduced here. Panel titles: 1c1, 1ch, 11h, 111, n1h, n11, nc1, nch]