under-Peter / OMEinsum.jl

One More Einsum for Julia! With runtime order-specification and high-level adjoints for AD
https://under-peter.github.io/OMEinsum.jl/dev/
MIT License

Add support for AMDGPU #165

Closed · radudiaconu0 closed this 5 months ago

radudiaconu0 commented 6 months ago

Add support for AMDGPU

radudiaconu0 commented 6 months ago

@GiggleLiu can you take a look at the tests? The majority of the failing tests fail inside this function:

```julia
function _batched_gemm!(C1::Char, C2::Char, alpha, A::ROCArrayTypes{T1,3}, B::ROCArrayTypes{T2,3}, beta, C::ROCArrayTypes{T3,3}) where {T1<:ROCBlasFloat,T2<:ROCBlasFloat,T3<:ROCBlasFloat}
    AMDGPU.rocBLAS.gemm_strided_batched!(C1, C2, alpha, T1 == T3 ? A : T3.(A), T2 == T3 ? B : T3.(B), beta, C)
end
```

Somehow `alpha` becomes a `Bool`, which is not allowed in `AMDGPU.rocBLAS.gemm_strided_batched!`.
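For readers following along: Julia's generic linear-algebra code passes the `Bool` literals `true`/`false` as its strongly-typed default one/zero scaling factors, which is how a `Bool` alpha can reach the rocBLAS wrapper. A minimal, GPU-free sketch of the cast that resolves it (variable names here are illustrative):

```julia
# alpha/beta arrive as Bool from generic dispatch (Julia's typed one/zero).
alpha, beta = true, false
T3 = Float32                 # element type of the output array C

# rocBLAS only accepts floating-point scalars, so cast to the output eltype:
alpha_f = T3(alpha)          # 1.0f0
beta_f  = T3(beta)           # 0.0f0

@assert alpha_f === 1.0f0
@assert beta_f === 0.0f0
```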

radudiaconu0 commented 6 months ago

https://github.com/JuliaGPU/AMDGPU.jl/pull/616 needs to be approved; after that, the only failing test is the one with gradients.

codecov-commenter commented 6 months ago

Codecov Report

Attention: Patch coverage is 0%, with 83 lines in your changes missing coverage. Please review.

Project coverage is 82.54%. Comparing base (8c07022) to head (1634b1b).

:exclamation: Current head 1634b1b differs from pull request most recent head e74d7e9. Consider uploading reports for the commit e74d7e9 to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| ext/AMDGPUExt.jl | 0.00% | 83 Missing :warning: |


Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #165      +/-   ##
==========================================
- Coverage   88.88%   82.54%    -6.35%
==========================================
  Files          14       15       +1
  Lines        1080     1163      +83
==========================================
  Hits          960      960
  Misses        120      203      +83
```

:umbrella: View full report in Codecov by Sentry.

GiggleLiu commented 6 months ago

@radudiaconu0 Thank you very much for the amazing work!

In Julia, `alpha` and `beta` default to `true` and `false`. You need to cast them manually to make them consistent with the output type, i.e. you can use the following code instead:

```julia
AMDGPU.rocBLAS.gemm_strided_batched!(C1, C2, T3(alpha), T1 == T3 ? A : T3.(A), T2 == T3 ? B : T3.(B), T3(beta), C)
```

I was trying to find an AMD cloud machine to inspect the problem more closely, but AWS rejected my request to increase the quota. :(

radudiaconu0 commented 6 months ago

OK, I pushed the code @GiggleLiu

radudiaconu0 commented 6 months ago

@GiggleLiu And yeah, I found what the problem was yesterday; I forgot to mention it. To be honest, I had to go to the CUDA.jl source code to find it. There, `alpha` and `beta` are typed as `Number` (meaning they can be `Bool`). So what remains now is for that pull request to be approved and for the failing gradient test to be fixed.
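A small, GPU-free sketch of the difference described above (the helper names `gemm_scalar` and `strict_scalar` are hypothetical, for illustration only): a CUDA.jl-style wrapper types the scalars as `Number`, so a `Bool` is accepted and promoted internally, while a wrapper restricted to BLAS float types rejects it at dispatch time.

```julia
# Relaxed, CUDA.jl-style signature: alpha::Number accepts Bool and
# promotes it to the requested element type.
gemm_scalar(alpha::Number, ::Type{T}) where {T} = convert(T, alpha)

# Strict signature restricted to BLAS floats: Bool has no matching method.
strict_scalar(alpha::Union{Float32,Float64}, ::Type{T}) where {T} = convert(T, alpha)

@assert gemm_scalar(true, Float32) === 1.0f0                    # Bool promoted
@assert !hasmethod(strict_scalar, Tuple{Bool,Type{Float32}})    # Bool rejected
```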

amontoison commented 6 months ago

I merged the PR in AMDGPU.jl.

GiggleLiu commented 5 months ago

Looks great! I just merged the PR, thanks again.

radudiaconu0 commented 5 months ago

No problem