SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License
285 stars 29 forks

Add AMDGPU support #96

Closed jpsamaroo closed 1 year ago

jpsamaroo commented 3 years ago

With ROCKernels being merged into KernelAbstractions, we should soon be able to integrate AMDGPU support into this package!

@ChrisRackauckas if this is desired, how would you like me to do this? Currently DiffEqGPU depends on CUDA; I could have it depend on AMDGPU.jl as well, and keep the dependencies up to date so we don't run into version conflicts and unnecessary downgrades. Alternatively, I could gate AMDGPU (and optionally CUDA) behind Requires, although that could hurt compile times after precompilation.

ChrisRackauckas commented 3 years ago

Yeah, let's add it. Could EnsembleGPUArray automatically detect the user's GPU and choose CUDA vs AMD for them? It'll be interesting to see the performance here since the application is very different from deep learning. I think AMD could have a chance here.

jpsamaroo commented 3 years ago

> Could EnsembleGPUArray automatically detect the user's GPU and choose CUDA vs AMD for them?

Yes, I think so. I filed https://github.com/JuliaGPU/KernelAbstractions.jl/issues/229 because we have a lot of manual iscuda = ...; if iscuda branches that could potentially be hidden by KA (KernelAbstractions). If we don't end up doing this, I'll add an equivalent helper function here.
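For illustration, here is a minimal sketch of what such a helper might look like. Everything in it is an assumption: `BackendCandidate` and `select_backend` are hypothetical names, not part of DiffEqGPU or KernelAbstractions, and the availability checks are stand-ins so the sketch runs without any GPU package installed. In real code the checks might be something like `CUDA.functional` and `AMDGPU.functional`.

```julia
# Hypothetical sketch: none of these names come from DiffEqGPU or KernelAbstractions.
# Each candidate pairs a label with a zero-argument check that returns `true`
# when that backend is usable on the current machine.
struct BackendCandidate
    name::Symbol
    isfunctional::Function
end

# Return the name of the first candidate whose check passes, or `fallback`
# if none does. A check that throws (for example, because a driver is missing)
# is treated as "not available" rather than propagating the error.
function select_backend(candidates::Vector{BackendCandidate}; fallback::Symbol = :cpu)
    for c in candidates
        ok = try
            c.isfunctional() === true
        catch
            false
        end
        ok && return c.name
    end
    return fallback
end

# Stand-in checks (assumption): pretend CUDA is unavailable and AMDGPU is available.
candidates = [
    BackendCandidate(:cuda, () -> false),
    BackendCandidate(:amdgpu, () -> true),
]

backend = select_backend(candidates)
println(backend)
```

The point of the design is that the branching lives in one place: callers ask for a backend once instead of repeating the iscuda checks at every call site.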

> It'll be interesting to see the performance here since the application is very different from deep learning. I think AMD could have a chance here.

I'm willing to bet against AMDGPU performance currently; CUDA.jl has had a long time to optimize kernel launch and execution performance, and AMDGPU.jl hasn't exposed any occupancy API or compiler knobs to users (although I'm certainly interested in figuring this out once we find it's needed). Still, I do think we can pretty quickly get close to CUDA's performance once we start looking at benchmark results.

utkarsh530 commented 1 year ago

Fixed via #241.