This pull request adds initial support for Metal GPUs. It was tested on an M1 MacBook, and the tests pass:
❯ julia runtests.jl
From worker 3: A = Float32[2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2.3, 2.3]
Test Summary: | Pass  Total  Time
CPU           |    1      1  8.8s
┌ Warning: No CUDA devices available, skipping tests
└ @ Main ~/tmp/DaggerGPU.jl/test/runtests.jl:46
Test Summary: |Time
CUDA          | None  0.0s
┌ Warning: No ROCm devices available, skipping tests
└ @ Main ~/tmp/DaggerGPU.jl/test/runtests.jl:70
Test Summary: |Time
ROCm          | None  0.0s
Test Summary: | Pass  Broken  Total  Time
Metal         |    1       1      2  4.2s