MagneticResonanceImaging / MRIReco.jl

Julia Package for MRI Reconstruction
https://magneticresonanceimaging.github.io/MRIReco.jl/latest/

Add GPU support to MRIReco.jl #182

Open nHackel opened 3 months ago

nHackel commented 3 months ago

This PR adds GPU support to MRIReco.jl and the relevant subpackages. It is based on the following PRs: LinearOperators.jl#321, RegularizedLeastSquares.jl#81 and LinearOperatorCollection.jl#9

As with the other PRs, I will try to keep these changes GPU-agnostic and document my high-level idea and the relevant changes in the PR description.

To work on the GPU, we need all LinearOperators to have a GPU storage type. I've started adding the relevant S kwarg to the operators in the collection package and in MRIOperators.jl. Additionally, our solver inputs and dense arrays also need to live on the GPU where possible. To allow for this, I have added a new :arrayType (name might change) parameter to the parameter Dict, from which one can then construct the correct arrays and vectors.
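
A rough sketch of how this might look from the user side (the exact parameter usage is illustrative, not the final PR API; it assumes a CUDA.jl backend, but any GPU array type should work the same way):

```julia
using CUDA  # any GPU backend; plain Array keeps everything on the CPU

params = Dict{Symbol, Any}()
params[:arrayType] = CuArray   # new parameter, name might still change

# the storage type handed to the operators via the S kwarg
arrayType = params[:arrayType]
S = arrayType{ComplexF32, 1}
# e.g. op = FFTOp(ComplexF32; shape = (256, 256), S = S)

# dense data such as the k-space samples is moved to the same device
kdata = rand(ComplexF32, 256 * 256)
kdataGPU = arrayType(kdata)
```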

I'll first try to keep all changes close to the Reco and operator packages. That in turn means that we need to convert the acquisition data every time during reconstruction. An alternative could be to specify the backing array type during data loading and derive it from there.

nHackel commented 3 months ago

[image: phantom / CPU reconstruction / GPU reconstruction]

And here is the first GPU reconstruction done with the changes in commit 8ecdc66. The left column shows the phantom, the middle column the CPU result and the right column the GPU result.

aTrotier commented 3 months ago

Do you know a way to test the GPU reconstruction during CI?

I think @cncastillo is also trying to test KomaMRI with a different CI setup for the GPU part

nHackel commented 3 months ago

There is JLArrays.jl, which provides a reference implementation of GPU arrays that runs on the CPU. If the code remains GPU-agnostic, we can have one test suite running on "normal" arrays and then again on JLArrays. This is the approach we are currently aiming for in RegularizedLeastSquares.
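
Sketched out, the double test run could look something like this (a minimal illustration, not the actual RegularizedLeastSquares test code):

```julia
using Test
using JLArrays   # provides JLArray, the CPU-hosted reference GPU array

for arrayType in (Array, JLArray)
    @testset "Reconstruction with $arrayType" begin
        # stand-in for a real reconstruction test; the GPU-agnostic code path
        # is exercised a second time through the JLArray storage type
        b = arrayType(rand(ComplexF32, 16))
        @test length(b) == 16
    end
end
```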

One issue with that approach is that we need to make all the operators work with JLArrays. In particular, we just call plan_fft and plan_nfft with a given array type, and I don't think either works for JLArrays. But I can implement a workaround for that in LinearOperatorCollection. I had to do a similar workaround for the WaveletOp, since Wavelets don't work on the GPU.
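
One possible shape for such a workaround (a sketch under the assumption that falling back to a CPU plan is acceptable for the JLArray test case; this is not the LinearOperatorCollection code):

```julia
using FFTW, JLArrays

# Plan and execute the FFT on a host copy, then move the result back into
# the original (possibly GPU / JLArray) storage type.
function fft_with_cpu_fallback(x::AbstractArray)
    xCPU = Array(x)
    p = plan_fft(xCPU)
    yCPU = p * xCPU
    y = similar(x, eltype(yCPU))
    copyto!(y, yCPU)
    return y
end

y = fft_with_cpu_fallback(JLArray(rand(ComplexF32, 8)))
```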

I also saw this in the Julia #gpu Slack: https://github.com/JuliaGPU/buildkite. As far as I understood it, registered Julia packages that belong to an organization can apply to run on their CI.

cncastillo commented 2 months ago

> I also saw this in the Julia #gpu Slack: JuliaGPU/buildkite. As far as I understood it, registered Julia packages that belong to an organization can apply to run on their CI.

Hi! Indeed, to use JuliaGPU BuildKite CI, your package should be a part of a Julia organization. We recently moved KomaMRI to JuliaHealth for this, and besides some broken links and a few minor issues (https://github.com/JuliaHealth/KomaMRI.jl/issues/335), it was straightforward.

> I think @cncastillo is also trying to test KomaMRI with a different CI setup for the GPU part

We are planning on testing Koma on multiple GPUs anyway, so we will do tests like the ones you need (see https://github.com/JuliaHealth/KomaMRI.jl/blob/master/test/runtests.jl). As we use MRIReco, we are happy to do the tests for you, but we haven't set up the Buildkite CI config yet :smile:. In the meantime, you can create a PR in Koma pointing to the correct version of MRIReco (this PR), add CUDA, AMDGPU, Metal, etc. to the test environment, and modify Buildkite's config (pipeline.yml) to run the tests you want. I can make you a collaborator if you need it.
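
For reference, pointing Koma's test environment at this PR could look roughly like the following (the branch name and the exact backend packages are placeholders, not the real setup):

```julia
using Pkg
Pkg.activate("test")
# track this PR's branch of MRIReco (branch name is a placeholder)
Pkg.add(url = "https://github.com/MagneticResonanceImaging/MRIReco.jl", rev = "gpu-support")
# GPU backends to be exercised by the Buildkite pipeline
Pkg.add(["CUDA", "AMDGPU", "Metal"])
```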

Having said that, Koma's current integration with MRIReco uses RawAcquisitionData as an intermediate. This custom type was inspired by the MRD format, so the data is forced to a specific type (which makes sense for writing to MRD). It would be nice if GPU arrays could flow freely from the simulations to the recon; I think the main culprits are the traj and data fields of the Profile type; maybe those could be relaxed:

mutable struct Profile
  head::AcquisitionHeader
  traj::Array{Float32,2}            #<- allow AbstractArray?, but then force to F32 when writing
  data::Array{Complex{Float32},2}   #<- allow AbstractArray?, but then force to CF32 when writing
end

Something similar applies to AcquisitionData.
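
A possible relaxation, just to illustrate the idea (a sketch, not a proposal for the exact MRIBase code; the struct name is made up to avoid clashing with the real Profile):

```julia
using MRIBase: AcquisitionHeader  # assuming the existing header type

# Parametrize the array types so GPU arrays (or JLArrays) can be stored directly;
# only force plain Float32 / ComplexF32 host arrays when writing to MRD.
mutable struct RelaxedProfile{T<:AbstractArray{Float32,2}, D<:AbstractArray{Complex{Float32},2}}
  head::AcquisitionHeader
  traj::T
  data::D
end

# at write time, e.g.:
# write(io, Array{Float32,2}(profile.traj))
```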

Cheers!

nHackel commented 2 months ago

Getting access to BuildKite via KomaMRI.jl sounds like a great idea! I'll look into that once everything is set up here. I'll probably still set up the JLArray test case for the local tests.

I saw that LinearOperators.jl had a CI configuration where a maintainer could trigger breakage tests with upstream packages. Something like that would be a good fit for your idea @cncastillo. Then we can test it for the commits where it really matters.

I've also thought about extending my changes to the base, data and file handling packages, but for this PR I'll stick close to a refactoring of the Reco and operator packages with small changes in the other ones if necessary. For now I just want to get all the MRIReco tests and the MRIReco benchmark running on a GPU.

Once we have that running, we will have a good idea of which structures and information we need to run things on the GPU, and then we can look into further PRs that add GPU support or just more generic array compatibility in the simulation and "base" packages.

nHackel commented 2 months ago

Closes #168

nHackel commented 2 months ago

Here are some initial results showing the times of benchmark1 from MRIRecoBenchmark:

[image: benchmark1 timings]

and the results:

[image: reconstruction results]

Note that in this plot the GPU has data points for 2, 4, and 8 threads, but these are just the 1-thread result repeated, so I didn't have to adapt the plotting script too much. The GPU reconstruction runs sequentially, not in parallel.

I'm not yet sure where the performance regression for the Toeplitz reco comes from. I'll look into that later.