
GalSim-developers / GalSim

The modular galaxy image simulation toolkit. Documentation:

http://galsim-developers.github.io/GalSim/


Omp target offload #1212

Closed by jamesp-epcc 1 year ago

jamesp-epcc commented 1 year ago

This is my GPU acceleration work on the sensor model, rebased against the current main branch. Obviously this is a very extensive change, so I'm open to discussing how it's implemented and making changes to better fit the existing code. There is a single unified code base for both GPU and CPU implementations; when offloading is disabled (which can be done via an environment variable or by building with a non-GPU-aware compiler), the offloaded regions act like OpenMP parallel regions, so the loops still take advantage of multiple CPU cores as before. A version of Clang with offloading support is required to build this for GPU.
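A minimal sketch of the single-code-base idea described above, assuming a plain array loop rather than the real sensor code. The function name scaleArray is hypothetical and does not come from GalSim. With an offload-capable compiler and an available device, the loop runs on the GPU; otherwise the same directive falls back to the host. The standard OpenMP environment variable OMP_TARGET_OFFLOAD=DISABLED is one way to force the host path, though whether the PR relies on that particular variable is an assumption.

```cpp
#include <vector>

// Hypothetical helper for illustration only; not part of GalSim.
// Multiplies every element of data[0..n) by factor.
void scaleArray(double* data, int n, double factor)
{
    // With offloading enabled, the array is copied to the device, the loop
    // runs there, and the result is copied back. With offloading disabled
    // or unavailable, the region executes on the host. A compiler without
    // OpenMP support ignores the pragma and runs a plain serial loop.
#pragma omp target teams distribute parallel for map(tofrom: data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

Calling scaleArray(v.data(), static_cast<int>(v.size()), 2.0) on a std::vector<double> v doubles each element regardless of where the loop actually ran.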

jamesp-epcc commented 1 year ago

I see the CI is failing. Apparently gcc doesn't like me mapping the this pointer via OpenMP. Unfortunately that mapping is needed to build a working GPU version with Clang. I had thought it would be harmless to leave the offload directives in place even for CPU builds, but it might be necessary to wrap them in #ifdefs so that the compiler only sees them when actually building for GPU.

jamesp-epcc commented 1 year ago

I've wrapped the GPU directives in #ifdefs and the checks now pass. The original code already had similar checks in place for the OpenMP parallel directives so it's not too big a departure from how we previously did things. I still need to investigate whether actually building and running the GPU version in CI is feasible.
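The guarding approach described here could look roughly like the sketch below. The macro name SKETCH_GPU_OFFLOAD and the function addConstant are hypothetical, chosen for illustration; the macro used in the actual PR is not stated in this thread. _OPENMP is the standard macro that OpenMP-enabled compilers define.

```cpp
#include <vector>

// Hypothetical example; SKETCH_GPU_OFFLOAD is an assumed macro name that
// a GPU build would define on the compiler command line.
void addConstant(double* data, int n, double c)
{
#ifdef SKETCH_GPU_OFFLOAD
    // Only GPU builds see the target directive, so a compiler that
    // rejects some offload construct never has to parse it.
#pragma omp target teams distribute parallel for map(tofrom: data[0:n])
#else
#ifdef _OPENMP
    // CPU builds with OpenMP keep the ordinary multi-threaded loop.
#pragma omp parallel for
#endif
#endif
    for (int i = 0; i < n; ++i) {
        data[i] += c;
    }
}
```

The loop body is shared by all three configurations (GPU, multi-threaded CPU, serial), so only the directive in front of it changes between builds.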

jamesp-epcc commented 1 year ago

I have addressed Mike's feedback and rebased against the current main branch.

The function re-ordering, the whitespace changes and the loss of comments were an artefact of me adding GPU versions of the functions initially alongside the originals, then later renaming them and removing the originals, but hopefully all those issues are resolved now.
The values from _abs_length_table are now taken from the table passed in from the Python layer rather than being hardcoded (this was meant to be a temporary change but I forgot about it, apologies).
Const getter methods have been added to PhotonArray instead of const_casting it.
The separate CPU version of updatePixelDistortions has now been removed. By making a few changes I was able to use the GPU-enabled version for all purposes instead.
The pixel distortion update has been moved back out of the update method, as it was originally. addDelta is not being used in update, but I have added a comment explaining why.
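As an illustration of the const getter change mentioned in the list above, here is a minimal sketch of the general pattern. PhotonBuffer and sumX are hypothetical stand-ins and do not reproduce GalSim's actual PhotonArray interface; the point is only that a const overload lets read-only code work on a const object without a const_cast.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a photon container; not GalSim's PhotonArray.
class PhotonBuffer
{
public:
    explicit PhotonBuffer(std::size_t n) : _x(n, 0.0) {}

    std::size_t size() const { return _x.size(); }

    // Mutable access for code that writes positions.
    double* getXArray() { return _x.data(); }

    // Const overload: read-only access on a const object, no const_cast.
    const double* getXArray() const { return _x.data(); }

private:
    std::vector<double> _x;
};

// A read-only consumer takes a const reference, so overload resolution
// selects the const getter.
double sumX(const PhotonBuffer& photons)
{
    const double* x = photons.getXArray();
    double total = 0.0;
    for (std::size_t i = 0; i < photons.size(); ++i) {
        total += x[i];
    }
    return total;
}
```

With this overload in place, any attempt to write through the pointer inside sumX fails to compile, which a const_cast would have silently allowed.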

jamesp-epcc commented 1 year ago

The CI tests are failing because they are unable to install codecov. However, I think this is unrelated to my changes (see here: https://github.com/home-assistant/core/issues/91283).

rmjarvis commented 1 year ago

Just remove codecov from the CI script.

We actually haven't been using it for a while, since they now recommend a bash uploader. And they recently removed codecov from PyPI, so pip can't find it anymore. Probably to help discourage people from using the obsolete uploader.