openmc-dev / openmc

OpenMC Monte Carlo Code
https://docs.openmc.org

GPU port #1304

Open jlsalmon opened 5 years ago

jlsalmon commented 5 years ago

Dear OpenMC team,

As part of my master's thesis, I've been working on a CUDA port of OpenMC, and I thought you guys might be interested in taking a look at some early results. It's pretty rough around the edges, and has a bunch of functionality stripped out, but it basically works. It uses the OptiX API to define geometry as a triangle mesh from a .OBJ file, and uses ray tracing to find nearest surfaces and calculate cell containment. The main transport loop runs on the GPU.
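To illustrate the idea of using ray tracing against a triangle mesh for both nearest-surface distance and cell containment, here is a minimal CPU sketch in plain Python. It is not the OptiX code from the port: it uses the well-known Moller-Trumbore ray/triangle test and a crossing-parity rule for containment, and the tetrahedron mesh, function names and default ray direction are all made up for the example.

```python
# Minimal sketch (NOT the port's OptiX code): Moller-Trumbore ray/triangle
# intersection, used for nearest-surface distance and parity-based containment.
# All names and the test mesh below are hypothetical.

EPS = 1e-12

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, tri):
    """Return the ray parameter t > 0 at which the ray hits tri, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < EPS:            # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv           # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv   # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > EPS else None  # ignore hits behind the origin

def hits(origin, direction, triangles):
    """Sorted ray parameters of every triangle hit in front of the origin."""
    ts = (ray_triangle(origin, direction, tri) for tri in triangles)
    return sorted(t for t in ts if t is not None)

def nearest_surface(origin, direction, triangles):
    """Distance to the closest triangle (direction must be a unit vector)."""
    found = hits(origin, direction, triangles)
    return found[0] if found else None

def is_inside(point, triangles, direction=(0.3, 0.5, 0.8)):
    """Closed mesh: an odd number of crossings means the point is inside."""
    return len(hits(point, direction, triangles)) % 2 == 1

# Hypothetical test mesh: a closed tetrahedron with four triangular faces.
A, B, C, D = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
TETRA = [(A, B, C), (A, B, D), (A, C, D), (B, C, D)]

print(nearest_surface((0.1, 0.1, 0.1), (0, 0, 1), TETRA))
print(is_inside((0.1, 0.1, 0.1), TETRA), is_inside((2, 2, 2), TETRA))
```

This brute-force version tests every triangle for every ray. A ray tracing API would normally traverse an acceleration structure instead, which is where hardware support comes in.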

My main aim was to see whether the hardware-accelerated ray tracing in the new NVIDIA Turing GPUs could be used to accelerate scientific applications such as particle transport, and OpenMC looked like a good candidate to test that, especially since I could compare against the DAGMC code path, which uses CPU-based ray tracing.

I tested with a sphere of radius 4 filled with U235 surrounded by a 5x5x5 cube with a void material, scaling up the number of simulated particles. For the GPU and DAGMC versions, the sphere is an icosphere comprising 82k triangles:

image
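As a quick sanity check on the 82k figure: the post doesn't say how the icosphere was generated, but assuming the usual construction (start from a 20-face icosahedron and split every triangle into four at each subdivision level), the triangle count is 20 * 4^n. A small sketch, with a hypothetical function name:

```python
# Assumption: standard icosphere construction, i.e. an icosahedron (20 faces)
# where each subdivision level splits every triangle into 4 smaller ones.

def icosphere_triangle_count(subdivisions):
    """Number of triangles after `subdivisions` rounds of 1-to-4 splitting."""
    return 20 * 4 ** subdivisions

for level in range(5, 8):
    print(level, icosphere_triangle_count(level))
# 5 20480
# 6 81920
# 7 327680
```

Under that assumption, six levels of subdivision give 81,920 triangles, which matches the roughly 82k quoted above.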

The performance improvement over the CPU versions is pretty dramatic, especially on the Turing GPU. It turns out, though, that the hardware-accelerated ray tracing doesn't have much of an impact, since ray tracing is a pretty small portion of the overall runtime. I expect that the speedup will diminish as more features are added, but then again, I'm sure there are more optimisations that can be done to squeeze a bit more performance out.
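The observation about ray tracing being a small portion of the runtime is essentially Amdahl's law: if only a fraction f of the runtime is accelerated by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The sketch below works through that with made-up numbers. The 10% fraction and the 10x factor are purely illustrative and are not measurements from this work.

```python
# Amdahl's law sketch. The fraction and speedup values used below are
# hypothetical, chosen only to illustrate the effect; they are NOT measured.

def overall_speedup(fraction, local_speedup):
    """Overall speedup when `fraction` of the runtime gets `local_speedup`x faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

def speedup_limit(fraction):
    """Upper bound on the overall speedup as local_speedup goes to infinity."""
    return 1.0 / (1.0 - fraction)

ray_tracing_fraction = 0.10   # assumed share of runtime spent in ray tracing
hardware_factor = 10.0        # assumed speedup of the ray tracing part alone

print(round(overall_speedup(ray_tracing_fraction, hardware_factor), 3))
print(round(speedup_limit(ray_tracing_fraction), 3))
```

With those assumed numbers, making the ray tracing ten times faster speeds up the whole run by only about 10%, and even infinitely fast ray tracing could not do better than about 11%.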

In terms of implementation, there is a rather large amount of code duplicated from C++ into CUDA, which isn't ideal but is unfortunately necessary because of the way device code needs to be structured. There's no C++ STL on the device either, so there is a lot of copying of vectors into raw device buffers.
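To give a flavour of what copying vectors into raw buffers involves, here is a host-side sketch in Python of one common layout: a vector of vectors is packed into a single contiguous data buffer plus an offsets buffer. This is an assumption about the general technique, not the layout actually used in the port, and the function names and example data are hypothetical. In a CUDA code the two flat buffers would then typically be copied to device memory and indexed by raw pointer.

```python
from array import array

# Sketch of flattening nested, variable-length vectors into raw contiguous
# buffers. Layout and names are hypothetical, not taken from the port.

def flatten(nested):
    """Pack a list of float lists into (data, offsets) contiguous buffers."""
    data = array("d")             # contiguous doubles
    offsets = array("q", [0])     # row i lives in data[offsets[i]:offsets[i+1]]
    for row in nested:
        data.extend(row)
        offsets.append(len(data))
    return data, offsets

def get_row(data, offsets, i):
    """Recover row i from the flat buffers, as device code would by index."""
    return data[offsets[i]:offsets[i + 1]]

# Hypothetical example: per-nuclide energy grids of different lengths.
grids = [[1.0e-5, 1.0, 2.0e7], [0.5], [1.0e-3, 1.0e3]]
data, offsets = flatten(grids)

print(list(offsets))
print(list(get_row(data, offsets, 2)))
```

The offsets buffer has one more entry than there are rows, so the length of row i is simply offsets[i + 1] - offsets[i].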

This may well be a throwaway project, but if you guys are interested I can tidy things up and make my branch available so that someone in the future might be able to do a better job.

I'd like to ask you if there are any specific scenarios or experiments you think might be interesting to carry out while I'm working on this, or anything in particular you'd like to see?

Cheers, Justin

makeclean commented 5 years ago

I for one would love to see the code :) I'm always interested in some GPU stuff

pshriwise commented 5 years ago

Really cool stuff @jlsalmon! Thank you for sharing! I’ve been wondering about just this kind of application in our field for some time and would be interested in continuations of this work.

> I'd like to ask you, if there are any specific scenarios or experiments you think might be interesting to carry out while I'm working on this, or anything in particular you'd like to see?

If it would be helpful to you, I’d be happy to provide you with single or multi-volume engineering components to test with higher complexity in terms of geometry and number of triangles. Happy to provide those to you in a .obj format as well (it looks like that was a supported input format anyway)

Again, really interesting work and thanks for posting.

jlsalmon commented 5 years ago

@pshriwise that would be very helpful indeed, thank you! Would you mind shooting them over to justin.salmon.2018@bristol.ac.uk ? At the moment only a single volume is supported, but if I have time I'd like to implement multi-volume as well. Ideally I'd like to get something like the PWR full core model in both OpenMC and OBJ format for a proper comparison (but I realise that might be wishful thinking).

I was also looking for the h5m files that were used for the DAGMC published work (I asked a question over at the DAGMC repo, https://github.com/svalinn/DAGMC/issues/628, a while ago). I'm guessing you're probably a good person to ask about that?

@makeclean great, I'll clean up the branch and link to it in the near future :)

nelsonag commented 5 years ago

I'd definitely be interested in seeing what you put together and to get an idea of what is and what is not included!

Thanks @jlsalmon!


paulromano commented 5 years ago

@jlsalmon This sounds great! As you can probably gather, many of us are very interested in the GPU space and would love to pore over your code. One of our major projects here at Argonne is developing a capability for coupled Monte Carlo-CFD simulation that will be used on future supercomputers to be delivered at Argonne and Oak Ridge National Laboratory (both of which will have most of their floating point performance on GPUs, not CPUs). As part of this, we will be porting OpenMC to run on Intel GPUs but our hope is that the programming model will be general enough to work on other architectures.

To be clear, in your implementation, is only ray tracing performed on the GPU or are other parts of the code offloaded as well?

mjfountain commented 2 years ago

Have there been any advances in this work? I'd love to use my GPU for some calculations. @jlsalmon, your abstract here looks really interesting.

J. Salmon and S. McIntosh-Smith, "Exploiting Hardware-Accelerated Ray Tracing for Monte Carlo Particle Transport with OpenMC," 2019 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), 2019, pp. 19-29, doi: 10.1109/PMBS49563.2019.00008.

gridley commented 2 years ago

Hey @mjfountain, yes, there certainly have been, although they have been in totally separate code branches from what Justin originally developed here. As far as we know, the approach originally employed here is good for ray-tracing-intensive problems, but would likely not scale well on problems with a nontrivial number of nuclides.

There is my code branch which is a CUDA-based port you can find in a closed pull request. My code is not protected, and I can share a more recent version with you if you like. Here is the summary of it, although a lot has changed since I wrote this:

Ridley, Gavin, and Benoit Forget. Design and Optimization of GPU Capabilities in OpenMC. ANS Winter Conference. Washington D.C., 2021.

There has also been work that uses an unreleased version of OpenMP. It is not available to the public, and has attained higher performance than my code.

John R. Tramm, Paul K. Romano, Johannes Doerfert, Amanda L. Lund, Patrick C. Shriwise, Andrew Siegel, Gavin Ridley, Andrew Pastrello. Toward Portable GPU Acceleration of the OpenMC Monte Carlo Particle Transport Code. In PHYSOR2022. Pittsburgh, PA, 2022.

What type of problem will you be looking at?

mjfountain commented 2 years ago

@gridley thank you. I used to work in nuclear fuel performance and I am preparing for a job interview in reactor physics. I don't have any access to CASMO (or other codes I used) anymore and I wanted to gain some skills. I have an AMD Ryzen 3600 with an RTX 2060 and I wanted to run some of the more complicated simulations on my RTX if it outperforms my Ryzen. I am working through the tutorials and developing the UK EPR from the Generic Design Assessment documentation.

Shihab-Shahriar commented 1 year ago

Hi, I was wondering if there's any update on the GPU port? Thanks.

gridley commented 1 year ago

Hey @Shihab-Shahriar, no. Of course there is plenty of work I have done on this, but it's simply not suitable for widespread distribution at the moment. I recommend looking at github.com/exasmr/openmc.