AMReX-Combustion / PeleC

An AMR code for compressible reacting flow simulations
https://amrex-combustion.github.io/PeleC

How to build and run PeleC using GPU? #769

Open EarlFan opened 8 months ago

EarlFan commented 8 months ago

Dear all,

Hi!

I want to build and run PeleC on GPUs, but I am not able to find any tutorial covering GPU installation or the CUDA environment. Can anyone point me to one? Any help would be appreciated!

Thanks!

Regards, Fan E

baperry2 commented 8 months ago

It's hard to provide detailed instructions for GPU use as the details can vary from system to system. But if you want to run on a system with Nvidia GPUs using cuda and your system is set up properly, all you should need to do is compile as normal, but with USE_CUDA = TRUE (and USE_MPI = TRUE assuming you also want MPI support) in your GNUmakefile. I'd recommend trying this for the PMF case using the pmf-lidryer-cvode.inp input file.
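
For concreteness, here is a minimal sketch of that workflow (it assumes the PMF case directory Exec/RegTests/PMF, that CUDA and MPI are already available in your environment, and that the executable name follows the usual AMReX pattern; yours may differ):

cd Exec/RegTests/PMF
make TPL USE_MPI=TRUE USE_CUDA=TRUE    # build the third-party libraries (SUNDIALS) first
make -j USE_MPI=TRUE USE_CUDA=TRUE     # build PeleC itself
mpirun -np 2 ./PeleC3d.gnu.MPI.CUDA.ex pmf-lidryer-cvode.inp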

When running on GPUs, certain simulation input parameters may benefit from being re-optimized for performance. In particular, you may want larger values for amr.blocking_factor and amr.max_grid_size, and you may want to look at different options for cvode.solve_type. Every problem is different so it's usually good to do a little experimentation.
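
As an illustration only (these values are starting points to experiment with, not tuned recommendations):

amr.blocking_factor = 32     # larger boxes tend to keep the GPU busier
amr.max_grid_size = 128
cvode.solve_type = GMRES     # several solver types exist; compare them for your chemistry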

jrood-nrel commented 8 months ago

It's useful to know that some sites require site-specific build settings. AMReX ships configurations for several supported sites here: https://github.com/AMReX-Codes/amrex/tree/development/Tools/GNUMake/sites . The machine detection logic is here: https://github.com/AMReX-Codes/amrex/blob/development/Tools/GNUMake/Make.machines .
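
If you want to see what the build system detected for your machine, AMReX's GNU Make setup has (as far as I recall) print-<VAR> targets for inspecting any make variable from a case directory; which_site and which_computer are, I believe, the variables set by Make.machines, but treat the exact names as an assumption:

make print-which_site print-which_computer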

EarlFan commented 8 months ago

Dear all,

Thank you for your assistance!

I tried to compile PeleC with nvcc on WSL but have encountered some challenges, particularly with the SUNDIALS package. Currently I am able to run PeleC on CPUs without issues, but I am eager to explore GPU acceleration.

If it is OK, I would like to keep this issue open to share my future experiences regarding the use of PeleC with GPU computing.

Regards, Fan E

baperry2 commented 8 months ago

Yeah that's fine to leave this issue open and add more detail on any issues you have running on GPUs, which we can then try to address.

SRkumar97 commented 5 months ago

Hello! I have a question. When I first tested the code in CPU parallel mode by running the basic PMF test case, I had not set MPI=TRUE in the example.inp file, yet the mpirun -np command still worked to run the PeleC executable. Did I miss anything?

jrood-nrel commented 5 months ago

mpirun will launch multiple instances of any application; for example, try mpirun -np 8 echo "hello".

Without MPI enabled in PeleC, mpirun will still start np instances of the executable, but they won't communicate to solve a single problem: each instance solves the same problem independently, with no benefit from concurrency.
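
As an illustration (the executable names below are hypothetical and depend on your build options):

mpirun -np 4 ./PeleC3d.gnu.ex example.inp      # serial build: 4 independent copies of the same problem
mpirun -np 4 ./PeleC3d.gnu.MPI.ex example.inp  # MPI build: one problem decomposed across 4 ranks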

baperry2 commented 5 months ago

Note that when you compile for MPI, you should have USE_MPI = TRUE in your GNUmakefile, and MPI should appear in the name of the PeleC executable that gets generated. No changes are needed in the input files to run with MPI. But if the executable doesn't have MPI in the name, you generated a serial executable and it will run independent instances as mentioned by @jrood-nrel.
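
For reference, a quick way to check which build you produced (the names follow the usual AMReX scheme, so the exact suffixes may differ on your system):

ls *.ex
# PeleC3d.gnu.ex           -> serial build
# PeleC3d.gnu.MPI.ex       -> MPI-enabled build
# PeleC3d.gnu.MPI.CUDA.ex  -> MPI + CUDA build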

SRkumar97 commented 5 months ago

Thanks for your clarifications on this! @jrood-nrel @baperry2 .

RSuryaNarayan commented 3 months ago

I am trying to get PeleC to work with GPUs on Kestrel, and any instructions on the relevant modules to load would be greatly appreciated. So far I've tried PrgEnv-nvhpc and PrgEnv-nvidia along with openmpi, but I keep getting the following error after I compile TPL:

/scratch/ramac106/PeleC/Submodules/PelePhysics/Submodules/amrex/Src/Base/AMReX_ccse-mpi.H:14:10: fatal error: mpi.h: No such file or directory
 #include <mpi.h>
          ^~~~~~~
compilation terminated.

baperry2 commented 3 months ago

For Kestrel GPUs, you can use the modules specified here, which should also work for PeleC: https://erf.readthedocs.io/en/latest/GettingStarted.html#kestrel-nrel

Let us know if there are any issues; it's been a while since I tested PeleC on Kestrel GPUs, and they've been periodically reshuffling the modules as they bring the GPUs online.

RSuryaNarayan commented 3 months ago

thank you @baperry2. this is really helpful. Will let you know how it goes

RSuryaNarayan commented 3 months ago

I followed the steps outlined on ERF's website (with the latest branches of PeleC and the submodules). I seem to run into the following error:

In file included from /scratch/ramac106/PeleC_latest/PeleC/Submodules/PelePhysics/Submodules/amrex/Src/Extern/SUNDIALS/AMReX_SUNMemory.cpp:1:
/scratch/ramac106/PeleC_latest/PeleC/Submodules/PelePhysics/Submodules/amrex/Src/Extern/SUNDIALS/AMReX_Sundials_Core.H:7:10: fatal error: sundials/sundials_config.h: No such file or directory
    7 | #include <sundials/sundials_config.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

I did try re-making TPL after loading the modules suggested, but still get this error...

baperry2 commented 3 months ago

Make sure you've done git submodule update --recursive before make TPLrealclean && make TPL, and double check that the sundials commit you are using is 2abd63bd6.
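
If it helps, one way to verify the submodule state is sketched below; git submodule status lists the checked-out commit of each submodule, and the grep just narrows the output to the sundials entry:

git submodule update --recursive
git submodule status --recursive | grep -i sundials   # the reported hash should start with 2abd63bd6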

However, there does seem to be another issue here: when I try it myself, the build gets past the step you are seeing but then fails to produce the executable at the link stage.

RSuryaNarayan commented 3 months ago

The following procedure looks like it works but fails to produce an executable at the end (i.e., it gets all the way to AMReX_BuildInfo, but the linking seems to be an issue for some reason):

  1. load all the modules here: https://erf.readthedocs.io/en/latest/GettingStarted.html#kestrel-nrel:~:text=For%20compiling%20and%20running%20on%20GPUs%2C%20the%20following%20commands%20can%20be%20used%20to%20set%20up%20your%20environment%3A
  2. make TPLrealclean; make TPL USE_CUDA=TRUE
  3. make realclean; make -j COMP=gnu USE_CUDA=TRUE

I'm using MPI+CUDA, by the way, i.e. USE_MPI=TRUE and USE_CUDA=TRUE. COMP=nvhpc results in SUNDIALS issues again...

baperry2 commented 3 months ago

As I mentioned, the setup of the GPU partition of Kestrel has been frustratingly unstable. It appears they have again changed things in a way that makes the prior instructions no longer functional.

You should be able to use the following module setup:

module purge;
module load PrgEnv-gnu/8.5.0;
module load cuda/12.3;
module load craype-x86-milan;

And then compile with:

make TPLrealclean; make TPL COMP=gnu USE_CUDA=TRUE USE_MPI=TRUE
make realclean; make -j COMP=gnu USE_CUDA=TRUE USE_MPI=TRUE

jrood-nrel commented 3 months ago

I happened to be looking at this as well and I used this:

git clone --recursive git@github.com:AMReX-Combustion/PeleC.git && cd PeleC/Exec/RegTests/PMF &&
module purge && module load PrgEnv-gnu/8.5.0 && module load craype-x86-trento &&
module load cray-libsci && module load cmake && module load cuda && module load cray-mpich/8.1.28 &&
make realclean &&
nice make USE_MPI=TRUE USE_CUDA=TRUE COMP=gnu -j24 TPLrealclean &&
nice make USE_MPI=TRUE USE_CUDA=TRUE COMP=gnu -j24 TPL &&
nice make USE_MPI=TRUE USE_CUDA=TRUE COMP=gnu -j24

RSuryaNarayan commented 3 months ago

Thanks a lot @jrood-nrel @baperry2, I am able to get a linked executable for the PMF case with CUDA. My specific case still faces the issue, though. I guess it's something to do with the way the PMF functions and data structures are defined, and I will align everything with the way it's done in the present case folder.