budjensen opened this issue 6 months ago
I just rebuilt and modified the CMake configuration file (through the command ccmake build) by setting WarpX_PYTHON to ON. The test ./run_test.sh LaserAcceleration_1d_fluid is now passing. The output is attached here: warpx_test_rebuild.txt.
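For reference, the non-interactive equivalent of what I toggled in ccmake is roughly the following (assuming the WarpX source checkout is the current directory and the build directory is named build):

```bash
# Roughly what I changed interactively in ccmake: enable the Python bindings
cmake -S . -B build -DWarpX_PYTHON=ON
# Rebuild; with WarpX_PYTHON=ON the Python module is installed via the pip_install target
cmake --build build -j 8 --target pip_install
```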
I am still getting the same error that showed up in my original post on the discussion board: WarpX reports "MPI initialized with 1 MPI processes" and never scales past 1 process, even as more tasks are requested in my Slurm job script.
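A quick way to check whether the launcher and MPI agree on the rank count, independent of WarpX (this assumes mpi4py is importable in the active environment; the task count of 4 is just an example):

```bash
# If the launch works, every task prints the same total size (here 4);
# if MPI falls back to singleton initialization, each task prints 1.
srun -n 4 python -c "from mpi4py import MPI; print(MPI.COMM_WORLD.Get_size())"
```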
In talking with the Stellar admins, I learned that (as long as I only run on one node) I can change the last line in my batch script to use mpirun instead of srun and MPI will be initialized properly. To get it to scale beyond one node, I will need to use an MPI installed on the cluster. Here is a note from an admin:
WarpX instructions tell you to install mpich and mpi4py - which will not work with our slurm setup. You can try setting things up without installing mpich and mpi4py and then for mpi4py please follow our instructions:
https://researchcomputing.princeton.edu/support/knowledge-base/mpi4py
In short, is there a way to set up WarpX with OpenMPI instead of MPICH?
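In case it helps to be concrete, the kind of build I have in mind is sketched below; the module name is the one available on Stellar, while the remaining flags and paths are assumptions on my part:

```bash
# Sketch only: build WarpX against the cluster OpenMPI instead of the conda MPICH.
module load openmpi/gcc/4.1.2
cmake -S . -B build -DWarpX_MPI=ON -DWarpX_DIMS=1 -DWarpX_PYTHON=ON
cmake --build build -j 8 --target pip_install
```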
In more detail, here is what I tried. I set up a conda environment using the command:
conda create -n warpx-openmpi -c conda-forge blaspp boost ccache cmake compilers git lapackpp "openpmd-api=*=mpi_openmpi*" python make numpy pandas scipy yt "fftw=*=mpi_openmpi*" pkg-config matplotlib mamba ninja pip virtualenv periodictable picmistandard
Here I simply changed mpich* to openmpi* in the command above, since I am hoping to use an OpenMPI module (openmpi/gcc/4.1.2) available on the cluster.
After installing the environment, I uninstalled the conda-provided mpi4py (using conda remove --force mpi4py), reinstalled it via pip (as per the instructions in the link above; a sketch of that step is included after the script below), and then ran my batch script:
#!/bin/bash
#SBATCH -J Warp_Turn_mpi
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=4
#SBATCH --time=00:10:00
#SBATCH --output warpx.%j.out
#SBATCH --mail-type=all
#SBATCH --mail-user=bjensen@pppl.gov
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# load conda environment
module purge
module load anaconda3/2024.2
module load openmpi/gcc/4.1.2
conda activate warpx-openmpi
srun python PICMI_inputs_1d.py
and got this error. Are there instructions for running with OpenMPI?
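For completeness, the pip mpi4py step mentioned above looked roughly like the following; the exact commands on the Princeton knowledge-base page may differ, and the --no-binary flag is only there to force a from-source build against whatever mpicc the loaded module provides:

```bash
# Rebuild mpi4py from source against the cluster MPI (sketch; see the
# Princeton knowledge-base link above for the authoritative instructions)
conda remove --force mpi4py           # drop the conda build linked against conda's MPI
module load openmpi/gcc/4.1.2         # put the cluster's mpicc on PATH
python -m pip install --no-cache-dir --no-binary=mpi4py mpi4py
```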
Hey @ax3l -- an update after I thought I had gotten WarpX up and running. I created this profile for running on Stellar and installed dependencies via this bash script. I built a 1D Python version of the code.
The code compiles and runs, but does so nonsensically (i.e. it won't pass tests, and notably, when I try to initialize a uniform distribution of particles the density at the first step is off by a factor of 8).
Is there anything I can send/do to help figure this out?
@ax3l -- Here's an example of my problem. I run a simulation on the Stellar cluster with a script (PICMI_inputs.py), which sets the initial density to a uniform 2e16 m^-3. When I run the simulation, the density is initialized to:
If the simulation is run for a few hundred steps, the plasma potential begins to rise to absurd values:
The applied potential is only 50 V RF, so a 3000 V plasma potential doesn't make any sense... Have you seen anything like this before with WarpX? Do you have any ideas for where I should start looking for solutions?
For context, when I run this on my personal computer (or even on Stellar with WarpX installed via conda), the initial density is 2e16 and the potential evolves as expected. I'd like to get a compiled version up and running on Stellar to make use of MPI.
Thank you for any help!
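One check I can run on my end (in case the batch job is silently picking up the conda-installed WarpX rather than the version I compiled) is to confirm which pywarpx and which MPI the environment actually resolves; this is only a sanity check, not a fix:

```bash
# Confirm which pywarpx build and which MPI the batch environment actually uses
python -c "import pywarpx; print(pywarpx.__file__)"
python -c "import mpi4py; print(mpi4py.get_config())"
```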
I am looking to install and run WarpX on Princeton/PPPL's Stellar cluster (information HERE). I wrote about this in an earlier question on the discussions page (see below).
After looking into this more, I rebuilt WarpX with the following commands:
and ran a test:
which failed with the output (see warpx_test.txt for the full test):
Looking at the WarpX documentation, I wonder if the MPI error may be rooted in the HPC environment. Would you be able to help me build an HPC profile for Stellar?
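A rough sketch of what I imagine such a profile could contain is below, based only on the modules I already load in my batch script (anaconda3/2024.2 and openmpi/gcc/4.1.2); the file name and the compiler exports are assumptions:

```bash
# stellar_warpx.profile (assumed name): environment to source before building
# WarpX and inside batch scripts, so both use the same toolchain and MPI
module purge
module load anaconda3/2024.2
module load openmpi/gcc/4.1.2
conda activate warpx-openmpi

# assumption: use the GCC toolchain that matches the OpenMPI module
export CC=$(which gcc)
export CXX=$(which g++)
```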
Discussed in https://github.com/ECP-WarpX/WarpX/discussions/4751