ECP-WarpX / WarpX

WarpX is an advanced, time-based electromagnetic & electrostatic Particle-In-Cell code.
https://ecp-warpx.github.io

Creating a `WarpX` Profile for the Karolina HPC System #4006

Open berceanu opened 1 year ago

berceanu commented 1 year ago

Hi, we're trying to get WarpX up and running on the GPU partition of the Karolina HPC system, situated in the Czech Republic. Karolina uses the PBS Pro workload manager and the Lmod environment module system.

We could really use your help in creating a WarpX build and runtime setup that works smoothly on Karolina. It would be awesome if we could create a new profile for Karolina to add to the HPC systems documentation 🚀. This would include the right Lmod commands, build instructions for Karolina's 576 A100 (40 GB) GPUs, and a working PBS job script.

We think this could help not only our team, but also anyone in the community who has access to Karolina. We're ready to work with you on this, test things out on Karolina, and prepare the documentation PR. We understand that this is a request, so thank you for giving it some thought.

ax3l commented 1 year ago

Hi @berceanu, that is a great idea and I'll give you a hand with this.

I just started documenting a new cluster in #3938 / #4010; could you post answers here to the same questions I asked in #3938? :)

berceanu commented 1 year ago

Hi @ax3l, that sounds great, thanks a lot! I've gone through issue #3938 and noted the details you need. Here's the equivalent information for the Karolina HPC system.

Karolina is a Linux-based mixed system, consisting of a CPU as well as a GPU partition. For hardware details of the compute nodes, see compute-nodes. Public documentation is available on this page, with links to the hardware overview, environment and modules, and job submission. For a list of the modules available on the system, see this gist.
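For reference, the usual way to explore an Lmod module tree like Karolina's looks roughly like this (a sketch only; the module names queried below are assumptions and should be checked against the actual `ml av` output on a login node):

```shell
# List all modules currently available on the login node.
ml av

# Search across the whole module tree, including modules hidden behind
# toolchain hierarchies (names here are placeholders, not confirmed).
ml spider CUDA
ml spider OpenMPI

# Show what a specific module sets up before loading it.
ml show CMake
```

`ml` is Lmod's shorthand for `module`; `ml spider` is the reliable way to find versions that only appear after loading a compiler or MPI toolchain.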

We plan to use WarpX on the 72 GPU nodes, each containing 8 A100 (40 GB) GPUs. We also plan to use PIConGPU, as well as several CPU codes such as EPOCH and Smilei, test their scalability, run benchmarks, and compare. You can find some example job scripts here.

Finally, regarding WarpX features, I think we are going for a "full-flavored install" on the GPU partition, including the pseudo-spectral PSATD field solver and all geometries (1D/2D/3D/RZ) 😃
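A "full-flavored" GPU build along those lines might be configured roughly as below. This is a sketch under assumptions: the `WarpX_COMPUTE`, `WarpX_PSATD`, and `WarpX_DIMS` option names should be confirmed against the WarpX build documentation for the version being installed, and the required compiler/CUDA/MPI/FFT modules must first be loaded from Karolina's module tree:

```shell
# Fetch the sources.
git clone https://github.com/ECP-WarpX/WarpX.git
cd WarpX

# Configure a CUDA build with the PSATD solver and all geometries.
# Option names are assumptions; verify against the WarpX docs.
cmake -S . -B build \
  -DWarpX_COMPUTE=CUDA \
  -DWarpX_PSATD=ON \
  -DWarpX_DIMS="1;2;3;RZ"

# Build on a login or compute node (adjust -j to available cores).
cmake --build build -j 16
```

The PSATD solver additionally needs an FFT library (e.g. a cuFFT-capable setup on the GPU partition), which is another detail worth pinning down for the profile.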

berceanu commented 1 year ago

I am adding some additional info about the GPU queue on Karolina. Using the qgpu queue, one can allocate 1/8 of a node: 1 GPU and 16 cores. For more details, see allocation of vnodes on qgpu. For a list of all the queues, see queues.
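A minimal PBS Pro job script for that 1/8-node allocation could look like the sketch below. The project ID, the exact `select` syntax, the module names, and the executable name are all placeholders; the resource line in particular should be matched against Karolina's "allocation of vnodes on qgpu" documentation:

```shell
#!/bin/bash
# Sketch of a single-GPU job on Karolina's qgpu queue (all names are
# placeholders, not confirmed against the cluster's docs).
#PBS -q qgpu
#PBS -A PROJECT-ID
#PBS -l select=1:ngpus=1:ncpus=16
#PBS -l walltime=01:00:00
#PBS -N warpx_test

cd "$PBS_O_WORKDIR"

# Load the same modules used at build time (placeholder names).
ml CUDA OpenMPI

# One MPI rank per GPU.
mpirun -n 1 ./warpx.3d inputs_3d
```

Submission would then be a plain `qsub job.pbs`; scaling to full nodes should only require changing the `select` line to `select=1:ngpus=8:ncpus=128` (again, to be verified against the queue docs).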