Hi Rui, you can compile the main program ascot5_main separately with MPI=1 to benefit from MPI. You can then store that binary somewhere safe and run make clean && make libascot MPI=0 to get the full user experience.
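In practice the workflow could look roughly like this (a sketch; the build output path and the copy destination are assumptions, adjust to your setup):

```bash
# Build the MPI-enabled main program and stash the binary somewhere safe
make clean
make ascot5_main MPI=1
cp build/ascot5_main ~/bin/ascot5_main_mpi   # output path and destination are examples

# Rebuild the library without MPI for the full interactive experience
make clean
make libascot MPI=0
```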
The issue right now is that libascot does not work with MPI=1. To be more specific, libascot should not care about MPI at all, but there is one routine that is MPI-parallelized and should be rewritten without MPI. We'll have to do that before ASCOT5 is integrated into IMAS, so I'll steal this issue for that purpose.
As for the CPU time, you can obtain it from getstate("cputime", state="end"). It gives you how long (in real time) each marker was simulated, and that value is what the endcond corresponds to.
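For example, with the a5py interface (a sketch; the results file name is an example and "active" refers to the usual active-run handle):

```python
from a5py import Ascot

a5 = Ascot("ascot.h5")  # file name is an example

# Wall-clock time spent simulating each marker (end state)
cputime = a5.data.active.getstate("cputime", state="end")

# Physical time each marker was simulated, for comparison
mileage = a5.data.active.getstate("mileage", state="end")
```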
Thanks for the suggestions! I'll try them now. Question (maybe I missed it): does the documentation list the available options for getstate? At first sight I only found the usual r, z, phi, ekin, mileage...
Yeah, I've been struggling to figure out where to document them so that the documentation gets updated automatically whenever a new quantity is implemented. Right now the solution is to call getstate_list(), which lists all available quantities: https://ascot4fusion.github.io/ascot5/postproc.html#ini-and-endstate
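That call looks like this in practice (a minimal sketch, assuming an open session as above):

```python
from a5py import Ascot

a5 = Ascot("ascot.h5")  # file name is an example

# Names (and descriptions) of every quantity getstate() accepts
quantities = a5.data.active.getstate_list()
```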
And how does one use MPI with a given number of nodes? Can I simply specify the number of nodes or tasks in SLURM and run ascot5_main with mpirun or srun?
Yup! Here's an example batch script: https://ascot4fusion.github.io/ascot5/simulations.html#batch-jobs
Since the memory is not shared between MPI processes, the recommendation is to have a single MPI process per node.
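A minimal sketch of what that looks like (the resource numbers, walltime, and input-file flag are assumptions; the linked page has the authoritative version):

```bash
#!/bin/bash
#SBATCH --job-name=ascot5
#SBATCH --nodes=4              # one MPI process per node, as recommended
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=48     # fill the node with OpenMP threads instead
#SBATCH --time=04:00:00

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Launch the MPI=1 binary; srun starts one process per node
srun ./ascot5_main --in=ascot   # input-file flag shown as an example
```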
After c9defc6bfcdca328ab7710e22d15b8a90c535e95, the develop branch now has the fix that enables using MPI=1 for libascot without spawning ghost processes on the Gateway.
As I recall, to have the complete experience using the ASCOT5 environment we are currently forced to compile with MPI=0. This translates to reasonable running times (~3.5 h using a low-resolution 60-point grid in r, z, vpar, vperp and 200k markers) on a single node (48 cores) at the Gateway, as long as the beam energies are low, i.e. 85 keV. However, the same run using 500 keV beams is telling me it will require something like ~77 h (!!!)
My end conditions are "safe" so that as many markers as possible get ionised (I used the same ones for both the 85 and 500 keV beams), i.e.
Related question: is the CPU time used to simulate each marker saved somewhere in the state? For the 85 keV beams the mileage is below 200 ms (peak at ~50 ms), but I don't know how much CPU time it took unless I estimate it from the stdout, and that's very imprecise (not enough digits in the fractional hours reported, and not very meaningful anyway since it doesn't discriminate by the end condition reached)...
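For reference, once per-marker cputime is read as in the getstate("cputime", ...) answer above, a per-end-condition breakdown could look roughly like this sketch; the endcond keyword and the "WALL" name are assumptions, check getstate_list() and the documentation for the valid names:

```python
from a5py import Ascot

a5 = Ascot("ascot.h5")  # file name is an example

# CPU time for only those markers that ended on a given end condition
cputime_wall = a5.data.active.getstate(
    "cputime", state="end", endcond="WALL"  # endcond name assumed
)
print(f"markers ended on WALL: {cputime_wall.size}, "
      f"total CPU time: {cputime_wall.sum()}")
```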