SPECFEM / specfem3d

SPECFEM3D_Cartesian simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra (structured & unstructured).
https://specfem.org
GNU General Public License v3.0

Out-of-memory issue while reading a large number of source files #1752

Open padesh opened 1 month ago

padesh commented 1 month ago

Hi Team,

I am running noise cross-correlation simulations for a coupled acoustic-elastic medium. When I have a large number of external source files (60k files, each with 60k time steps), the simulation breaks with an out-of-memory error at the step where the solver reads the source files. My question is: are these source files read on a single node, or are they read in parallel? If they are read by a single node, increasing the number of nodes would not help in this case.

Since almost half of the values in the later part of each STF file are zeros, could it be arranged that if NSTEP > the number of lines in the STF file, the program injects zeros for those extra steps? That would help trim down the total number of lines in each STF file and hence its size.

Or do you have any other suggestions?

The STF files are in binary format.

Thanks.

danielpeter commented 4 weeks ago

For external source time functions, all MPI processes allocate the same array containing all sources and time steps.

In your case, the size of this array becomes ~13 GB per process:

60000 * 60000 * 4 bytes / 1024^3 = 13.411 GB
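
For instance, a quick sanity check of that figure (a Python sketch; the variable names are illustrative, not SPECFEM identifiers):

```python
# estimate the per-process size of the external source time function array:
# one single-precision value per source per time step
n_sources = 60_000        # number of external source files
n_steps = 60_000          # time steps per source time function
bytes_per_value = 4       # single-precision float

size_gb = n_sources * n_steps * bytes_per_value / 1024**3
print(f"~{size_gb:.3f} GB per MPI process")   # ~13.411 GB
```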

The number of time steps in the external STF file must be at least the number of time steps of the simulation (NSTEP). There is no fallback to zeros if the file is shorter; instead, the simulation would break.

The only help provided is that the solver can read binary files: you could store those files in binary format (*.bin files) to speed up the reading.
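
As a rough sketch of such a conversion, assuming the solver reads a raw stream of 4-byte floats (the exact record layout expected for the .bin files should be verified against the manual; the file names are hypothetical):

```python
import numpy as np

def ascii_stf_to_binary(txt_path, bin_path):
    """Convert an ASCII STF (one amplitude per line) to raw 4-byte floats."""
    stf = np.loadtxt(txt_path, dtype=np.float32)
    stf.tofile(bin_path)  # raw float32 stream, native byte order, no record markers

# hypothetical file names
ascii_stf_to_binary("source_000001.txt", "source_000001.bin")
```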

padesh commented 4 weeks ago

Thanks, Peter. I am already using the .bin format for the source files.

This means there is a hard limit on the number of time steps and number of sources one can use, irrespective of how many nodes are available.

danielpeter commented 4 weeks ago

If you currently run many MPI processes on a single node, what about spreading the processes out onto more compute nodes and using fewer MPI processes per node? 13 GB doesn't sound like an awful lot for a single compute node's memory.

padesh commented 4 weeks ago

You mean I should allocate more nodes (each of my nodes has 36 cores), set --ntasks-per-node to fewer than 36 (say 30), and then call srun xspecfem3D for the solver?

danielpeter commented 4 weeks ago

Yes. Find out how much memory you have per node and then estimate how many MPI processes you can run on a single node, taking into account that each process will require additional memory for other arrays (mesh, seismograms, etc.).
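
For example, a hypothetical SLURM batch sketch (the node and task counts here are assumptions chosen only to illustrate the sizing rule; the total task count must match the NPROC the mesh was partitioned for):

```bash
#!/bin/bash
#SBATCH --nodes=16            # more nodes than strictly needed for the core count ...
#SBATCH --ntasks-per-node=8   # ... but fewer tasks than the 36 cores per node, so that
                              # 8 x ~13 GB STF arrays plus other arrays fit in node RAM
#SBATCH --ntasks=128          # must equal NPROC used when partitioning the mesh

srun ./bin/xspecfem3D
```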

padesh commented 4 weeks ago

Thanks, let me try this and come back with an update.