cholla-hydro / cholla

A GPU-based hydro code
https://github.com/cholla-hydro/cholla/wiki
MIT License

Outputting information to aid particle concatenation #405

Open mabruzzo opened 3 months ago

mabruzzo commented 3 months ago

To efficiently concatenate particle files, we need to know the total number of particles.

Currently, that means we load every file once beforehand in order to allocate space in the resulting file, which can be very slow. It would be faster if we recorded the total number of particles when writing outputs.

In fact, it would be even better if we recorded the number of particles in each file in one centralized location (maybe just in file 0). That way, we could better parallelize concatenation, even when the number of processes used for concatenation is totally unrelated to the number of processes used in the original simulation.
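To illustrate why a centralized list of per-file counts helps, here is a minimal Python sketch. It assumes the counts have already been read from one place (for example, file 0) into a plain list; the function names `file_offsets` and `assign_files` are hypothetical and not part of Cholla. With the counts in hand, each file's destination offset is a prefix sum, and files can be divided among any number of concatenation workers without opening the other files first.

```python
from itertools import accumulate


def file_offsets(counts):
    """Return the starting index of each file's particles in the
    concatenated array (an exclusive prefix sum of the counts)."""
    return list(accumulate(counts, initial=0))[:-1]


def assign_files(counts, n_workers):
    """Split file indices into contiguous groups, one per worker,
    so that each worker handles a similar number of particles.

    A file goes to the worker whose share of the total contains
    the file's first particle.
    """
    total = sum(counts)
    groups = [[] for _ in range(n_workers)]
    running = 0  # number of particles in all earlier files
    for i, n in enumerate(counts):
        if total == 0:
            worker = 0
        else:
            worker = min(n_workers - 1, (running * n_workers) // total)
        groups[worker].append(i)
        running += n
    return groups


# Hypothetical per-file particle counts, as they might be stored in file 0.
counts = [3, 0, 5, 2]
offsets = file_offsets(counts)
groups = assign_files(counts, n_workers=2)
```

Each worker can then write the particles of file `i` into rows `offsets[i]` through `offsets[i] + counts[i]` of the preallocated output, independently of the other workers.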

bvillasen commented 3 months ago

As far as I remember, the header of each file stores the number of local particles. This means that you only need to read that value from the header of each file, not the entire data, to compute the total number of particles. This shouldn't be a significant overhead.

mabruzzo commented 3 months ago

Your recollection is correct about what is stored. That is exactly what we currently do.

With that said, standard POSIX file operations can be extremely slow on parallel file systems. The overhead of simply opening and closing a file can be shockingly large (especially when other people are using the file system). This is most problematic when you have many thousands of files.

In my experience, the parallel file systems on the Oak Ridge systems usually aren't bad. But I have had awful experiences with the Lustre filesystem on Frontera.

bvillasen commented 3 months ago

I see. I haven't had the pleasure of using Frontera, but I always hear lovely things about it. In that case, I agree that doing an MPI_Allreduce to get n_particles_total when writing the header of the output files is a good idea.
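As a rough illustration of that idea (not Cholla's actual C++ output code), the sketch below uses the mpi4py-style object interface. The function name `gather_particle_counts` is hypothetical, and `comm` is assumed to be an mpi4py communicator such as `MPI.COMM_WORLD`. The `allreduce` call gives every rank the total, so each file header can record it, and the `gather` call additionally collects the per-rank counts on rank 0, which could store them in file 0.

```python
def gather_particle_counts(comm, n_local):
    """Collect particle counts across ranks.

    comm    : an mpi4py-style communicator (assumed, e.g. MPI.COMM_WORLD)
    n_local : number of particles owned by the calling rank

    Returns (n_total, counts), where n_total is the sum over all ranks
    and counts is the list of per-rank counts on rank 0 (None elsewhere).
    """
    # Sum of n_local over all ranks; mpi4py's allreduce defaults to MPI.SUM.
    n_total = comm.allreduce(n_local)
    # Per-rank counts, gathered only on the root rank.
    counts = comm.gather(n_local, root=0)
    return n_total, counts


# Usage under mpi4py (assumed to be installed), run with e.g. mpirun:
#   from mpi4py import MPI
#   n_total, counts = gather_particle_counts(MPI.COMM_WORLD, n_local)
```

Both calls are collective, so every rank has to call the function, including ranks that own zero particles.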