CmPA / iPic3D

Particle-in-Cell code using the implicit moment method
72 stars 55 forks source link

restrict particle tracking to passive tracking species #69

Open alecjohnson opened 10 years ago

alecjohnson commented 10 years ago

Currently particle tracking is turned on per species via a TrackParticleID flag. If this flag is set, then an extra field, ParticleID, is tracked for each particle.

This approach makes it difficult to optimize performance due to the variable size of a particle and the need for conditional execution of code related to the particle ID in inner loops.

I see particles being used for three distinguishable purposes:

  1. active: summing moments to drive the fields
  2. passive: representing the distribution of particles
  3. passive: studying particle acceleration

Historically we have coupled these three functions. As far as I know, tracking particle IDs has been used only for the third purpose and is not used in the vast majority of our simulations.

Therefore, I propose that we separate tracking of particles from summing moments. That is, if TrackParticleID is turned on for a species, then we will automatically enforce that rhoINIT is zero for that species (in which case it makes no contribution to the summation of moments). In this way, we can reuse the charge for the particle ID, preventing us from having to extend the particle size to two cache lines, as detailed below.

I would like to be able to assume that particle data fits in a single cache line (8 doubles on Intel architectures). I am defining a particle as follows:

class SpeciesParticle
{
 private: // data
  double u; // velocity
  double v;
  double w;
  double q; // charge
  double x; // position
  double y;
  double z;
  double t; // subcycle time
 public: // methods
  [...]
};

This exactly fits a cache line, which accelerates sorting of particles.

Additionally, I think that we should support saving tracking particles with a different frequency compared to other particles. I suggest that we change the ParticlesOutputCycle field in e.g. inputfiles/GEM.inp to work something like this:

# initial density
rhoINIT =  1  1  0  0
# TrackParticleID[species] = 1=true, 0=false --> Assign ID to particles
TrackParticleID = 0  0  1  1
# qom = charge to mass ratio for different species
qom =  -256   1  -256  1
#0 means never save, 1 means save every cycle 
ParticlesOutputCycle  0  0  1  1

If field data is saved with every time step and the mover is sufficiently close to time-reversible (i.e. is iterated to convergence), then we can track fast particles backward in time to see how they were accelerated and there is no need for tracking particle IDs. But if we do not save field data with every time step, then we need to save particle IDs if we want to track how particles are accelerated. One possibility would be to observe which particles get accelerated and then rerun the simulation, saving those particles with every time step. (Note that if field data is saved with every time step and particles carry ID information, then we could rerun forward particle evolution just for the fast particles, saving them with each time step; this would avoid the need to use a time-reversible pusher.)

Handling of tracking and non-tracking particles would differ in the following ways:

Here is an example of how to generate unique particle IDs:

class ParticleIDgenerator
{
  long long* counter;
 public: 
  void init(long long lowest, long long highest = std::numeric_limits<long long>::max())
  {
     int num_procs = MPIdata::get_nprocs();
     long long max_pcls = highest - lowest; // adding 1 to difference could overflow...
     long long max_pcls_per_proc = max_pcls / num_procs;
     long long counter_proc_offset = lowest + max_pcls_per_proc * MPIdata::get_rank();
     long long max_pcls_per_thread = max_pcls_per_proc / omp_get_max_threads();
     counter = new long long[number_of_threads_per_process];
     for(int i=0;i<number_of_threads_per_process;i++)
     {
       counter[i] = counter_proc_offset + i*max_pcls_per_thread;
     }
  }
  long long get_ID()
  {
    return counter[omp_get_thread_num()]++;
  }
}

Supposing that the initial set of particles is a predetermined number initial_number_of_pcls (e.g. initial_number_of_pcls = nop*total_number_of_processes), the initial set of particles could have consecutive IDs from 0 to initial_number_of_pcls-1; this class could then be initialized as:

  particleIDgenerator.init(initial_number_of_pcls);

There is an issue of whether to cast particle IDs as (long long) or simply to work with doubles. With double, there is an issue that consecutive integers are representable only up to 2^53, whereas with long long, consecutive integers are representable up to 2^64. A billion processors each generating a billion particles would take us to 2^60 particles. The one conceivable purpose that I can see for saving such a huge number of particles with IDs is so that after running the simulation and identifying which particles get accelerated, we can go back again and run it again, tracking how those particles get accelerated (and this time saving them with each time step).

In any case, I want particle IDs to be stored and communicated as double. I want to avoid use of MPI_Pack out of concern for vectorization. This means that either we either use double for the ID and give up on supporting more than 2^53 consecutive integer IDs or we cast to/from long long when communicating and give up on correct MPI communication of particle IDs between big-endian and little-endian architectures (hard to imagine a use case for this).