AMReX-Codes / amrex

AMReX: Software Framework for Block Structured AMR
https://amrex-codes.github.io/amrex

The parallel efficiency of the function "cacheNeighborInfo" in "Particle" #3827

Open Horizonatpku opened 5 months ago

Hi everyone,

I have tested the parallel efficiency of the function cacheNeighborInfo, which is called by the function fillNeighborsCPU.

When the number of particles exceeds one million, cacheNeighborInfo takes longer on 32 cores than on 8 cores. I wonder whether this is because cacheNeighborInfo is parallelized with OpenMP rather than MPI. The same thing happens in the function Redistribute. Here is part of the code in cacheNeighborInfo:

  int thread_num = OpenMP::get_thread_num();
  const int& grid = pti.index();
  const int& tile = pti.LocalTileIndex();
  PairIndex src_index(grid, tile);
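
The 8-core versus 32-core comparison above can be expressed as a strong-scaling efficiency relative to the smaller run. The following is only a minimal sketch in plain C++: the helper name `relative_efficiency` is hypothetical and not part of AMReX, and the timings passed to it would have to come from actual measurements of cacheNeighborInfo.

```cpp
// Hypothetical helper (not an AMReX function): strong-scaling efficiency of a
// run on `cores` cores relative to a baseline run on `base_cores` cores.
//
//   efficiency = (base_time * base_cores) / (time * cores)
//
// A value of 1.0 means ideal scaling. A value below base_cores / cores means
// the larger run is slower in absolute wall time than the baseline, which is
// the situation described above.
double relative_efficiency(double base_time, int base_cores,
                           double time, int cores)
{
    return (base_time * base_cores) / (time * cores);
}
```

For example, if a routine took twice as long on 32 cores as on 8 cores, `relative_efficiency(t8, 8, 2.0 * t8, 32)` would give 0.125, below the 0.25 threshold at which the two wall times are equal.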

I would appreciate it if you could give me some help.

Best wishes, Yifeng He