Describe the bug
I am seeing a problem with both the dynamic load balancing feature and the resume feature, which use the function `transfer_norank_particles` in `mesh.tcc`, even on `develop`. The screenshot below was taken after running the 3D hydrostatic benchmark case with dynamic load balancing or with resume; both call `mpi_domain_decompose(false)`, which in turn calls `transfer_norank_particles`.
To Reproduce
Steps to reproduce the behavior:
Compile `develop` with MPI.
Run any benchmark problem with `mpirun -n xx mpm`.
Set resume to true, or add `"nload_balance_steps": 10` to `mpm.json` so that dynamic load balancing is triggered sooner.
See the error.
Expected behavior
The received `ptype` should never be an arbitrary number; otherwise neither resume nor dynamic load balancing can work.
Screenshots
As the screenshot shows, most of the received `ptype` values are correct (equal to 1 for 3D particles), but occasionally a garbage value is received, in this case 1456803152, so the appropriate type cannot be retrieved from the `ParticleTypeName` map. As indicated, there is also a PMIX ERROR while receiving the particle.
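To make the failure mode concrete, here is a minimal sketch of checking a received `ptype` before using it as a map key. It is not the code in `mesh.tcc`: `kParticleTypeName`, its two entries, and `lookup_particle_type` are hypothetical stand-ins for the real `ParticleTypeName` map, and only the value 1 for 3D particles comes from the report above. The sketch assumes C++17 for `std::optional`.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the ParticleTypeName map. The entries are
// assumptions for illustration; only "1 means 3D" is taken from the report.
const std::map<int, std::string> kParticleTypeName = {{0, "P2D"}, {1, "P3D"}};

// Return the type name for a received ptype, or std::nullopt when the value
// is not a known particle type (for example, a corrupted receive buffer).
std::optional<std::string> lookup_particle_type(int ptype) {
  const auto it = kParticleTypeName.find(ptype);
  if (it == kParticleTypeName.end()) return std::nullopt;
  return it->second;
}
```

With this check, `lookup_particle_type(1)` yields a name, while `lookup_particle_type(1456803152)` yields an empty optional that the caller can report instead of failing on the lookup. This only detects the bad value; it does not explain why the receive is corrupted.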
Runtime environment (please complete the following information):
OS/Docker image: Ubuntu 20.04
Branch: develop
Additional context
This is the same reason our nightly build has been failing since #689 was merged. From today's nightly build running the 2D sliding block:
![image](https://user-images.githubusercontent.com/37140224/93889918-fc75b580-fd13-11ea-8b03-53458f7dc4c1.png)

Screenshot of the received `ptype` values from the 3D hydrostatic run described above:
![image](https://user-images.githubusercontent.com/37140224/93889203-2084c700-fd13-11ea-87a7-339c78ddf9a6.png)