Closed: smressle closed this issue 4 years ago
Related #44
Just confirmed that this is an issue both before and after the patch that Chris sent me from the development version, which fixes the primitive/conservative conversion in GR and adds the conserved and primitive scalars to the hydro source-term inputs.
The easiest way to reproduce it is to configure the gr_torus problem with
./configure.py -mpi --cxx=icc --prob=gr_torus --flux=hlle -g -b --nscalars=4 --coord=kerr-schild
add refinement=static to the <mesh> block of the default input file, and then run with
mpirun -np 2 ./athena -i athinput.gr_torus
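For reference, the input-file change looks roughly like the following. This is a hypothetical sketch: static refinement also needs at least one refinement region block, and the bounds and level shown here are placeholders, not values taken from the actual setup.

<mesh>
# (existing <mesh> parameters unchanged)
refinement = static

# hypothetical refinement region; bounds and level are illustrative only
<refinement1>
x1min = 4.0
x1max = 12.0
x2min = 1.0
x2max = 2.1
x3min = 0.0
x3max = 6.283185307179586
level = 1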
I have found the issue. There is a slight error in mesh.cpp in this section of the code:
if (GENERAL_RELATIVITY && multilevel) {
  // prepare to receive primitives
  #pragma omp for private(pmb,pbval)
  for (int i=0; i<nmb; ++i) {
    pmb = pmb_array[i]; pbval = pmb->pbval;
    pbval->StartReceiving(BoundaryCommSubset::gr_amr);
  }
  // send primitives
  #pragma omp for private(pmb,pbval)
  for (int i=0; i<nmb; ++i) {
    pmb = pmb_array[i]; pbval = pmb->pbval;
    pmb->phydro->hbvar.SwapHydroQuantity(pmb->phydro->w,
                                         HydroBoundaryQuantity::prim);
    pmb->phydro->hbvar.SendBoundaryBuffers();
  }
  // wait to receive AMR/SMR GR primitives
  #pragma omp for private(pmb,pbval)
  for (int i=0; i<nmb; ++i) {
    pmb = pmb_array[i]; pbval = pmb->pbval;
    pmb->phydro->hbvar.ReceiveAndSetBoundariesWithWait();
    pbval->ClearBoundary(BoundaryCommSubset::gr_amr);
    pmb->phydro->hbvar.SwapHydroQuantity(pmb->phydro->u,
                                         HydroBoundaryQuantity::cons);
  }
} // multilevel
Note that in StartReceiving and ClearBoundary, the hydro and scalar boundary variables are treated identically. Thus MPI_Start is called on the req_recv request for the scalar variables, but no scalar information is ever sent in this section of the code, so that request remains "open." Then in ClearBoundary, MPI_Wait is called on the req_send request that was never actually started.

The simple fix below just sends/receives the scalars alongside the hydro primitives (a minimal standalone illustration of the request mismatch is given after the patched code). Alternatively, ClearBoundary and StartReceiving could be altered to handle only the hydro variables when the phase is gr_amr, though this would require some distinction to be made between the hydro and scalar boundary variables.
if (GENERAL_RELATIVITY && multilevel) {
  // prepare to receive primitives
  #pragma omp for private(pmb,pbval)
  for (int i=0; i<nmb; ++i) {
    pmb = pmb_array[i]; pbval = pmb->pbval;
    pbval->StartReceiving(BoundaryCommSubset::gr_amr);
  }
  // send primitives
  #pragma omp for private(pmb,pbval)
  for (int i=0; i<nmb; ++i) {
    pmb = pmb_array[i]; pbval = pmb->pbval;
    pmb->phydro->hbvar.SwapHydroQuantity(pmb->phydro->w,
                                         HydroBoundaryQuantity::prim);
    pmb->phydro->hbvar.SendBoundaryBuffers();
    if (NSCALARS > 0) pmb->pscalars->sbvar.SendBoundaryBuffers();
  }
  // wait to receive AMR/SMR GR primitives
  #pragma omp for private(pmb,pbval)
  for (int i=0; i<nmb; ++i) {
    pmb = pmb_array[i]; pbval = pmb->pbval;
    pmb->phydro->hbvar.ReceiveAndSetBoundariesWithWait();
    if (NSCALARS > 0) pmb->pscalars->sbvar.ReceiveAndSetBoundariesWithWait();
    pbval->ClearBoundary(BoundaryCommSubset::gr_amr);
    pmb->phydro->hbvar.SwapHydroQuantity(pmb->phydro->u,
                                         HydroBoundaryQuantity::cons);
  }
} // multilevel
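To illustrate the failure mode in isolation, here is a minimal standalone MPI sketch (not Athena++ code; the buffer size, tag, and two-rank setup are arbitrary). Each rank activates a persistent receive but no matching send is ever posted, so the wait on that request blocks forever, analogous to the scalar req_recv described above.

#include <mpi.h>
#include <vector>

// Run on two ranks (e.g. mpirun -np 2). Each rank activates a persistent
// receive from its partner (the analogue of StartReceiving) but never posts
// the matching send (the analogue of the scalar SendBoundaryBuffers call
// being skipped), so MPI_Wait blocks forever.
int main(int argc, char **argv) {
  MPI_Init(&argc, &argv);
  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  int partner = (rank == 0) ? 1 : 0;

  std::vector<double> buf(16);
  MPI_Request req_recv;
  MPI_Recv_init(buf.data(), static_cast<int>(buf.size()), MPI_DOUBLE,
                partner, 0, MPI_COMM_WORLD, &req_recv);

  MPI_Start(&req_recv);                    // receive is now "open"
  MPI_Wait(&req_recv, MPI_STATUS_IGNORE);  // hangs: no sender ever matches it

  MPI_Request_free(&req_recv);
  MPI_Finalize();
  return 0;
}

The patch above takes the first route described earlier: it posts the matching scalar sends and receives so that every request started during the gr_amr exchange actually completes.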
@c-white any thoughts on applying your patch in https://github.com/PrincetonUniversity/athena-public-version/issues/44#issuecomment-621479219 to this repo and closing this? Is it fixed on the private version?
I seem to have stumbled across a generic issue when attempting to run the code in GR with passive scalars, SMR turned on (I have not tried AMR), and MPI. The code never gets past the first stage of the integration loop and hangs forever. This happens even when the problem generator does nothing except set the initial conditions. All three features need to be enabled for the hang to occur. I also had magnetic fields turned on, though I have not tried turning them off.
It didn't seem like there was an obvious problem in the time_integrator dependencies, but I will keep looking.
I use something like: ./configure.py -mpi --cxx=icc --prob=gr_test --flux=hlle -g -b --nscalars=4 --coord=kerr-schild -debug
and then run with at least 2 processors.
Note that this is with the patch from the development version.