Congratulations on raising your first issue on the phantom tracker!
Would it be possible to post the files needed to reproduce this issue?
These simulations were produced using the included .in and .setup files. (Due to a small oversight, they did not use the normal defaults for the collapse simulations, but both tests used the same values.)
The setup block is SETUP=jet. All tests were run locally on my MacBook Pro, compiled using SYSTEM=gfortran.
For openMP:
export OMP_NUM_THREADS=6
./phantom jet.in

For hybrid (compiling with MPI=yes):
export OMP_NUM_THREADS=3
mpiexec -np 2 ./phantom jet.in
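For completeness, the build steps that precede the above (a sketch using the standard Phantom make targets; the prefix passed to phantomsetup is illustrative):

make SETUP=jet SYSTEM=gfortran              # openMP-only binary
make SETUP=jet SYSTEM=gfortran setup        # builds phantomsetup
./phantomsetup jet                          # prefix is illustrative; reads jet.setup and writes the .in file plus the initial dump

make clean
make SETUP=jet SYSTEM=gfortran MPI=yes      # hybrid MPI+openMP binary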
jet.setup:
# input file for sphere-in-box setup routines
# units
dist_unit = 1.0d16cm ! distance unit (e.g. au)
mass_unit = solarm ! mass unit (e.g. solarm)
# resolution
np = 300000 ! requested number of particles in sphere
# options for box
xmin = -8.000 ! x min
xmax = 8.000 ! x max
ymin = -8.000 ! y min
ymax = 8.000 ! y max
zmin = -8.000 ! z min
zmax = 8.000 ! z max
# intended result
form_binary = F ! the intent is to form a central binary
# options for sphere
use_BE_sphere = F ! centrally condense as a BE sphere
r_sphere = 4.000 ! radius of sphere in code units
density_contrast = 30. ! density contrast in code units
totmass_sphere = 1.000 ! mass of sphere in code units
cs_sphere_cgs = 2.189E+04 ! sound speed in sphere in cm/s
angvel = 1.770E-13 ! angular velocity in rad/s
masstoflux = 5.000 ! mass-to-magnetic flux ratio in units of critical value
ang_Bomega = 180. ! Angle (degrees) between B and rotation axis
jet.in:
# Runtime options file for Phantom, written 24/07/2020 10:35:38.8
# Options not present assume their default values
# This file is updated automatically after a full dump
# job name
logfile = jetomp01.log ! file to which output is directed
dumpfile = jetomp_00000.tmp ! dump file to start from
# options controlling run time and input/output
tmax = 10. ! end time
dtmax = 0.0888576587635 ! time between dumps
nmax = -1 ! maximum number of timesteps (0=just get derivs and stop)
nout = -1 ! number of steps between dumps (-ve=ignore)
nmaxdumps = 2 ! stop after n full dumps (-ve=ignore)
twallmax = 000:00 ! maximum wall time (hhh:mm, 000:00=ignore)
dtwallmax = 012:00 ! maximum wall time between dumps (hhh:mm, 000:00=ignore)
nfulldump = 10 ! full dump every n dumps
iverbose = 0 ! verboseness of log (-1=quiet 0=default 1=allsteps 2=debug 5=max)
# options controlling accuracy
C_cour = 0.300 ! Courant number
C_force = 0.250 ! dt_force number
tolv = 1.000E-02 ! tolerance on v iterations in timestepping
hfact = 1.200 ! h in units of particle spacing [h = hfact(m/rho)^(1/3)]
tolh = 1.000E-04 ! tolerance on h-rho iterations
tree_accuracy = 0.500 ! tree opening criterion (0.0-1.0)
# options controlling hydrodynamics, artificial dissipation
alpha = 1.000 ! MINIMUM art. viscosity parameter
alphamax = 1.000 ! MAXIMUM art. viscosity parameter
alphaB = 1.000 ! art. resistivity parameter
psidecayfac = 1.000 ! div B diffusion parameter
overcleanfac = 1.000 ! factor to increase cleaning speed (decreases time step)
hdivbbmax_max = 1.000 ! max factor to decrease cleaning timestep propto B/(h|divB|)
beta = 2.000 ! beta viscosity
avdecayconst = 0.100 ! decay time constant for viscosity switches
# options controlling damping
idamp = 0 ! artificial damping of velocities (0=off, 1=constant, 2=star)
# options controlling equation of state
ieos = 1 ! eqn of state (1=isoth;2=adiab;3=locally iso;8=barotropic)
mu = 2.381 ! mean molecular weight
# options controlling sink particles
icreate_sinks = 0 ! allow automatic sink particle creation
h_soft_sinksink = 0.000 ! softening length between sink particles
f_acc = 0.800 ! particles < f_acc*h_acc accreted without checks
# options relating to external forces
iexternalforce = 0 ! 1=star,2=coro,3=bina,4=prdr,5=toru,6=toys,7=exte,8=spir,9=Lens,10=dens,11=Eins,
# options controlling physical viscosity
irealvisc = 0 ! physical viscosity type (0=none,1=const,2=Shakura/Sunyaev)
shearparam = 0.100 ! magnitude of shear viscosity (irealvisc=1) or alpha_SS (irealvisc=2)
bulkvisc = 0.000 ! magnitude of bulk viscosity
# options for injecting/removing particles
rkill = -1.000 ! deactivate particles outside this radius (<0 is off)
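As a side note on the parameter choices: the dtmax value in jet.in appears to be one per cent of the free-fall time of the uniform sphere defined in jet.setup. With G = 1 in code units, ρ = totmass_sphere / (4π r_sphere³/3) ≈ 3.73×10⁻³, so t_ff = √(3π/(32ρ)) ≈ 8.886 code units, giving t_ff/100 ≈ 0.0889 ≈ dtmax; tmax = 10 is then roughly 1.1 t_ff. This is pure arithmetic, not anything Phantom-specific, but it explains where the odd-looking number comes from.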
So one thing here is that the tree is constructed differently with MPI compared to openMP, so it's possible it's just a slight difference in the gravity calculation?
These differences look big enough that I would suspect they are not simply due to a difference in the tree build. Rhomax is similar for both, and rho_ave is included below (with corrected legend). Also, Bmax is similar for both, but h|divB|/B is really bad for the hybrid run (both the maximum and average values). I wonder if this is related to some of the current build issues.
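For anyone reproducing the comparison: diagnostics of this kind (rho max/ave, h|divB|/B max/ave) are written to the time-evolution (.ev) files alongside the dumps, so the comparison can be redone from those. A minimal way to pull the relevant column, with the file names and column number as placeholders to replace after checking the header line of your own .ev files:

head -1 jetomp01.ev                                      # header line lists the column labels (openMP run)
head -1 jetmpi01.ev                                      # hybrid run; file name is illustrative
awk '!/^#/ {print $1, $20}' jetomp01.ev > omp_divb.dat   # column 20 is a placeholder for h|divB|/B max
awk '!/^#/ {print $1, $20}' jetmpi01.ev > mpi_divb.dat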
An additional issue is easily reproducible by running the nonidealmhd test. For the first test, different dt_courant values are produced depending on whether the code is compiled with MPI=yes or MPI=no. Both values are included in the test script so that all tests pass. I have yet to look into this.
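A sketch of that comparison, assuming the standard phantomtest workflow (the SETUP and the test label for the non-ideal MHD test are assumptions from memory; the point is only comparing the MPI=no and MPI=yes builds of the same test):

make SETUP=test SYSTEM=gfortran phantomtest      # openMP-only test binary
./phantomtest nimhd                              # 'nimhd' label is an assumption; no argument runs the full suite

make clean
make SETUP=test SYSTEM=gfortran MPI=yes phantomtest
mpiexec -np 2 ./phantomtest nimhd

then compare the dt_courant values reported in the two logs.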
EDIT: 23 March 2021:
This is no longer an issue now that the race condition in the divcurlB calculation has been removed.
These issues remain even without the race condition in the divcurlB calculation. Decreasing tree_accuracy helps, but not nearly enough; it has a trivial effect on the openMP-only model, but a noticeable effect on the hybrid model.
Note: the above plots used ieos=1, although I meant to use ieos=8. The comments above are independent of ieos.
This does not appear to be an MHD issue, since hydro versions of these collapses yield similar conclusions.
Note: running the first 15 steps takes 1.9 min with openMP but 4.2 min for the hybrid run; for the purely hydro collapses, it takes 64 s with openMP but 2.3 min for the hybrid run. I have not tested how these timings scale for larger and longer simulations.
These differences are likely because particle waking is not implemented in MPI (#142); closing this issue as it is likely a duplicate of #142.
To test MPI, I modelled the gravitational collapse of a magnetised sphere. My runs were pure openMP (mislabelled as 'hybrid' on the line plots) and hybrid MPI+openMP (labelled as MPI). The attached graphs show the kinetic and potential energies over the early stages of the collapse, and the density at dump 10 as plotted by splash (red = openMP, black = hybrid). Clearly the results are not the same, suggesting there is a bug in the MPI implementation.