Open hsinhaoHHuang opened 11 months ago
Although I am not sure whether this is the same issue, I found that the results with GAMER_DEBUG ON are not identical to those with GAMER_DEBUG OFF when using GPU (BITWISE_REPRODUCIBILITY is on), while they can be identical when using CPU.
This also happens in the test problem MHD_OrszagTangVortex.
@JeiwHuang It is expected that GAMER_DEBUG may change the simulation results at the machine-precision level, since it enables several additional checks in the fluid solvers.
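As a generic illustration of why extra checks or a different build can shift results at the machine-precision level, the sketch below shows that single-precision addition is not associative: evaluating the same sum in a different order gives a different answer. This is not GAMER code; the helpers `f32` and `add32` are hypothetical names that emulate single-precision rounding with the Python standard library, and the actual mechanism behind the GAMER_DEBUG difference may be something else.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def add32(x, y):
    """Add two values and round the result to single precision."""
    return f32(f32(x) + f32(y))

a, b, c = 1.0e8, -1.0e8, 1.0

# Same three operands, two evaluation orders.
left_to_right = add32(add32(a, b), c)   # (a + b) + c
right_to_left = add32(a, add32(b, c))   # a + (b + c)

print("(a + b) + c =", left_to_right)
print("a + (b + c) =", right_to_left)
```

The two orders disagree because `b + c` is rounded back to `-1.0e8` in single precision, so the contribution of `c` is lost before `a` is added.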
Description
The difference between the CPU and GPU results is larger than machine precision for the test problem Cosmic_Ray_Diffusion.
Procedure to reproduce the issue
1. Modify the source code
1-1. cd src/
1-2. vim TestProblem/Hydro/CR_Diffusion/Init_TestProb_Hydro_CR_Diffusion.cpp
2. Compile
2-1. cp ../example/test_problem/Hydro/CR_Diffusion/generate_make.sh .
2-2. vim generate_make.sh
     --openmp=false --mpi=false --bitwise_reproducibility=true --gpu=false --double=false for Case_SINGLE_CPU
2-3. sh generate_make.sh
2-4. make clean && make
3. Run
3-1. cd ../bin/
3-2. cp -r ../example/test_problem/Hydro/CR_Diffusion .
3-3. cd CR_Diffusion
3-4. mv ../gamer .
3-5. vim Input__TestProb
     CR_Diffusion_Type = 0 for a 3D Gaussian distribution
     CR_Diffusion_Mag_Type = 0 for the uniform magnetic field
     CR_Diffusion_V* = 4.0 for the background velocity
3-6. vim Input__Parameter
     CR_DIFF_PARA = 0.00 to simplify the problem
     OPT__BC_FLU_* = 1 to support the background velocity
3-7. ./gamer 1>>log 2>&1
4. Store the data
4-1. mkdir Case_SINGLE_CPU
4-2. mv Data_0000* Case_SINGLE_CPU/
4-3. mv gamer Case_SINGLE_CPU/
4-4. mv Record__* Case_SINGLE_CPU/
4-5. cp Input__* Case_SINGLE_CPU/
5. Repeat for different combinations
5-1. Repeat Step 2 to Step 4 for the other three cases:
     Case_SINGLE_GPU (--gpu=true --double=false)
     Case_DOUBLE_CPU (--gpu=false --double=true)
     Case_DOUBLE_GPU (--gpu=true --double=true)
6. Compare data
6-1. Compile the tool/analysis/gamer_compare_data for single precision and double precision, and name the executables GAMER_CompareData_SINGLE and GAMER_CompareData_DOUBLE
6-2. ./GAMER_CompareData_DOUBLE -m -c -i Case_DOUBLE_CPU/Data_000010 -j Case_DOUBLE_GPU/Data_000010 -o Compare_DOUBLE_CGPU
6-3. ./GAMER_CompareData_SINGLE -m -c -i Case_SINGLE_CPU/Data_000010 -j Case_SINGLE_GPU/Data_000010 -o Compare_SINGLE_CGPU
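
The four build configurations used in Steps 2 to 5 can be enumerated programmatically, which may help avoid mixing up flags and directory names. The following is a minimal sketch: the option strings and the Case_* directory names are taken from the procedure above, while the helper names `case_name` and `make_flags` are hypothetical and not part of GAMER.

```python
from itertools import product

# Options that stay fixed in every case of the procedure.
COMMON_FLAGS = "--openmp=false --mpi=false --bitwise_reproducibility=true"

def case_name(gpu, double):
    """Directory name used to store the results of one configuration."""
    precision = "DOUBLE" if double else "SINGLE"
    device = "GPU" if gpu else "CPU"
    return f"Case_{precision}_{device}"

def make_flags(gpu, double):
    """Option string to place in generate_make.sh for one configuration."""
    return (f"{COMMON_FLAGS} "
            f"--gpu={str(gpu).lower()} --double={str(double).lower()}")

# Iterate over precision first, then device, so CPU/GPU pairs are adjacent.
cases = {case_name(gpu, double): make_flags(gpu, double)
         for double, gpu in product([False, True], repeat=2)}

for name, flags in cases.items():
    print(f"{name}: {flags}")
```

Each CPU/GPU pair sharing the same precision is then the input to one comparison in Step 6.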
Results
Compare_SINGLE_CGPU:
Compare_DOUBLE_CGPU:
Summary of tests I have done
1
The difference is small initially, and the problem appears suddenly within a few steps.
It is reproducible by rerunning it or restarting it from an intermediate snapshot.
The difference still exists when there is no OpenMP, no MPI, FLU_BLOCK_SIZE_X=1, FLU_GPU_NPGROUP=1, and no AMR.
2
The difference still exists even with CR_DIFF_*= 0.0.
The difference still exists even with the compilation option CR_DIFFUSION turned off.
The problem seems related to the magnetic field, especially when the magnetic field is too strong.
BlastWave with MHD and MHD_OrszagTangVortex do not have this issue, as far as I have tested.
3
The difference still exists even with a non-zero background velocity and a three-dimensional magnetic field.
src/Model_Hydro/CPU_Hydro/CPU_Shared_ConstrainedTransport.cpp:411, in dE_Upwind(), if ( FABS(FC_Vel) <= MAX_ERROR ) -> if ( true ): For some cases without background velocities, this reduces the difference, and the difference with double precision is then 7 orders of magnitude smaller than the difference with single precision. But in general, it doesn't solve all the cases.
4
GAMER_DEBUG does not print anything weird.
More details about test problem related parameters
For CR_Diffusion_Type=0 (Gaussian distribution ball) and CR_Diffusion_Mag_Type=0 (uniform magnetic field):
The difference is in machine precision for standard 2D diffusion, CR_Diffusion_G{X,Y,Z}={1,1,0} and CR_Diffusion_Mag{X,Y,Z}={5,5,0}
The problem appears when the Gaussian is 3D, CR_Diffusion_G{X,Y,Z}={1,1,1} and CR_Diffusion_Mag{X,Y,Z}={5,5,0}
The problem appears when the magnitude of the magnetic field becomes large, CR_Diffusion_G{X,Y,Z}={1,1,0} and CR_Diffusion_Mag{X,Y,Z}={10,10,0}
The difference is in machine precision for
CR_Diffusion_G{X,Y,Z}={1,1,0} and CR_Diffusion_Mag{X,Y,Z}={1,0,0}
CR_Diffusion_G{X,Y,Z}={1,1,0} and CR_Diffusion_Mag{X,Y,Z}={5,0,0}
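
To make "in machine precision" concrete, one possible check is to compare the maximum relative error between two result sets against the machine epsilon of the working precision. The sketch below is only an assumption about how such a criterion could be written: the function names, the sample values, and the tolerance factor of 100 are all hypothetical, and the metric reported by gamer_compare_data may be defined differently.

```python
import sys

EPS_DOUBLE = sys.float_info.epsilon  # 2**-52, IEEE 754 binary64
EPS_SINGLE = 2.0 ** -23              # machine epsilon of IEEE 754 binary32

def max_relative_error(a, b):
    """Largest |a_i - b_i| / max(|a_i|, |b_i|) over two equal-length sequences."""
    worst = 0.0
    for x, y in zip(a, b):
        scale = max(abs(x), abs(y))
        if scale > 0.0:  # skip pairs where both values are exactly zero
            worst = max(worst, abs(x - y) / scale)
    return worst

def at_machine_precision(a, b, eps, tolerance_factor=100.0):
    """True if the two sequences agree to within a small multiple of eps.

    tolerance_factor is an arbitrary illustrative choice, not a GAMER setting.
    """
    return max_relative_error(a, b) <= tolerance_factor * eps

# Made-up sample values standing in for cell data from two runs.
cpu = [1.0, 2.0, 3.0]
gpu_roundoff = [1.0, 2.0 * (1.0 + EPS_DOUBLE), 3.0]  # differs by one rounding step
gpu_deviated = [1.0, 2.0 * (1.0 + 1.0e-6), 3.0]      # differs at the 1e-6 level

print(at_machine_precision(cpu, gpu_roundoff, EPS_DOUBLE))
print(at_machine_precision(cpu, gpu_deviated, EPS_DOUBLE))
print(at_machine_precision(cpu, gpu_deviated, EPS_SINGLE))
```

A relative error of 1e-6 would count as far above roundoff in double precision but could still pass a loose single-precision threshold, which is why the single and double comparisons need to be judged against their own epsilon.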