ClimateGlobalChange / tempestremap

TempestRemap: Remapping software for climate applications

Bit for bit issues #116

Closed iulian787 closed 2 months ago

iulian787 commented 1 year ago

We are trying to make all computations robust enough that running on a different number of processes, or on a different layout, gives the same results. Agreement to within "machine epsilon" is not enough for this; we need so-called bit-for-bit reproducibility. This is in general hard to achieve when many operations (multiplications, additions) are involved. Basically, in floating-point arithmetic, additions and multiplications are not associative, so the order of operations can change the last bits of the result.
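
To make the order-of-operations point concrete, here is a minimal Python sketch. The helper names `naive_sum` and `chunked_sum` are hypothetical and only mimic how a reduction over several ranks might group partial sums; this is not MOAB or TempestRemap code.

```python
import math

# Regrouping the same three terms changes the rounded result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

def naive_sum(values):
    """Plain left-to-right accumulation in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, nchunks):
    """Sum each contiguous chunk first, then combine the partial sums.

    Hypothetical stand-in for a reduction over `nchunks` processes.
    """
    n = len(values)
    bounds = [n * i // nchunks for i in range(nchunks + 1)]
    partials = [naive_sum(values[bounds[i]:bounds[i + 1]])
                for i in range(nchunks)]
    return naive_sum(partials)

values = [0.1] * 10
serial = chunked_sum(values, 1)     # one "process"
two_ranks = chunked_sum(values, 2)  # two "processes"
exact = math.fsum(values)           # correctly rounded sum, independent of order

print(serial, two_ranks, exact)
```

The same ten numbers give different totals depending on how they are grouped, which is the kind of layout dependence described above. `math.fsum` is shown only as an order-independent reference, not as what any of the tools in this issue use.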

Before we get to serial-parallel differences in results, we would like to understand more about intersection and mapping, and where the differences might come from.

In this example, we compute a bilinear map between a CS mesh and a polygonal mesh on a sphere, for 2 source meshes that differ only in the order in which the first element's nodes are listed: one mesh has first-cell connectivity 1, 9, 189, 69 and the other has 9, 189, 69, 1. This is the same mesh, no? But we do get 2 different maps in the end :(

I am attaching the script I am running; I will try attaching the files too.


MOAB_DIR=/home/iulian/lib/moab/spack20/bin
TEMPESTREMAP_DIR=${MOAB_DIR}
#  GenerateICOMesh --res 12 --dual
# outICOMesh.g
# GenerateCSMesh --res 16
# outCSMesh.g
# ncdump outCSMesh.g > outCSMesh.g.ncdump
# modify outCSMesh.g.ncdump to outCSMesh_mod.g.ncdump
# diff outCSMesh.g.ncdump outCSMesh_mod.g.ncdump
#   1595c1595
#   <   1, 9, 189, 69,
#   ---
#   >   9, 189, 69, 1,

# ncgen -b -k3 outCSMesh_mod.g.ncdump -o outCSMesh_mod.g
# ncgen -b -k3 outCSMesh.g.ncdump -o outCSMesh_org.g

SRCMESH_TR="outCSMesh_org.g"
SRCMESH_TR_MOD="outCSMesh_mod.g"
TGTMESH_TR="outICOMesh.g"
$TEMPESTREMAP_DIR/GenerateOverlapMesh --b $SRCMESH_TR --a $TGTMESH_TR --method exact --allow_no_overlap --out overlap.g
$TEMPESTREMAP_DIR/GenerateOfflineMap --in_mesh $SRCMESH_TR --out_mesh $TGTMESH_TR --ov_mesh overlap.g --method bilin --out_map map_weights_bilin_tr.nc
$TEMPESTREMAP_DIR/GenerateOverlapMesh --b $SRCMESH_TR_MOD --a $TGTMESH_TR --method exact --allow_no_overlap --out overlap_mod.g
$TEMPESTREMAP_DIR/GenerateOfflineMap --in_mesh $SRCMESH_TR_MOD --out_mesh $TGTMESH_TR --ov_mesh overlap_mod.g --method bilin --out_map map_weights_bilin_tr_mod.nc

$MOAB_DIR/mbcmpmaps -i map_weights_bilin_tr.nc -j map_weights_bilin_tr_mod.nc

The differences we see now in the map files are these:

 $MOAB_DIR/mbcmpmaps -i map_weights_bilin_tr.nc -j map_weights_bilin_tr_mod.nc
 opened map_weights_bilin_tr.nc for map 1 
 opened map_weights_bilin_tr_mod.nc for map 2 
 n_a, n_b, n_s : 1536, 1442, 5767 for map 1 
 n_a, n_b, n_s : 1536, 1442, 5767 for map 2 
 euclidian norm for difference: 0 
 squared norm for difference: 0
 minv: 0 maxv: 0
frac_a diff norm: 3.33066907e-16 min at 1 : -2.22044605e-16 max at 0 : 2.22044605e-16
frac_b diff norm: 0 min at 0 : 0 max at 0 : 0
area_a diff norm: 3.46944695e-18 min at 0 : -3.46944695e-18 max at 1 : 0
area_b diff norm: 3.46944695e-18 min at 450 : -3.46944695e-18 max at 0 : 0
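
To put the magnitudes in this output in context, the following Python sketch expresses them in units in the last place (ULPs). It assumes a typical cell area of about 4*pi/n_a on the unit sphere and frac values close to 1; both are assumptions, not values read from the map files.

```python
import math

# Differences copied from the mbcmpmaps output in this comment.
area_diff = 3.46944695e-18
frac_diff = 2.22044605e-16

# Assumption: cell areas are of order 4*pi / n_a on the unit sphere (n_a = 1536).
typical_area = 4.0 * math.pi / 1536

ulp_area = math.ulp(typical_area)  # spacing of doubles near a typical cell area
ulp_one = math.ulp(1.0)            # machine epsilon, 2**-52

print(area_diff / ulp_area)  # area difference measured in ULPs
print(frac_diff / ulp_one)   # frac difference measured in ULPs
```

Under these assumptions the reported differences are only one or two ULPs, which is what a change in the order of a few floating-point operations would produce. `math.ulp` requires Python 3.9 or later.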

So we changed just one cell, giving its nodes a cyclically permuted order, and the areas of some cells are different. This shows that the areas computed by TR differ between the 2 orderings: 1, 9, 189, 69 has a different area compared to 9, 189, 69, 1.

The S values in the maps are the same; only the areas are different.
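
One way a cyclic permutation could change a computed area in the last bits is sketched below in Python. It assumes, purely for illustration, that a cell's area is accumulated over a fan of spherical triangles anchored at the first listed node; the area routine TR actually uses is not shown in this thread. The quadrilateral coordinates and all function names are hypothetical.

```python
import math

def normalize(v):
    """Project a 3D point onto the unit sphere."""
    n = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    return (v[0] / n, v[1] / n, v[2] / n)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_area(a, b, c):
    """Spherical triangle area (solid angle) for unit vectors a, b, c,
    using the Van Oosterom-Strackee formula."""
    triple = dot(a, cross(b, c))
    denom = 1.0 + dot(a, b) + dot(b, c) + dot(c, a)
    return 2.0 * math.atan2(abs(triple), denom)

def fan_area(nodes):
    """Sum triangle areas over a fan anchored at the FIRST listed node.

    Assumption for illustration only; not taken from TempestRemap.
    """
    total = 0.0
    for i in range(1, len(nodes) - 1):
        total += triangle_area(nodes[0], nodes[i], nodes[i + 1])
    return total

# Hypothetical convex quadrilateral on the unit sphere.
quad = [normalize(p) for p in [(1.0, -0.10, -0.10),
                               (1.0,  0.12, -0.09),
                               (1.0,  0.11,  0.13),
                               (1.0, -0.08,  0.10)]]

# Same cell with connectivity rotated by one, as in 1, 9, 189, 69 -> 9, 189, 69, 1.
rotated = quad[1:] + quad[:1]

area_a = fan_area(quad)
area_b = fan_area(rotated)
print(area_a, area_b, area_a - area_b)
```

The two orderings split the same quadrilateral along different diagonals, so the same mathematical area is reached through different floating-point operations. Any difference printed here would be at roundoff level; whether it is exactly zero depends on the coordinates.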

iulian787 commented 2 months ago

The new test I performed showed that the map is the same if just one cell's connectivity is permuted. The computed areas are slightly different for both (?) source and target, even though only one source cell was modified. Still, we are closing the issue.