Describe the bug
When running a wave propagation solver (e.g. acoustic), if the source lies on the boundary of 2 or more elements, the amplitude of the source signal can be counted multiple times, or not at all, resulting in a wrong result.
This was reported in issue #2256 but closed as a duplicate of #2227. However, the problem persists. Could it be due to parallel concurrency?
To Reproduce
Launch an xml file with the following geometry and solver numerous times. The source is at the origin and the mesh is such that the origin is a mesh node. The complete file is provided at the end.
Screenshots
Here is a plot of the different signals obtained with the same piece of code. The amplitude varies from 0 to twice the expected value. 99 simulations have been launched. The source is at the origin, which belongs to 8 elements; the receiver is nearby, at (1, 2, 3), inside an element.
Platform (please complete the following information):
Machine: pangea3
Compiler: tested with gcc
GEOSX Version: develop
Additional information
If a source is on a mesh node, then it belongs to 8 elements. Currently, GEOS loops over each element and checks whether the source is inside the element (boundary included). If so, GEOS pre-computes the parameters and quantities for the source. As the loop over the elements is executed in parallel, a race condition can appear.
The loop over the elements is in the function Compute1DSourceAndReceiverConstantsWithElementsAndRegionStorage in the file
vti/src/coreComponents/physicsSolvers/wavePropagation/PrecomputeSourcesAndReceiversKernel.hpp
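To make the suspected race concrete, here is a minimal, self-contained sketch. It is an illustration under assumptions, not the GEOS kernel: the OpenMP loop stands in for the parallel policy used by GEOS, and the written values are placeholders for elemsToNodes[k][a] and the shape functions Ntest[a]. Every element containing the source node passes the "is the source inside this element" test, so several iterations write the same source entries concurrently without synchronization:

// Sketch of the suspected race, NOT the GEOS kernel (see the lead-in above).
#include <cstdio>
#include <vector>

int main()
{
  const int numElemsSharingNode = 8;   // a source on a mesh node belongs to 8 hexahedra
  const int numNodesPerElem = 8;

  // Single source: per-node ids and constants, filled by whichever element "finds" it.
  std::vector< int >    sourceNodeIds( numNodesPerElem, -1 );
  std::vector< double > sourceConstants( numNodesPerElem, 0.0 );

  #pragma omp parallel for
  for( int k = 0; k < numElemsSharingNode; ++k )
  {
    const bool sourceInsideElement = true;  // true for all 8 elements (boundary included)
    if( sourceInsideElement )
    {
      for( int a = 0; a < numNodesPerElem; ++a )
      {
        sourceNodeIds[a]   = 100 * k + a;        // placeholder for elemsToNodes[k][a]
        sourceConstants[a] = 1.0 / ( 1.0 + k );  // placeholder for Ntest[a]
      }
    }
  }

  // Each entry has been written 8 times with no synchronization; the surviving
  // values (and their pairing) depend on thread scheduling, run after run.
  for( int a = 0; a < numNodesPerElem; ++a )
  {
    printf( "sourceNodeIds[%d] = %d, sourceConstants[%d] = %.2f\n",
            a, sourceNodeIds[a], a, sourceConstants[a] );
  }
  return 0;
}

Built with -fopenmp, the printed ids and constants can pair up inconsistently from one run to the next; in the real kernel an analogous effect could explain the amplitude varying between 0 and twice the expected value.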
In case it helps, here is a piece of the log of Compute1DSourceAndReceiverConstantsWithElementsAndRegionStorage, obtained through (ugly) printf calls:
// file PrecomputeSourcesAndReceiversKernel.hpp
// Function Compute1DSourceAndReceiverConstants
for( localIndex a = 0; a < numNodesPerElem; ++a )
{
sourceNodeIds[isrc][a] = elemsToNodes[k][a];
sourceConstants[isrc][a] = Ntest[a];
// ugly but useful printf :-)
printf("sourceNodeIds[%d][%d] = %d \n", isrc, a, elemsToNodes[k][a]);
printf("sourceConstants[%d][%d] = %.2f \n", isrc, a, Ntest[a]);
}
Part of the resulting printf output, for a case where the source signal went to zero:
The source information sourceConstants[0][0] and sourceNodeIds[0][0] have been changed 8 times... Same for the 8 other nodes.
Same problem for Receivers
The same problem happens for receivers. In addition, if a receiver is at the boundary between 2 subdomains (of the domain decomposition method), then GEOS crashes unexpectedly. To observe that, simply change the receiver's position to, e.g., the origin: receiverCoordinates="{ { 0, 0, 0 }}"
Then launch it in parallel with a domain decomposition of 2 in the x direction (here, 2 in each direction), for example:
mpirun -n 8 geosx -i issue.xml -x 2 -y 2 -z 2
You should get something like
***** ERROR
***** LOCATION: src/coreComponents/physicsSolvers/wavePropagation/WaveSolverUtils.hpp:108
***** Controlling expression (should be false): nReceivers != total
***** Rank 0: : Invalid distribution of receivers: nReceivers=1 != MPI::sum=8.
Maybe this is due to the check on ghostElement in Compute1DSourceAndReceiverConstantsWithElementsAndRegionStorage?
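In case it helps to picture the failure mode, here is a hedged, standalone sketch of a cross-rank receiver count check in the spirit of the controlling expression reported above. It is an assumption for illustration, not the code of WaveSolverUtils.hpp; the name foundLocally is made up, while nReceivers and total come from the error message. If a receiver sitting on a subdomain interface is claimed by every rank touching it, the MPI sum exceeds the declared number of receivers and the check aborts, which matches nReceivers=1 != MPI::sum=8 with 8 ranks:

// Sketch of a cross-rank receiver count check, NOT the GEOS code (see above).
#include <mpi.h>
#include <cstdio>

int main( int argc, char ** argv )
{
  MPI_Init( &argc, &argv );

  int rank = 0;
  MPI_Comm_rank( MPI_COMM_WORLD, &rank );

  const int nReceivers = 1;    // receivers declared in the XML (here, a single one at the origin)
  const int foundLocally = 1;  // illustrative: every rank touching the interface claims it

  int total = 0;
  MPI_Allreduce( &foundLocally, &total, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD );

  // Same spirit as the controlling expression reported above: nReceivers != total.
  if( nReceivers != total )
  {
    if( rank == 0 )
    {
      printf( "Invalid distribution of receivers: nReceivers=%d != MPI::sum=%d.\n",
              nReceivers, total );
    }
    MPI_Abort( MPI_COMM_WORLD, 1 );
  }

  MPI_Finalize();
  return 0;
}

Conversely, if a ghostElement filter made every rank skip the receiver, the sum would drop to 0 and the same check would fire the other way, so either kind of miscount would produce this abort.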
xml complete file
The complete xml is the following:
Thank you and sorry for the long issue!