GEOS-DEV / GEOS

GEOS Simulation Framework
GNU Lesser General Public License v2.1
222 stars 89 forks source link

[Bug] Parallel computation creates an erroneous hdf5 of wavefield with acoustic solver - seismic modeling #2073

Closed aguitton closed 2 years ago

aguitton commented 2 years ago

Describe the bug When running an acoustic simulation in parallel, the wavefields saved in the hdf5 are not reassembled correctly. The information is in the hdf5 but at the wrong spatial locations. Running on one node yields the correct result.

To Reproduce On sherlock (Stanford):

sbatch slurm_run.sh with slurm_run.sh

#!/bin/bash
#
#SBATCH --job-name=geosx
#
#SBATCH --time=00:30:00
#SBATCH --nodes=4
#SBATCH --ntasks=4
#SBATCH --gpus-per-node=1
#SBATCH --partition=gpu
source /home/groups/biondo/aguitton/to_source_gpu
nvidia-smi
echo “number of tasks ” ${SLURM_NTASKS}
OMP_NUM_THREADS=1 srun --mpi=pmi2 geosx -i awe.xml -x 2 -y 2 -z 1 -t runtime-report,max_column_width=200 &> task.log

where to_source_gpu is

module purge
module load system
module load cmake
module load openmpi/4.0.3
module load openblas/0.3.10
module load gcc/8.1.0
ml python/3.6.1 py-numpy/1.19.2_py36
module load cuda/10.2.89

and awe.xml is

<?xml version="1.0" ?>
<Problem>
  <Solvers>
    <!-- define the solver -->
    <!-- define the source coordinates -->
    <!-- define the time source frequency -->
    <!-- define the receiver coordinates -->
    <AcousticSEM
      name="acousticSolver"
      cflFactor="0.25"
      discretization="FE1"
      targetRegions="{ Region }"
      sourceCoordinates="{ { 505, 505, 305 }}"
      timeSourceFrequency="10.0"
      rickerOrder="2"
      outputSeismoTrace="0"
      dtSeismoTrace="0.0025"
      receiverCoordinates="{ { 505, 505, 805 },
                             { 505, 505, 355 } }"/>
  </Solvers>
  <!-- hexahedral mesh generated internally by GEOSX -->
  <Mesh>
    <InternalMesh
      name="mesh"
      elementTypes="{ C3D8 }"
      xCoords="{ 0, 1000 }"
      yCoords="{ 0, 1000 }"
      zCoords="{ 0, 1000 }"
      nx="{ 100 }"
      ny="{ 100 }"
      nz="{ 100 }"
      cellBlockNames="{ cb }"/>
  </Mesh>
  <Events
    maxTime="0.15">
    <!-- control the timestepping here with forceDt -->
    <PeriodicEvent
      name="solverApplications"
      forceDt="0.0025"
      target="/Solvers/acousticSolver"/>    
    <!-- two events to output pressure in an hdf5 file -->
     <PeriodicEvent
               name="timeHistoryCollection"
      forceDt="0.025"
      target="/Tasks/pressureCollection"/>
    <PeriodicEvent
      name="timeHistoryOutput"
      forceDt="0.025"
      target="/Outputs/timeHistoryOutput"/>    
  </Events>
  <NumericalMethods>
    <FiniteElements>
      <FiniteElementSpace
        name="FE1"
        order="1"/>
    </FiniteElements>
  </NumericalMethods>
  <ElementRegions>
    <CellElementRegion
      name="Region"
      cellBlocks="{ cb }"
      materialList="{ nullModel }"/>
  </ElementRegions>
  <Constitutive>
    <NullModel
      name="nullModel"/>
  </Constitutive>
  <FieldSpecifications>
    <!-- 1) The initial pressure field -->
    <FieldSpecification
      name="initialPressure"
      initialCondition="1"
      setNames="{ all }"
      objectPath="nodeManager"
      fieldName="pressure_n"
      scale="0.0"/>
    <FieldSpecification
      name="initialPressure_nm1"
      initialCondition="1"
      setNames="{ all }"
      objectPath="nodeManager"
      fieldName="pressure_nm1"
      scale="0.0"/>
    <!-- 2) The velocity in the domain -->
    <FieldSpecification
      name="cellVelocity"
      initialCondition="1"
      objectPath="ElementRegions/Region/cb"
      fieldName="mediumVelocity"
      scale="2000"
      setNames="{ all }"/>
  </FieldSpecifications>
  <!-- collect the pressure values at the nodes -->
  <Tasks>
           <PackCollection
      name="pressureCollection"
      objectPath="nodeManager"
      fieldName="pressure_np1"/>
  </Tasks> 
   <Outputs>
    <!-- output the pressure values to a file named pressure_history.hdf5  -->
     <TimeHistory
      name="timeHistoryOutput"
      sources="{ /Tasks/pressureCollection }"
      filename="pressure_history-awe1"/>
  </Outputs> 
</Problem>

Screenshots

These are time slices of the wavefield generated in a constant velocity medium

Running on four nodes

Screen Shot 2022-09-19 at 2 55 09 PM

Running on one node

Screen Shot 2022-09-19 at 2 55 53 PM

Platform (please complete the following information):

wrtobin commented 2 years ago

Please specify the workflow used to generate your plots and post any custom post-processing scripts used.

aguitton commented 2 years ago

Here is the python code for plotting the results.

import numpy as np
import math
import h5py
import scipy.ndimage as nd
import matplotlib.pyplot as plt
from matplotlib import rcParams
import ipywidgets as widgets

f=h5py.File('pressure_history-awe1.hdf5')
p=np.array(f['pressure_np1'])
f.close()

p=np.reshape(p,(60,101,101,101))

%matplotlib inline
d=p
dt=0.0005
ix=51
iy=51
iz=51
vmax=0.05

nt=d.shape[0]
nx=d.shape[1]
ny=d.shape[2]
nz=d.shape[3]
ixmin=15
ixmax=nx-1-ixmin
iymin=15
iymax=ny-1-iymin
izmin=15
izmax=nz-1-izmin

def plotWfld(d,it,output="figure",save=False):
    time=it*dt
    plt.figure(figsize=(16,6))
    plt.subplot(1,3,1)
    plt.imshow(np.transpose(d[it,ix,:,:]),vmin=-vmax,vmax=vmax,cmap='seismic')
    #plt.hlines((izmin,izmax),iymin,iymax,LineStyle='--')
    #plt.vlines((iymin,iymax),izmin,izmax,LineStyle='--')
    plt.xlabel("Y")
    plt.ylabel("Z")
    plt.title(r"X-slice @ %0.2f sec" %time)

    plt.subplot(1,3,2)
    plt.imshow(np.transpose(d[it,:,iy,:]),vmin=-vmax,vmax=vmax,cmap='seismic')
    #plt.hlines((izmin,izmax),ixmin,ixmax,LineStyle='--')
    #plt.vlines((ixmin,ixmax),izmin,izmax,LineStyle='--')
    plt.xlabel("X")
    plt.ylabel("Z")
    plt.title(r"Y-slice @ %0.2f sec" %time)

    plt.subplot(1,3,3)
    plt.imshow(np.transpose(d[it,:,:,iz]),vmin=-vmax,vmax=vmax,cmap='seismic')
    #plt.hlines((iymin,iymax),ixmin,ixmax,LineStyle='--')
    #plt.vlines((ixmin,ixmax),iymin,iymax,LineStyle='--')
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.title(r"Z-slice @ %0.2f sec" %time)

    plt.tight_layout()
    if save==True:
        plt.savefig("./fig/"+output+"_%d" %it)

widgets.interact(plotWfld, d=widgets.fixed(d), it=(0,nt-1,1))
wrtobin commented 2 years ago

The TimeHistory hd5f output isn't implicitly structured based on the mesh, which this post-processing script seems to presuppose. The output file should include multiple data sets including -- when collecting from mesh entities -- coordinate information. This information needs to be used to plot data spatially.

Since the TimeHistory output is focused only on extracting information at the lowest-level possible, no restructuring or post-processing is done internally.

I'm closing this as "Not Planned" for now. If you use the nodal coordinate information and still have any issues with the output in parallel feel free to update / reopen this issue.

TotoGaz commented 2 years ago

@jhuang2601 You used to consider TimeHistory with locations of a fracture. Do you think your XP could help there?

jhuang2601 commented 2 years ago

@aguitton Using the following way, you might be able to get the nodal coordinate information directly from HDF5 file, as @wrtobin mentioned

    f = h5py.File(pressure_history-awe1.hdf5, 'r')
    xl_elm = f.get('pressure_np1 elementCenter')
    xl_elm = np.array(xl_elm)
    xcord_elm = xl_elm[0, :, 0]
    ycord_elm = xl_elm[0, :, 1]
    zcord_elm = xl_elm[0, :, 2]
aguitton commented 2 years ago

Great! Thank you @jhuang2601 and all. Will try that...

aguitton commented 2 years ago

I tried @jhuang2601's suggestion but the hdf5 doesn't contain the coordinates of the elements, only the values of the wavefields at the assigned time steps.

wrtobin commented 2 years ago

Please h5ls the file and provide the names of the datasets in the file. Note if you're collecting a nodal quantity it won't have element centers, only nodal coordinates.

aguitton commented 2 years ago

Got it. I didn't know about h5ls. I have coordinates now. Thank you @wrtobin

aguitton commented 2 years ago

Was able to reorder the array thanks to the coordinates and I get the correct wavefield.