[Open] Sizheng-Ma opened this issue 2 years ago
Can you run the same case in debug mode?
Nothing gets printed.
The submission script:
#!/bin/bash -
#SBATCH -o spectre.stdout
#SBATCH -e spectre.stderr
#SBATCH --ntasks-per-node 32
#SBATCH -J KerrSchild
#SBATCH --nodes 1
#SBATCH -p any
#SBATCH -t 00:10:00
#SBATCH -D .
#SBATCH -A sxs
#SBATCH --mem=92GB
# Replace these paths with the path to your build directory, to the source root
# directory, the spectre dependencies module directory, and to the directory
# where you want the output to appear, i.e. the run directory.
# E.g., if you cloned spectre in your home directory, set
# SPECTRE_BUILD_DIR to ${HOME}/spectre/build. If you want to run in a
# directory called "Run" in the current directory, set
# SPECTRE_RUN_DIR to ${PWD}/Run
export SPECTRE_BUILD_DIR=${HOME}/spectre/build/
export SPECTRE_MODULE_DIR=${HOME}/DEPS_new/modules/
export SPECTRE_HOME=${HOME}/spectre/
export SPECTRE_RUN_DIR=${PWD}/Run
# Choose the executable and input file to run
# To use an input file in the current directory, set
# SPECTRE_INPUT_FILE to ${PWD}/InputFileName.yaml
export SPECTRE_EXECUTABLE=${PWD}/CharacteristicExtract
export SPECTRE_INPUT_FILE=${PWD}/KerrSchildWithCce.yaml
# These commands load the relevant modules and cd into the run directory,
# creating it if it doesn't exist
source ${SPECTRE_HOME}/support/Environments/caltech_hpc_gcc.sh
module use ${SPECTRE_MODULE_DIR}
spectre_load_modules
module list
mkdir -p ${SPECTRE_RUN_DIR}
cd ${SPECTRE_RUN_DIR}
# Copy the input file into the run directory, to preserve it
cp ${SPECTRE_INPUT_FILE} ${SPECTRE_RUN_DIR}/
# Set desired permissions for files created with this script
umask 0022
# Set the path to include the build directory's bin directory
export PATH=${SPECTRE_BUILD_DIR}/bin:$PATH
# Generate the nodefile
echo "Running on the following nodes:"
echo ${SLURM_NODELIST}
touch nodelist.$SLURM_JOBID
for node in $(scontrol show hostnames ${SLURM_NODELIST}); do
echo "host ${node}" >> nodelist.$SLURM_JOBID
done
WORKER_THREADS_PER_NODE=$((SLURM_NTASKS_PER_NODE - 1))
WORKER_THREADS=$((SLURM_NPROCS - SLURM_NNODES))
SPECTRE_COMMAND="${SPECTRE_EXECUTABLE} +isomalloc_sync ++np ${SLURM_NNODES} \
++p ${WORKER_THREADS} ++ppn ${WORKER_THREADS_PER_NODE} \
++nodelist nodelist.${SLURM_JOBID}"
# When invoking through `charmrun`, charm will initiate remote sessions which
# will wipe out environment settings unless it is forced to re-initialize the
# spectre environment between the start of the remote session and starting the
# spectre executable
echo "#!/bin/sh
source ${SPECTRE_HOME}/support/Environments/caltech_hpc_gcc.sh
module use ${SPECTRE_MODULE_DIR}
spectre_load_modules
\"\$@\"
" > runscript
chmod u+x ./runscript
charmrun ++runscript ./runscript ${SPECTRE_COMMAND} \
--input-file ${SPECTRE_INPUT_FILE}
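For reference, the worker-thread arithmetic in the script above can be sketched in isolation. The SLURM variable values below are assumptions derived from this job's #SBATCH directives (--nodes 1, --ntasks-per-node 32); one thread per node is reserved for the Charm++ communication thread:

```shell
# Values SLURM would export for this job (assumption based on the
# #SBATCH directives above: --nodes 1, --ntasks-per-node 32)
SLURM_NNODES=1
SLURM_NTASKS_PER_NODE=32
SLURM_NPROCS=32  # nodes * tasks-per-node

# Reserve one thread per node for the Charm++ communication thread
WORKER_THREADS_PER_NODE=$((SLURM_NTASKS_PER_NODE - 1))
WORKER_THREADS=$((SLURM_NPROCS - SLURM_NNODES))

echo "++np ${SLURM_NNODES} ++p ${WORKER_THREADS} ++ppn ${WORKER_THREADS_PER_NODE}"
# Prints: ++np 1 ++p 31 ++ppn 31
```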
The input file:
Evolution:
  # InitialTime: 0.0
  InitialTimeStep: 0.1
  InitialSlabSize: 0.1
  # TimeStepper: RungeKutta3
  #   AdamsBashforthN:
  #     Order: 3

Observers:
  VolumeFileName: "GhKerrSchildVolume"
  ReductionFileName: "GhKerrSchildReductions"

Cce:
  Evolution:
    TimeStepper:
      AdamsBashforthN:
        Order: 3
    StepChoosers:
      - Constant: 1.0
      - Increase:
          Factor: 2
      - ErrorControl(SwshVars):
          AbsoluteTolerance: 1e-8
          RelativeTolerance: 1e-6
          MaxFactor: 2
          MinFactor: 0.25
          SafetyFactor: 0.9
      - ErrorControl(CoordVars):
          AbsoluteTolerance: 1e-8
          RelativeTolerance: 1e-7
          MaxFactor: 2
          MinFactor: 0.25
          SafetyFactor: 0.9
    StepController:
      BinaryFraction
  StartTime: Auto
  EndTime: Auto
  BoundaryDataFilename: "/central/groups/sxs/sma/cce_bh/no_id_new/test/c2/BondiCceR0198.h5"
  LMax: 14
  ExtractionRadius: Auto
  NumberOfRadialPoints: 39
  ObservationLMax: 4
  InitializeJ:
    InverseCubic
  Filtering:
    RadialFilterHalfPower: 24
    RadialFilterAlpha: 35.0
    FilterLMax: 12
  ScriInterpOrder: 5
  ScriOutputDensity: 1
  H5Interpolator:
    BarycentricRationalSpanInterpolator:
      MinOrder: 10
      MaxOrder: 10
  H5LookaheadTimes: 200
  H5IsBondiData: True
  FixSpecNormalization: False
Note that the run also needs worldtube data, which can be found at /central/groups/sxs/sma/cce_bh/no_id_new/test/c2/BondiCceR0198.h5; it corresponds to a static Schwarzschild BH generated by SpEC.
The SpECTRE branch I'm using can be found at https://github.com/Sizheng-Ma/spectre/tree/cce_gh_executable_gh_gts
Okay, I find that CharacteristicExtract runs very quickly on one core and slows down significantly on more than one core. The problem is exacerbated as the radial resolution is increased. @moxcodes did note in his tutorial that CCE does not scale beyond 4 cores, but going from one to two cores already causes a significant slowdown for larger values of NumberOfRadialPoints.
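The one-core vs. multi-core comparison could be reproduced with a small sweep over worker-thread counts. This is a hypothetical sketch: the executable and input-file names are the ones used in this issue, the ++p flag mirrors the charmrun invocation in the submission script, and the commands are echoed rather than executed since they only run on the cluster:

```shell
# Hypothetical sweep over worker-thread counts to probe the slowdown.
# Paths and flags are assumptions taken from the script/issue above;
# each command is printed so it can be submitted (or timed) by hand.
for cores in 1 2 4; do
  echo "timeout 600 ./CharacteristicExtract ++p ${cores} --input-file ./KerrSchildWithCce.yaml"
done
```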
@nilsdeppe is cleaning up a branch to do profiling so we can investigate this further
Bug reports:
Expected behavior:
Current behavior:
This issue is related to #3782. I'm trying to run the
CharacteristicExtract
executable on CaltechHPC. The CCE grid is as follows. Running on a single node, the system proceeds to 55M within 10 minutes. However, it suddenly slows down by a factor of 5 after I add one more radial grid point. When there are more than 32 radial points, the system doesn't evolve at all. I haven't seen the same issue on other machines.
I've tested a few cases with radial point counts 33, 32, 31, 30, 29, 28, 23, and 18. Within 10 minutes, they proceed 0M, 6.1M, 9.1M, 5.5M, 11M, 55M, 59M, and 62M, respectively.
Environment:
Using all modules in
caltech_hpc_gcc.sh
Detailed discussion: