underworldcode / underworld3

https://underworldcode.github.io/underworld3/

Spherical benchmarks petsc error: Unsupported polytope type tetrahedron #174

Closed: gthyagi closed this issue 7 months ago

gthyagi commented 8 months ago

Hi @knepley,

I am trying to run the Spherical Benchmarks: Isoviscous Incompressible Stokes models from the benchmark paper. I was able to run these models up to cellsize=1/32 on 64 CPUs. However, if I run the same job with cellsize=1/64 on 528 CPUs, the following error occurs:

[331]PETSC ERROR: Unsupported polytope type tetrahedron
[331]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[331]PETSC ERROR:   Option left: name:-stokes_fieldsplit_pressure_pc_gasm_type value: basic source: code
[331]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
[331]PETSC ERROR: Petsc Development GIT revision: 58e802b8d5ef8f9797144ffefeee74e29253ae4a  GIT Date: 2024-03-17 16:41:22 +0000
[331]PETSC ERROR: /home/565/tg7098/uw3_envs_models/Spherical_Benchmark_Kramer/Run_Convergence_Scripts/Ex_Stokes_Spherical_Benchmark_Kramer_RCS.py on a arch-linux-c-opt named gadi-cpu-clx-2246.gadi.nci.org.au by tg7098 Mon Mar 25 16:35:04 2024
[331]PETSC ERROR: Configure options --with-debugging=0 --COPTFLAGS="-g -O3" --CXXOPTFLAGS="-g -O3" --FOPTFLAGS="-g -O3" --with-petsc4py=1 --with-zlib=1 --with-shared-libraries=1 --with-cxx-dialect=C++11 --with-make-np=4 --with-hdf5-dir=/apps/hdf5/1.12.2p --download-mumps=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-hypre=1 --download-scalapack=1 --download-superlu_dist=1 --download-pragmatic=1 --download-ctetgen --download-eigen --download-superlu=1 --download-triangle --useThreads=0
[331]PETSC ERROR: #1 DMPlexRefineRegularGetAffineTransforms() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/impls/plex/transform/impls/refine/regular/plexrefregular.c:253
[331]PETSC ERROR: #2 PetscFESetUp_Composite() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/dt/fe/impls/composite/fecomposite.c:36
[331]PETSC ERROR: #3 PetscFESetUp() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/dt/fe/interface/fe.c:275
[331]PETSC ERROR: #4 PetscFERefine() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/dt/fe/interface/fe.c:1803
[331]PETSC ERROR: #5 DMPlexComputeInterpolatorNested() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/impls/plex/plexfem.c:2738
[331]PETSC ERROR: #6 DMCreateInterpolation_Plex() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/impls/plex/plex.c:9899
[331]PETSC ERROR: #7 DMCreateInterpolation() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/dm/interface/dm.c:1214
[331]PETSC ERROR: #8 PCSetUp_MG() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/pc/impls/mg/mg.c:996
[331]PETSC ERROR: #9 PCSetUp() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/pc/interface/precon.c:1079
[331]PETSC ERROR: #10 KSPSetUp() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/interface/itfunc.c:415
[331]PETSC ERROR: #11 KSPSolve_Private() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/interface/itfunc.c:831
[331]PETSC ERROR: #12 KSPSolve() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/interface/itfunc.c:1078
[331]PETSC ERROR: #13 PCApply_FieldSplit_Schur() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c:1203
[331]PETSC ERROR: #14 PCApply() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/pc/interface/precon.c:497
[331]PETSC ERROR: #15 KSP_PCApply() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/include/petsc/private/kspimpl.h:409
[331]PETSC ERROR: #16 KSPFGMRESCycle() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:123
[331]PETSC ERROR: #17 KSPSolve_FGMRES() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/impls/gmres/fgmres/fgmres.c:235
[331]PETSC ERROR: #18 KSPSolve_Private() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/interface/itfunc.c:905
[331]PETSC ERROR: #19 KSPSolve() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/ksp/ksp/interface/itfunc.c:1078
[331]PETSC ERROR: #20 SNESSolve_NEWTONLS() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/snes/impls/ls/ls.c:220
[326]PETSC ERROR: #21 SNESSolve() at /scratch/m18/tg7098/v_envs/venv_uw3_dev_18_3_24/petsc/src/snes/interface/snes.c:4733

Here are the input and log files: Ex_Stokes_Spherical_Benchmark_Kramer_RCS.py.txt, jobscript_m18.sh.e111716764.txt, jobscript_m18.sh.o111716764.txt
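For context, the driver script sets the problem up roughly as follows. This is only a rough sketch: the constructor and parameter names are assumptions based on the underworld3 examples, and the attached Ex_Stokes_Spherical_Benchmark_Kramer_RCS.py is the authoritative version.

import underworld3 as uw

# Rough sketch of the benchmark setup; names and values below are assumptions,
# the attached script is authoritative.
cellsize = 1 / 64          # 1/32 on 64 CPUs runs fine; 1/64 on 528 CPUs fails

mesh = uw.meshing.SphericalShell(radiusInner=0.55, radiusOuter=1.0, cellSize=cellsize)

v = uw.discretisation.MeshVariable("V", mesh, mesh.dim, degree=2)   # velocity (P2)
p = uw.discretisation.MeshVariable("P", mesh, 1, degree=1)          # pressure (P1)

stokes = uw.systems.Stokes(mesh, velocityField=v, pressureField=p)
stokes.constitutive_model = uw.constitutive_models.ViscousFlowModel
stokes.constitutive_model.Parameters.shear_viscosity_0 = 1

# Solver options are set from code (hence "source: code" in the warning above),
# e.g. the pressure-block option flagged as unused after the crash:
stokes.petsc_options["fieldsplit_pressure_pc_gasm_type"] = "basic"

stokes.solve()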

knepley commented 8 months ago

This is my fault. Fixing it now.

knepley commented 8 months ago

Here is the fix. Once I make a test, I will get the MR merged.

https://gitlab.com/petsc/petsc/-/merge_requests/7409

gthyagi commented 8 months ago

@knepley I got this error with the latest main branch of PETSc.

[288]PETSC ERROR: ------------------------------------------------------------------------
[288]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[288]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[288]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[288]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
[288]PETSC ERROR: to get more information on the crash.
[288]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
(ranks 4, 96, 160, 224 and 800 report the same SEGV message; their interleaved output is omitted here)
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 288 in communicator MPI_COMM_WORLD
with errorcode 59.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[gadi-cpu-clx-0154.gadi.nci.org.au:06772] PMIX ERROR: UNREACHABLE in file /jobfs/53639599.gadi-pbs/0/openmpi/4.1.4/source/openmpi-4.1.4/opal/mca/pmix/pmix3x/pmix/src/server/pmix_server.c at line 2198
[gadi-cpu-clx-0154.gadi.nci.org.au:06772] 5 more processes have sent help message help-mpi-api.txt / mpi-abort
[gadi-cpu-clx-0154.gadi.nci.org.au:06772] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

I will recompile with --with-debugging=yes and try to see what is happening.
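For reference, that amounts to rerunning the configure line shown in the traceback above with --with-debugging=yes in place of --with-debugging=0 (and typically without the -O3 COPTFLAGS/CXXOPTFLAGS/FOPTFLAGS), keeping the remaining options unchanged:

./configure --with-debugging=yes --with-petsc4py=1 --with-zlib=1 --with-shared-libraries=1 --with-cxx-dialect=C++11 --with-make-np=4 --with-hdf5-dir=/apps/hdf5/1.12.2p --download-mumps=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-hypre=1 --download-scalapack=1 --download-superlu_dist=1 --download-pragmatic=1 --download-ctetgen --download-eigen --download-triangle --useThreads=0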

gthyagi commented 8 months ago

I recompiled with --with-debugging=yes and now the following error occurs. The log file is attached below.

[78]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[78]PETSC ERROR: Object is in wrong state
[78]PETSC ERROR: Difference in cached 2 norms: local 71184.9

jobscript_m18.sh.e111886044.txt

gthyagi commented 7 months ago

Fixed in PETSc 3.21.0.