Open BenWibking opened 3 months ago
Did you configure hypre with --enable-mixedint? If not, the problem will be too big for 32-bit integers.
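A rough check of the size claim, as a sketch only: it uses the values from the reported run (-n 200 200 200 -P 8 8 8) and compares the global row count against the signed 32-bit limit. The helper name `global_rows` and the second, smaller process grid (8 x 8 x 4) are hypothetical and chosen purely for illustration; they do not come from the report. This is arithmetic, not a diagnosis of the segmentation fault.

```python
# Largest value a signed 32-bit integer can hold.
INT32_MAX = 2**31 - 1


def global_rows(n, p):
    """Global unknowns: per-rank grid size n times process grid p, per axis.

    Hypothetical helper for illustration; not part of AMG2023 or hypre.
    """
    nx, ny, nz = (ni * pi for ni, pi in zip(n, p))
    return nx * ny * nz


# Values taken from the reported run: -n 200 200 200 -P 8 8 8
rows = global_rows((200, 200, 200), (8, 8, 8))

# Upper bound on nonzeros for a 27-point stencil
# (rows on the domain boundary have fewer than 27 entries).
nnz_bound = 27 * rows

# Assumed smaller run for comparison (NOT from the report): half the ranks.
rows_half = global_rows((200, 200, 200), (8, 8, 4))

print(f"global rows (8x8x8)  = {rows:,}")
print(f"27-pt nnz bound      = {nnz_bound:,}")
print(f"global rows (8x8x4)  = {rows_half:,}")
print(f"INT32_MAX            = {INT32_MAX:,}")
print(f"8x8x8 rows fit in int32: {rows <= INT32_MAX}")
print(f"8x8x4 rows fit in int32: {rows_half <= INT32_MAX}")
```

With these numbers the 1600 x 1600 x 1600 grid has about 4.1 billion rows, which exceeds the 32-bit limit, while the assumed half-size grid stays just under it. Any global index that ends up in a 32-bit integer would therefore overflow only in the larger run.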
From: Ben Wibking
Sent: Thursday, July 25, 2024 12:40 PM
To: LLNL/AMG2023
Subject: [LLNL/AMG2023] segmentation fault with >= 64 nodes on Frontier (Issue #13)
I can run problem 1 successfully on Frontier with < 64 nodes, but I get a segmentation fault with >= 64 nodes:
Running with these driver parameters:
Problem ID = 1
=============================================
Hypre init times:
=============================================
Hypre init:
wall clock time = 0.000006 seconds
Laplacian_27pt:
(Nx, Ny, Nz) = (1600, 1600, 1600)
(Px, Py, Pz) = (8, 8, 8)
srun: error: frontier04522: tasks 282-287: Segmentation fault
srun: Terminating StepId=2131722.0
with Segmentation fault errors reported for all of the other MPI ranks as well.
I built Hypre v2.31.0 with:
./configure --with-hip --with-gpu-arch=gfx90a --with-MPI-lib-dirs="${MPICH_DIR}/lib" --with-MPI-libs="mpi" --with-MPI-include="${MPICH_DIR}/include" --enable-mixedint
with cce/17.0.0, rocm/5.7.1, and cray-mpich/8.1.28.
I'm running the problem with:
srun ./amg -problem 1 -n 200 200 200 -P 8 8 8
I configured it with --enable-mixedint:
./configure --with-hip --with-gpu-arch=gfx90a --with-MPI-lib-dirs="${MPICH_DIR}/lib" --with-MPI-libs="mpi" --with-MPI-include="${MPICH_DIR}/include" --enable-mixedint
Thank you for reporting this issue. I will take a look and get back to you soon.
Is there an update on this? I am still seeing this issue on Frontier.