Open JessicaMeixner-NOAA opened 1 year ago
Hi All,
There is a new SCOTCH version. Who will update it? I think we should use the latest for the initial commit to develop.
Cheers
Aron
From: Jessica Meixner @.> Sent: Tuesday, 14 February 2023 23:39 To: NOAA-EMC/WW3 @.> Cc: Subscribed @.***> Subject: [NOAA-EMC/WW3] Pthreads + SCOTCH (Issue #891)
When building SCOTCH with pthreads on Orion, the model will build but WW3 will then fail (see details here: #885 https://github.com/NOAA-EMC/WW3/issues/885 ). Turning off pthreads solves the problem, so for now we are not using pthreads. This issue tracks the problem so that we can eventually return to it and see whether pthreads can be turned back on.
There is no expectation that this will be resolved soon; this issue exists only for tracking purposes.
Hi @JessicaMeixner-NOAA, @MatthewMasarik-NOAA
It occurred to me that the environment settings of your HPCF may differ in some way. Did you compare ulimit -a on both machines? There are also quite a few more settings linked to MPI, and maybe even to the threading behavior. Maybe you can check with @thesser1 ...
Hi @aronroland,
I definitely agree. @JessicaMeixner-NOAA and I have previously tagged @aliabdolali and @thesser1 regarding their job card settings. Ali showed me ccmake, which I'm currently using to verify the cmake flags passed. Any environment settings you guys have found useful would be a great help to compare against our job cards.
I am running with the following ulimit -a:
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 513915
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 16384
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
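Since the suggestion is to compare ulimit -a between machines, here is a minimal sketch of how two saved outputs could be diffed. The function names (parse_ulimit, diff_ulimits) and the second machine's values are hypothetical and made up for illustration; only the line layout is taken from the output above.

```python
import re

# Matches lines such as "open files (-n) 16384" or
# "stack size (kbytes, -s) unlimited" as printed by `ulimit -a`.
LINE_RE = re.compile(
    r"^(?P<name>.+?)\s*\((?:(?P<unit>[^,()]+),\s*)?(?P<flag>-\w)\)\s+(?P<value>\S+)\s*$"
)


def parse_ulimit(text):
    """Return {flag: (description, value)} for every recognised line."""
    limits = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            limits[match.group("flag")] = (
                match.group("name").strip(),
                match.group("value"),
            )
    return limits


def diff_ulimits(text_a, text_b):
    """Return {flag: (description, value_a, value_b)} where the values differ."""
    a, b = parse_ulimit(text_a), parse_ulimit(text_b)
    diffs = {}
    for flag in sorted(set(a) | set(b)):
        value_a = a.get(flag, (None, None))[1]
        value_b = b.get(flag, (None, None))[1]
        if value_a != value_b:
            name = (a.get(flag) or b.get(flag))[0]
            diffs[flag] = (name, value_a, value_b)
    return diffs


# Three lines copied from the output above.
machine_a = """\
open files (-n) 16384
stack size (kbytes, -s) unlimited
max user processes (-u) 4096
"""

# Hypothetical values for a second machine, invented for this example.
machine_b = """\
open files (-n) 1024
stack size (kbytes, -s) unlimited
max user processes (-u) 4096
"""

for flag, (name, value_a, value_b) in diff_ulimits(machine_a, machine_b).items():
    print(f"{flag} {name}: {value_a} vs {value_b}")
```

In practice the two strings would come from files written by ulimit -a inside each job card, so the limits reflect the batch environment rather than the login shell.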
Thank you @thesser1! This is very helpful. The ulimit -a you and @aronroland mentioned is new to me; I'll include that in our job card.
Here is the ccmake output for a SCOTCH build.
Are you guys setting INTSIZE?
BUILD_LIBESMUMPS ON
BUILD_LIBSCOTCHMETIS ON
BUILD_PTSCOTCH ON
CMAKE_BUILD_TYPE Release
CMAKE_INSTALL_PREFIX /p/work2/thesser1/code_management/tools/scotch_test/install/scotch-v7.0.3
INCLUDE_INSTALL_DIR include/
INSTALL_METIS_HEADERS ON
INTSIZE
LIBRARY_INSTALL_DIR lib/
MPI_THREAD_MULTIPLE ON
THREADS ON
USE_BZ2 ON
USE_LZMA ON
USE_ZLIB ON
But it looks like threads are on with mine. This is the build straight from the NOAA SCOTCH build shell from the other day.
Yes, that's very interesting. I thought your threads would be off. I found that supplying those flags to the cmake call was not actually overriding the settings in scotch/CMakeLists.txt, and I had to edit that file to get the threads to turn off. I've got a lot to follow up on. Greatly appreciate it, Ty.
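One way to confirm whether options passed on the cmake command line actually took effect is to read them back from the build directory's CMakeCache.txt, where entries are stored as KEY:TYPE=VALUE. The following is a minimal sketch; the helper names (read_cmake_cache, check_options) are hypothetical, and the entry types shown in the sample (BOOL, STRING) are assumptions rather than values taken from a real SCOTCH build.

```python
def read_cmake_cache(text):
    """Parse CMakeCache.txt content into {key: value}, ignoring comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # CMakeCache.txt uses "#" and "//" for comment lines.
        if not line or line.startswith(("#", "//")):
            continue
        key_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        key = key_and_type.partition(":")[0]
        entries[key] = value
    return entries


def check_options(cache_text, expected):
    """Return {key: (wanted, found)} for every option that does not match."""
    cache = read_cmake_cache(cache_text)
    return {
        key: (wanted, cache.get(key))
        for key, wanted in expected.items()
        if cache.get(key) != wanted
    }


# Sample cache content using option names from the ccmake listing above.
# The ":BOOL" and ":STRING" types are assumed for illustration.
sample_cache = """\
// Hypothetical comment line
THREADS:BOOL=ON
MPI_THREAD_MULTIPLE:BOOL=ON
INTSIZE:STRING=
CMAKE_BUILD_TYPE:STRING=Release
"""

# Suppose the intent was to build with threads turned off.
wanted = {"THREADS": "OFF", "MPI_THREAD_MULTIPLE": "OFF"}
for key, (want, found) in check_options(sample_cache, wanted).items():
    print(f"{key}: wanted {want}, cache has {found}")
```

Running this against the real cache file after configuring would show whether the command-line flags were overridden, without opening ccmake.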
From the Scotch manual, this might be helpful:
to create distributed graphs in parallel. Since this task involves
concurrent MPI communications, the MPI library must support the
MPI_THREAD_MULTIPLE level. In order to take advantage of these
features, the "-DSCOTCH_PTHREAD_MPI" flag must be set, in addition
to the "-DSCOTCH_PTHREAD" flag. These two flags are completely
independent from the "-DCOMMON_PTHREAD_FILE" flag, which can be
set independently from the others.
Note that if you compile Scotch with the "-DSCOTCH_PTHREAD_MPI"
flag, you will have to initialize your communication subsystem by
using the MPI_Init_thread() MPI call instead of MPI_Init(), and
the provided thread support level value returned by the routine
must be checked carefully to assert it is indeed
MPI_THREAD_MULTIPLE.
Note also that since PT-Scotch calls Scotch routines when
operating on a single process, setting "-DSCOTCH_PTHREAD" but not
"-DSCOTCH_PTHREAD_MPI" will still allow multiple threads to be
used on each MPI process, without interfering with MPI itself. In
this case, the MPI thread level MPI_THREAD_FUNNELED will be
sufficient.
The compilation flags used to manage threads are the following:
- "-DSCOTCH_PTHREAD" is mandatory to enable multi-threaded
algorithms in Scotch and/or PT-Scotch. It has to be used in
conjunction with the "-DCOMMON_PTHREAD" flag that enables thread
management at the lower levels of the Scotch implementation.
- "-DSCOTCH_PTHREAD_MPI" enables some algorithms of PT-Scotch that
may make concurrent calls to the MPI communication subsystem. It
has to be used in conjunction with the "-DSCOTCH_PTHREAD" flag
(hence also with the "-DCOMMON_PTHREAD" flag). Alternately, the
compilation flag "-DSCOTCH_MPI_ASYNC_COLL" can be used to replace
threaded synchronous communication routines by non-threaded
asynchronous communication routines.
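The flag dependencies in the excerpt above can be written down as a small check. This is only a sketch of the rules as quoted here, not part of Scotch; the function name check_thread_flags is hypothetical, and returning None when no thread flag is set is an assumption, because the excerpt does not cover that case.

```python
def check_thread_flags(flags):
    """Return (required MPI thread level, list of dependency problems).

    `flags` holds macro names without the "-D" prefix,
    for example {"COMMON_PTHREAD", "SCOTCH_PTHREAD"}.
    """
    flags = set(flags)
    problems = []

    # SCOTCH_PTHREAD needs the lower-level thread management flag.
    if "SCOTCH_PTHREAD" in flags and "COMMON_PTHREAD" not in flags:
        problems.append("SCOTCH_PTHREAD requires COMMON_PTHREAD")

    # SCOTCH_PTHREAD_MPI must be set in addition to SCOTCH_PTHREAD.
    if "SCOTCH_PTHREAD_MPI" in flags and "SCOTCH_PTHREAD" not in flags:
        problems.append("SCOTCH_PTHREAD_MPI requires SCOTCH_PTHREAD")

    if "SCOTCH_PTHREAD_MPI" in flags:
        # Concurrent MPI calls: MPI_Init_thread() must provide this level.
        level = "MPI_THREAD_MULTIPLE"
    elif "SCOTCH_PTHREAD" in flags:
        # Threads are used within each process without touching MPI.
        level = "MPI_THREAD_FUNNELED"
    else:
        # Assumption: the excerpt states no requirement for this case.
        level = None
    return level, problems


level, problems = check_thread_flags({"COMMON_PTHREAD", "SCOTCH_PTHREAD"})
print(level, problems)
```

The point for this issue is that a build with SCOTCH_PTHREAD_MPI only works if the calling program requests MPI_THREAD_MULTIPLE through MPI_Init_thread() and checks the level it is actually given.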
Thanks @aliabdolali, I've been following the PT-Scotch user manual closely.
@thesser1, @aliabdolali, or @aronroland, are you guys compiling with Intel? If so, do you know if it's Intel64?
Yes, it is Intel, and yes, it is Intel64.
Okay. Thanks, Ty.