BerkeleyLab / caffeine

A parallel runtime library for Fortran compilers
https://berkeleylab.github.io/caffeine/

Add Caffeine's own custom assert subroutine #76

Closed (rouson closed this 6 months ago)

bonachea commented 7 months ago

As a motivating example, below is the diagnostic output that the UPC++ assertion facility produces on an assertion failure, with GASNet's auto-backtrace support enabled.

The caller's invocation is simply the following line (where the variable x actually has the value 3):

UPCXX_ASSERT(x > 10);

The resulting diagnostic failure output:

*** FATAL ERROR (proc 0): 
//////////////////////////////////////////////////////////////////////
UPC++ assertion failure:
 on process 0 (pcp-d-5)
 at assert2.cpp:5
 in function: void bar(int)

Failed condition: x > 10

To have UPC++ freeze during these errors so you can attach a debugger,
rerun the program with GASNET_FREEZE_ON_ERROR=1 in the environment.
//////////////////////////////////////////////////////////////////////

*** Details for bug reporting (proc 0): config=RELEASE=2023.9.0,SPEC=1.20,PTR=64bit,debug,SEQ,timers_native,membars_native,atomics_native,atomic32_native,atomic64_native compiler=GNU/13.2.0 sys=x86_64-pc-linux-gnu
[0] Invoking GDB for backtrace...
[0] /usr/local/pkg/gdb/newest/bin/gdb -nx -batch -x /tmp/gasnet_tebx6e '/home/pcp1/bonachea/UPC/code/a.out' 26879
[0] [Thread debugging using libthread_db enabled]
[0] Using host libthread_db library "/usr/lib64/libthread_db.so.1".
[0] 0x00007f81b75a0a3c in waitpid () from /usr/lib64/libc.so.6
[0] #0  0x00007f81b75a0a3c in waitpid () from /usr/lib64/libc.so.6
[0] #1  0x00007f81b751ede2 in do_system () from /usr/lib64/libc.so.6
[0] #2  0x00000000004f4a92 in gasneti_system_redirected (cmd=0xaa0de0 <cmd> "/usr/local/pkg/gdb/newest/bin/gdb -nx -batch -x /tmp/gasnet_tebx6e '/home/pcp1/bonachea/UPC/code/a.out' 26879", stdout_fd=3) at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/bld/GASNet-stable/gasnet_tools.c:1679
[0] #3  0x00000000004f5254 in gasneti_bt_gdb (fd=3) at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/bld/GASNet-stable/gasnet_tools.c:1944
[0] #4  0x00000000004f5a15 in gasneti_print_backtrace (fd=2) at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/bld/GASNet-stable/gasnet_tools.c:2228
[0] #5  0x00000000004f5fe9 in _gasneti_print_backtrace_ifenabled (fd=2) at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/bld/GASNet-stable/gasnet_tools.c:2359
[0] #6  0x00000000004f3415 in gasneti_error_abort () at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/bld/GASNet-stable/gasnet_tools.c:807
[0] #7  0x00000000004f3686 in gasneti_fatalerror_nopos (msg=0x7987cd "\n%s") at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/bld/GASNet-stable/gasnet_tools.c:851
[0] #8  0x00000000004734bb in upcxx::detail::fatal_error (msg=0xcb5530 "Failed condition: x > 10", title=0x7987d1 "assertion failure", func=0x77cfcc "void bar(int)", file=0x77cfc0 "assert2.cpp", line=5) at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/src/./diagnostic.cpp:63
[0] #9  0x00000000004734f1 in upcxx::detail::assert_failed (func=0x77cfcc "void bar(int)", file=0x77cfc0 "assert2.cpp", line=5, msg=0xcb5530 "Failed condition: x > 10") at /tmp/upcxx-nightly-2024.03.13-dirac-gcc/bld/upcxx_install/berkeleylab-upcxx-develop/src/./diagnostic.cpp:76
[0] #10 0x00000000004058aa in upcxx::detail::assert_failed (func=0x77cfcc "void bar(int)", file=0x77cfc0 "assert2.cpp", line=5, str=...) at /usr/local/pkg/upcxx-dirac/gcc-13.2.0/nightly-2024.03.13/include/upcxx/diagnostic.hpp:22
[0] #11 0x00000000004057cc in bar (x=3) at assert2.cpp:5
[0] #12 0x0000000000405844 in foo (x=3) at assert2.cpp:9
[0] #13 0x0000000000405865 in main () at assert2.cpp:16
[0] [Inferior 1 (process 26879) detached]
*** Caught a fatal signal (proc 0): SIGABRT(6)
*** Details for bug reporting (proc 0): config=RELEASE=2023.9.0,SPEC=1.20,PTR=64bit,debug,SEQ,timers_native,membars_native,atomics_native,atomic32_native,atomic64_native compiler=GNU/13.2.0 sys=x86_64-pc-linux-gnu
Abort
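
For readers unfamiliar with how a facility like this captures the failed condition text, file, line, and function name, here is a minimal sketch of a preprocessor-based assertion macro. It is not the UPC++ implementation: `MY_ASSERT` and `format_assert_failure` are hypothetical names invented for illustration, and the output format only loosely imitates the report above. The sketch uses the standard `__func__`, which yields just `bar`; the full signature `void bar(int)` seen in the report would need a compiler-specific extension.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of UPC++): assembles the diagnostic text
// from the pieces that the macro below captures at the call site.
std::string format_assert_failure(const char *cond, const char *file,
                                  int line, const char *func) {
    return std::string("Assertion failure:\n") +
           " at " + file + ":" + std::to_string(line) + "\n" +
           " in function: " + func + "\n\n" +
           "Failed condition: " + cond + "\n";
}

// Hypothetical macro: #cond stringizes the expression, while __FILE__,
// __LINE__ and __func__ are expanded where the macro is invoked, so the
// report points at the caller rather than at the assertion library.
#define MY_ASSERT(cond)                                                      \
    do {                                                                     \
        if (!(cond)) {                                                       \
            std::fputs(format_assert_failure(#cond, __FILE__, __LINE__,      \
                                             __func__).c_str(), stderr);     \
            std::abort();                                                    \
        }                                                                    \
    } while (0)

// Mirrors the call site in the example above: aborts with a report
// when x is not greater than 10, and does nothing otherwise.
void bar(int x) {
    MY_ASSERT(x > 10);
}
```

The key point is that all four pieces of call-site information come from the preprocessor, so the caller writes only the condition itself.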

rouson commented 7 months ago

@bonachea these are great suggestions and I'm supportive of the approach you're recommending. @ktras and I just discussed this PR and agreed that a live discussion will help with deciding a path forward so I'll respond in more detail during our next call. For now, here are a few things to consider:

  1. Strictly speaking, we're not guaranteed that every Fortran compiler invokes the C preprocessor. In the case of NAG, for example, the relevant flag is named -fpp for Fortran preprocessor. I don't know what differences, if any, there are between NAG's Fortran preprocessor and the C preprocessor.
  2. There is an upcoming Fortran preprocessor in the works, so we might need to keep an eye on that for compatibility with any solution we choose.
  3. Will the proposed approach work on Windows? I ask because of your mention of POSIX. Maybe this is a moot point though because Caffeine uses GASNet-EX, which understandably doesn't support Windows. I'm sure the LLVM flang user community will eventually want support for Windows, but we could defer that to some other non-Caffeine solution that someone else implements.