p4lang / p4c

P4_16 reference compiler
https://p4.org/
Apache License 2.0

irgenerator crash during build #3989

Open outpaddling opened 1 year ago

outpaddling commented 1 year ago

This is not a high priority, as I'm just tinkering with the idea of creating a FreeBSD port out of sheer curiosity. A colleague is working with P4, so I decided to check out the build process. I'm reporting this in the hope that it might help you improve the code. If you have any ideas about what might be causing the segfault, I'd be happy to test fixes. I tried building with gcc12 instead of clang and tweaked optimization levels as a shot in the dark, with no difference.

Tail of the build output.

[ 13% 49/353] : && /usr/bin/c++ -pipe -g -fstack-protector-strong -fno-strict-aliasing -fdiagnostics-color=never  -Wall -Wextra -Wno-overloaded-virtual -Wno-deprecated -Wno-deprecated-declarations -pedantic -Wno-gnu-zero-variadic-macro-arguments -pipe -g -fstack-protector-strong -fno-strict-aliasing  -Wall -Wextra -Wno-overloaded-virtual -Wno-deprecated -Wno-deprecated-declarations -pedantic -Wno-gnu-zero-variadic-macro-arguments -lexecinfo -fstack-protector-strong -fuse-ld=lld tools/ir-generator/CMakeFiles/irgenerator.dir/generator.cpp.o tools/ir-generator/CMakeFiles/irgenerator.dir/irclass.cpp.o tools/ir-generator/CMakeFiles/irgenerator.dir/methods.cpp.o tools/ir-generator/CMakeFiles/irgenerator.dir/type.cpp.o tools/ir-generator/CMakeFiles/irgenerator.dir/ir-generator.cpp.o -o tools/ir-generator/irgenerator  -Wl,-rpath,/usr/local/lib  lib/libp4ctoolkit.a  -pthread  /usr/local/lib/libboost_iostreams.so  /usr/local/lib/libboost_regex.so  /usr/local/lib/libgc.so  /usr/local/lib/libgccpp.so  -lrt  -pthread && :
[ 14% 50/353] cd /usr/ports/wip/p4c/work/.build && /usr/ports/wip/p4c/work/.build/tools/ir-generator/irgenerator -i ir/ir-generated.cpp.tmp -o ir/ir-generated.h.tmp -t ir/gen-tree-macro.h.tmp /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/base.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/type.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/expression.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/ir.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/v1.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/backends/dpdk/dpdk.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/frontends/p4-14/ir-v1.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/backends/bmv2/bmv2.def && awk -v name=ir-generated.cpp -f /usr/ports/wip/p4c/work/.build/irgen-fixup.awk ir/ir-generated.cpp.tmp > ir/ir-generated.cpp.fixup && /usr/local/bin/cmake -E copy_if_different ir/ir-generated.cpp.fixup ir/ir-generated.cpp && awk -v name=ir-generated.h -f /usr/ports/wip/p4c/work/.build/irgen-fixup.awk ir/ir-generated.h.tmp > ir/ir-generated.h.fixup && /usr/local/bin/cmake -E copy_if_different ir/ir-generated.h.fixup ir/ir-generated.h && awk -v name=gen-tree-macro.h -f /usr/ports/wip/p4c/work/.build/irgen-fixup.awk ir/gen-tree-macro.h.tmp > ir/gen-tree-macro.h.fixup && /usr/local/bin/cmake -E copy_if_different ir/gen-tree-macro.h.fixup ir/gen-tree-macro.h
FAILED: ir/ir-generated.h ir/ir-generated.cpp ir/gen-tree-macro.h /usr/ports/wip/p4c/work/.build/ir/ir-generated.h /usr/ports/wip/p4c/work/.build/ir/ir-generated.cpp /usr/ports/wip/p4c/work/.build/ir/gen-tree-macro.h 
cd /usr/ports/wip/p4c/work/.build && /usr/ports/wip/p4c/work/.build/tools/ir-generator/irgenerator -i ir/ir-generated.cpp.tmp -o ir/ir-generated.h.tmp -t ir/gen-tree-macro.h.tmp /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/base.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/type.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/expression.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/ir.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/ir/v1.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/backends/dpdk/dpdk.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/frontends/p4-14/ir-v1.def /usr/ports/wip/p4c/work/p4c-1.2.3.8/backends/bmv2/bmv2.def && awk -v name=ir-generated.cpp -f /usr/ports/wip/p4c/work/.build/irgen-fixup.awk ir/ir-generated.cpp.tmp > ir/ir-generated.cpp.fixup && /usr/local/bin/cmake -E copy_if_different ir/ir-generated.cpp.fixup ir/ir-generated.cpp && awk -v name=ir-generated.h -f /usr/ports/wip/p4c/work/.build/irgen-fixup.awk ir/ir-generated.h.tmp > ir/ir-generated.h.fixup && /usr/local/bin/cmake -E copy_if_different ir/ir-generated.h.fixup ir/ir-generated.h && awk -v name=gen-tree-macro.h -f /usr/ports/wip/p4c/work/.build/irgen-fixup.awk ir/gen-tree-macro.h.tmp > ir/gen-tree-macro.h.fixup && /usr/local/bin/cmake -E copy_if_different ir/gen-tree-macro.h.fixup ir/gen-tree-macro.h
Segmentation fault (core dumped)
ninja: build stopped: subcommand failed.
===> Compilation failed unexpectedly.
Try to set MAKE_JOBS_UNSAFE=yes and rebuild before reporting the failure to
the maintainer.
*** Error code 1

Backtrace info:

(lldb) bt
* thread #1, name = 'irgenerator', stop reason = signal SIGSEGV
  * frame #0: 0x000000082848e294 libthr.so.3`___lldb_unnamed_symbol549 + 4
    frame #1: 0x000000082849acbb libthr.so.3`___lldb_unnamed_symbol710 + 139
    frame #2: 0x00000008291f0abd libc.so.7`open + 205
    frame #3: 0x0000000823415fb4 libgc.so.1`GC_unix_get_mem + 212
    frame #4: 0x000000082340e743 libgc.so.1`GC_init_headers + 83
    frame #5: 0x0000000823413f26 libgc.so.1`GC_init + 982
    frame #6: 0x00000000002dc128 irgenerator`::realloc(ptr=0x0000000000000000, size=1664) at gc.cpp:142:13
    frame #7: 0x00000000002dc367 irgenerator`::malloc(size=1664) at gc.cpp:161:28
    frame #8: 0x00000000002dc3f6 irgenerator`::calloc(size=1664, elsize=1664) at gc.cpp:170:16
    frame #9: 0x00000008284922d9 libthr.so.3`___lldb_unnamed_symbol584 + 409
    frame #10: 0x000000082849139c libthr.so.3`___lldb_unnamed_symbol574 + 780
    frame #11: 0x00000774c5a340ad ld-elf.so.1
    frame #12: 0x00000774c5a326ab ld-elf.so.1

fruffy commented 1 year ago

Does this happen when you compile a fresh build with the CMake option -DENABLE_GC=OFF? If the crash goes away, it might be caused by the GC library, which has problems on non-standard systems.

outpaddling commented 1 year ago

Disabling GC does get around the issue, thanks. What are the consequences of running without garbage collection, if any? Beyond that, getting a successful build was only a matter of adding a missing sys/wait.h include to two files.

--- frontends/common/parser_options.cpp.orig    2023-04-25 13:01:16 UTC
+++ frontends/common/parser_options.cpp
@@ -21,6 +21,7 @@ limitations under the License.

 #include <sys/stat.h>
 #include <sys/types.h>
+#include <sys/wait.h>

 #include <regex>
 #include <unordered_set>
--- test/gtest/helpers.cpp.orig 2023-04-25 13:28:10 UTC
+++ test/gtest/helpers.cpp
@@ -19,6 +19,8 @@ limitations under the License.
 #include <sstream>
 #include <stdexcept>

+#include <sys/wait.h>
+
 #include "helpers.h"

 #include "frontends/common/applyOptionsPragmas.h"
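
For context on why the include matters: POSIX declares the wait-status macros (WIFEXITED, WEXITSTATUS and friends) in sys/wait.h. Presumably the two patched files use those macros on a status returned by something like system() or pclose(); that is an assumption here, not something the patch itself shows. A minimal sketch of such a use, with exit_code_of as a hypothetical helper name that does not come from p4c:

```cpp
// Sketch only: decoding a child's exit status with the POSIX macros.
#include <sys/wait.h>  // WIFEXITED, WEXITSTATUS

#include <cstdlib>  // std::system

// Hypothetical helper: runs a shell command and returns its exit code,
// or -1 if the command could not be started or did not exit normally.
int exit_code_of(const char *command) {
    int status = std::system(command);
    if (status == -1 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Code like this may compile on some systems without the explicit include when another header happens to provide the macros, which is one plausible reason the omission went unnoticed until a FreeBSD build.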

My work-in-progress FreeBSD port is here in case someone is interested:

https://github.com/outpaddling/freebsd-ports-wip/tree/master/p4c

fruffy commented 1 year ago

> Disabling GC does get around the issue, thanks. What are the consequences of running without garbage collection, if any?

Well... for larger programs you may run out of memory, because all heap allocation is managed by the garbage collector and this compiler does not use any smart pointers. I believe the crash is fixable, but the garbage collection library (https://github.com/ivmai/bdwgc) is a bit finicky.
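
To make the memory concern concrete, here is a minimal sketch of what happens when code written for a collector runs without one. Node, live_bytes and run_pass are hypothetical names for illustration and are not p4c classes or functions; the byte counter only stands in for the process heap.

```cpp
// Sketch only: allocations that nothing ever releases.
#include <cstddef>
#include <new>

// Hypothetical stand-in for a compiler IR node.
struct Node {
    static std::size_t live_bytes;  // bytes allocated and not yet released

    int value = 0;

    static void *operator new(std::size_t size) {
        live_bytes += size;
        return ::operator new(size);
    }
    static void operator delete(void *ptr, std::size_t size) {
        live_bytes -= size;
        ::operator delete(ptr);
    }
};

std::size_t Node::live_bytes = 0;

// Mimics a pass that creates temporary nodes and drops the raw pointers
// without deleting them, as code relying on a collector would.
std::size_t run_pass(int count) {
    for (int i = 0; i < count; ++i) {
        Node *node = new Node;
        node->value = i;
    }  // no delete: only a garbage collector could reclaim these nodes
    return Node::live_bytes;
}
```

Every call to run_pass adds count * sizeof(Node) bytes that stay allocated until the process exits. With a collector those nodes become reclaimable once unreachable; with the collector disabled the total only grows, which matters most for large input programs.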

> Beyond that, getting a successful build was only a matter of two files missing sys/wait.h.

Thanks for doing this! Could you file a PR for the includes? I can approve/merge it then.

We could add a weekly/nightly FreeBSD CI workflow to ensure that the compiler builds and can compile programs. However, we have limited development resources and I am not sure there is enough incentive to continuously fix issues on FreeBSD. We barely maintain our macOS/Fedora builds.

davidbolvansky commented 1 year ago

Is this the master branch? Which version of Clang?

Clang 12 should work (see the closely related https://github.com/p4lang/p4c/pull/3586).

See also https://github.com/p4lang/p4c/pull/3586#issuecomment-1284594818

outpaddling commented 1 year ago

> Master branch? Which version of Clang?

> Clang 12 should work (see quite related #3586)

> Check also #3586 (comment)

Thanks, but as I mentioned in the beginning, I get the same error with gcc, though boehm-gc is built separately using clang. The port uses the latest release of p4c. The Clang version is 13.0.0, for the record.

outpaddling commented 1 year ago

> Disabling GC does get around the issue, thanks. What are the consequences of running without garbage collection, if any?

> Well... for larger programs you may run out of memory because all heap allocation is managed by the garbage collector. This compiler does not use any smart pointers. I believe the crash is fixable, but the garbage collection library (https://github.com/ivmai/bdwgc) is a bit finicky.

> Beyond that, getting a successful build was only a matter of two files missing sys/wait.h.

> Thanks for doing this! Could you file a PR for the includes? I can approve/merge it then.

> We could add a weekly/nightly FreeBSD CI workflow to ensure that the compiler does compile and can compile programs. However, we have limited development resources and I am not sure if there is enough incentive to continuously fix issues on FreeBSD. We barely maintain our macOS/Fedora builds.

Sure, I can do a PR.

No need for CI workflows on my account. As I mentioned, I'm only satisfying my curiosity, since a colleague is learning P4 and came to me with a question. (She was having trouble using Vagrant to install the tutorial VM on her Mac.) I just decided to see what it would take to run it on bare metal on my FreeBSD server. Installing the VM works fine using Vagrant and VirtualBox 6.x on my FreeBSD machine, BTW.

I'll sleep on this issue for a while and maybe ping the FreeBSD developers. A LOT of them work in the networking industry, so I suspect someone will take an interest.

davidbolvansky commented 1 year ago

Does it work with -DCMAKE_BUILD_TYPE=Debug?

outpaddling commented 1 year ago

I rebuilt boehm-gc with -g to get more out of the backtrace:

(lldb) bt
* thread #1, name = 'irgenerator', stop reason = signal SIGSEGV
  * frame #0: 0x0000000827753294 libthr.so.3`___lldb_unnamed_symbol549 + 4
    frame #1: 0x000000082775fcbb libthr.so.3`___lldb_unnamed_symbol710 + 139
    frame #2: 0x000000082822eabd libc.so.7`open + 205
    frame #3: 0x0000000823e7924d libgc.so.1`GC_unix_mmap_get_mem(bytes=65536) at os_dep.c:2228:21
    frame #4: 0x0000000823e79215 libgc.so.1`GC_unix_get_mem(bytes=65536) at os_dep.c:2277:12
    frame #5: 0x0000000823e6ead5 libgc.so.1`GC_scratch_alloc(bytes=8224) at headers.c:138:25
    frame #6: 0x0000000823e6ec32 libgc.so.1`GC_init_headers at headers.c:195:35
    frame #7: 0x0000000823e76c4c libgc.so.1`GC_init at misc.c:1252:5
    frame #8: 0x00000000002dc128 irgenerator`::realloc(ptr=0x0000000000000000, size=1664) at gc.cpp:142:13
    frame #9: 0x00000000002dc367 irgenerator`::malloc(size=1664) at gc.cpp:161:28
    frame #10: 0x00000000002dc3f6 irgenerator`::calloc(size=1664, elsize=1664) at gc.cpp:170:16
    frame #11: 0x00000008277572d9 libthr.so.3`___lldb_unnamed_symbol584 + 409
    frame #12: 0x000000082775639c libthr.so.3`___lldb_unnamed_symbol574 + 780
    frame #13: 0x00003d28461bc0ad ld-elf.so.1
    frame #14: 0x00003d28461ba6ab ld-elf.so.1

os_dep.c:2228 is this:

          zero_fd = open("/dev/zero", O_RDONLY);

zero_fd is a global defined as

static int zero_fd = -1;

It is very peculiar to get a segfault on a function call with two constant arguments and an assignment to a simple int variable. This makes me think memory is being corrupted somehow before the program arrives here.

This is a stable boehm-gc port (8.2.2) that is used by dozens of other ports, so I doubt it's a boehm-gc bug.
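
Read bottom-up, the backtrace above shows ld-elf.so.1 calling into libthr, libthr calling calloc (which resolves to the replacement in gc.cpp), and GC_init then calling open, which lands back in libthr where the fault occurs. One possible reading, not confirmed anywhere in this thread, is that the collector is being initialized from inside the threading library while that library is still starting up. A common way to make a replacement allocator tolerate this kind of early or re-entrant call is to serve it from a small static buffer until initialization has finished. The sketch below is single-threaded and illustrative only: every name in it is hypothetical, std::malloc stands in for the real allocator, and init_real_allocator stands in for an initializer such as GC_init. Whether gc.cpp or libgc actually needs such a guard is not established here.

```cpp
// Sketch only: guarding a lazily initialized allocator against re-entry.
#include <cstddef>
#include <cstdint>
#include <cstdlib>

namespace sketch {

enum class State { Uninitialized, Initializing, Ready };

State state = State::Uninitialized;

// Static arena for requests that arrive while initialization is running.
// Memory handed out from here is never returned.
alignas(std::max_align_t) unsigned char bootstrap_arena[4096];
std::size_t bootstrap_used = 0;

void *bootstrap_alloc(std::size_t size) {
    const std::size_t align = alignof(std::max_align_t);
    if (size > sizeof(bootstrap_arena)) return nullptr;
    const std::size_t rounded = (size + align - 1) / align * align;
    if (rounded > sizeof(bootstrap_arena) - bootstrap_used) return nullptr;
    void *result = bootstrap_arena + bootstrap_used;
    bootstrap_used += rounded;
    return result;
}

void *guarded_alloc(std::size_t size);

// Stand-in for an initializer that itself allocates while it runs,
// which is the re-entrant case the guard has to survive.
void init_real_allocator() {
    void *scratch = guarded_alloc(64);  // re-enters guarded_alloc
    (void)scratch;
}

void *guarded_alloc(std::size_t size) {
    if (state == State::Initializing) {
        return bootstrap_alloc(size);  // do not re-run the initializer
    }
    if (state == State::Uninitialized) {
        state = State::Initializing;
        init_real_allocator();
        state = State::Ready;
    }
    return std::malloc(size);  // stand-in for the real allocator
}

// True if the pointer was served from the bootstrap arena.
bool from_bootstrap(const void *ptr) {
    const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
    const auto begin = reinterpret_cast<std::uintptr_t>(bootstrap_arena);
    return addr >= begin && addr < begin + sizeof(bootstrap_arena);
}

}  // namespace sketch
```

The first call to guarded_alloc triggers initialization, the nested 64-byte request is answered from the arena, and the outer request is answered by the real allocator once the state is Ready. A matching free routine would have to recognize arena pointers (for example with from_bootstrap) and ignore them.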

davidbolvansky commented 1 year ago

Newer Boehm GC versions are somewhat problematic for p4c; @fruffy reported some issues recently. Maybe this is related? Could you downgrade it?

fruffy commented 1 year ago

> New boehm GC versions are somehow problematic for p4c, @fruffy reported some issues recently. Maybe related? Could you downgrade it?

It's likely an issue with the threading implementation in libgc. It needs to be appropriately configured for the distribution. We have problems with newer versions of z3, but libgc is stable.

I need to push https://github.com/p4lang/p4c/pull/3930 forward. It could help with some of these issues, but there are still some blockers.

outpaddling commented 1 year ago

Downgrading boehm-gc isn't really an option. What's in the FreeBSD ports tree will only go up from here, and I would not be the one to add and maintain a legacy port, given that I'm not even a P4 user. Legacy ports are frowned on in favor of fixing the root problem anyway.