Algebraic-Programming / LPF

A minimal communication layer for the implementation of immortal algorithms and for facilitating their broad use.
Apache License 2.0

Segmentation fault with `lpf_collectives_init` #12

Open alberto-scolari opened 1 year ago

alberto-scolari commented 1 year ago

The ALP code here

https://github.com/Algebraic-Programming/ALP/blob/b50fe72e957ed4da10c1cd1c59d924260f051a8d/include/graphblas/bsp1d/exec.hpp#L184

segfaults immediately because of the `sizeof( size_t )` value passed as `max_byte_size`; the stack trace is:

Program received signal SIGSEGV, Segmentation fault.
0x000014cd06466274 in lpf::MessageSort::addRegister(unsigned long, char*, unsigned long) () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
(gdb) bt
#0  0x000014cd06466274 in lpf::MessageSort::addRegister(unsigned long, char*, unsigned long) () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
#1  0x000014cd06445f8f in lpf::MessageQueue::addGlobalReg(void*, unsigned long) () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
#2  0x000014cd0645b136 in lpf_register_global () from /home/user/Projects/install/lpf/lib/liblpf_core_univ_mpimsg_Release.so
#3  0x000055c08771e2e3 in lpf_collectives_init ()
#4  0x000055c08771a0ad in _grb_exec_varin_spmd<output, true> (ctx=0x7ffca3145950, s=0, P=2, args=...) at /home/user/Projects/graphblas_fix_bsp1d_exec/include/graphblas/bsp1d/exec.hpp:180

The workaround (so far) is to set `max_byte_size` to 0, as in

https://github.com/Algebraic-Programming/ALP/blob/b50fe72e957ed4da10c1cd1c59d924260f051a8d/include/graphblas/bsp1d/exec.hpp#L67

I don't know whether this problem depends on a specific combination of the function parameters or is simply a bug. Even if it is the former, no error is raised and no error code is returned: the function segfaults during its own execution.

anyzelman commented 1 year ago

In which situation exactly does it segfault? (The existing ALP smoke tests seem to exercise this use of `exec`, yet they don't segfault.)