daniestevez / gr-satellites

GNU Radio decoder for Amateur satellites
GNU General Public License v3.0

Phase unwrap block segfaults #414

Closed daniestevez closed 1 year ago

daniestevez commented 1 year ago

@ScottTilley has found that the Phase unwrap block segfaults on his system. He is using Ubuntu 22.04 with GNU Radio 3.10.4.0, Python 3.10.6 from the repository and gr-satellites v5.1.0 from the PPA.

At first, I wasn't able to replicate this segfault or find any mistakes in the code. I set up a Docker image with the same software, and it works for me. The image is daniestevez/phase-unwrap-test, and Scott's test flowgraph can be found in the / directory. However, I've noticed that the GNU Radio in this container is version 3.10.1.1 (which is the version in Ubuntu's repository).

I've realized that Scott is using the GNU Radio version from the PPA instead of the one from the repository. I've replaced the GNU Radio version in my Docker container with the one from the PPA, and sure enough, it segfaults.

So the problem appears to be that gr-satellites v5.1.0 from the PPA doesn't work with GNU Radio 3.10.4.0 from the PPA. The gr-satellites PPA packages are built against the GNU Radio version from the repo, not the one from the PPA. I'm guessing there is an ABI mismatch between GNU Radio 3.10.4.0 and 3.10.1.1. I'll play a bit with gdb to see if I can narrow down the problem.

daniestevez commented 1 year ago

The docker image with GNU Radio from the PPA is daniestevez/phase-unwrap-test-grppa.

daniestevez commented 1 year ago

Here is a backtrace from gdb:

root@test:/# gdb --args python3 test.py
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...
(No debugging symbols found in python3)
(gdb) run
Starting program: /usr/bin/python3 test.py
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Warning: failed to XInitThreads()
[New Thread 0x7f7405ac5640 (LWP 2121)]
[New Thread 0x7f74052c4640 (LWP 2122)]
[New Thread 0x7f7404ac3640 (LWP 2123)]
[New Thread 0x7f74042c2640 (LWP 2124)]
[New Thread 0x7f7403ac1640 (LWP 2125)]
[New Thread 0x7f74032c0640 (LWP 2126)]
[New Thread 0x7f7402abf640 (LWP 2127)]
[New Thread 0x7f74022be640 (LWP 2128)]
[New Thread 0x7f7401abd640 (LWP 2129)]
[New Thread 0x7f74012bc640 (LWP 2130)]
[New Thread 0x7f73fdddc640 (LWP 2131)]
[New Thread 0x7f73fd5db640 (LWP 2132)]
[New Thread 0x7f73ecdda640 (LWP 2133)]
[New Thread 0x7f73e45d9640 (LWP 2134)]
[New Thread 0x7f73dbdd8640 (LWP 2135)]
[New Thread 0x7f73db5d7640 (LWP 2136)]
[New Thread 0x7f73cadd6640 (LWP 2137)]
[New Thread 0x7f73c25d5640 (LWP 2138)]
[New Thread 0x7f73b9dd4640 (LWP 2139)]
[New Thread 0x7f73b15d3640 (LWP 2140)]
[New Thread 0x7f73b0dd2640 (LWP 2141)]
[New Thread 0x7f73a05d1640 (LWP 2142)]
[New Thread 0x7f7397dd0640 (LWP 2143)]
[New Thread 0x7f738f5cf640 (LWP 2144)]
[New Thread 0x7f7386dce640 (LWP 2145)]
[New Thread 0x7f737ae7d640 (LWP 2146)]
[New Thread 0x7f737a67c640 (LWP 2147)]
[New Thread 0x7f7379e7b640 (LWP 2148)]
[New Thread 0x7f737967a640 (LWP 2149)]
[New Thread 0x7f7378e79640 (LWP 2150)]
[New Thread 0x7f7378678640 (LWP 2151)]
[New Thread 0x7f7377e77640 (LWP 2152)]
[New Thread 0x7f7377676640 (LWP 2153)]
[New Thread 0x7f7376e75640 (LWP 2154)]
[New Thread 0x7f7376674640 (LWP 2155)]
[New Thread 0x7f73752d3640 (LWP 2156)]
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
[New Thread 0x7f73749f6640 (LWP 2157)]
[Thread 0x7f73db5d7640 (LWP 2136) exited]
[Thread 0x7f73b0dd2640 (LWP 2141) exited]
[Thread 0x7f73fd5db640 (LWP 2132) exited]
[Thread 0x7f73fdddc640 (LWP 2131) exited]
[Thread 0x7f73e45d9640 (LWP 2134) exited]
[Thread 0x7f73ecdda640 (LWP 2133) exited]
[Thread 0x7f73b9dd4640 (LWP 2139) exited]
[Thread 0x7f7386dce640 (LWP 2145) exited]
[Thread 0x7f738f5cf640 (LWP 2144) exited]
[Thread 0x7f7397dd0640 (LWP 2143) exited]
[Thread 0x7f73a05d1640 (LWP 2142) exited]
[Thread 0x7f73b15d3640 (LWP 2140) exited]
[Thread 0x7f73c25d5640 (LWP 2138) exited]
[Thread 0x7f73cadd6640 (LWP 2137) exited]
[Thread 0x7f73dbdd8640 (LWP 2135) exited]
[Detaching after fork from child process 2158]

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007f7407096fe5 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
(gdb) bt
#0  0x00007f7407096fe5 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#1  0x00007f7407142927 in ?? () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#2  0x00007f74070a293f in gr::block::allocate_detail(int, int, std::vector<int, std::allocator<int> > const&, std::vector<unsigned long, std::allocator<unsigned long> > const&, std::vector<unsigned int, std::allocator<unsigned int> > const&) () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#3  0x00007f74070b728e in gr::flat_flowgraph::allocate_block_detail(std::shared_ptr<gr::basic_block>) () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#4  0x00007f74070b8845 in gr::flat_flowgraph::setup_connections() () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#5  0x00007f74070ee40b in gr::top_block_impl::start(int) () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#6  0x00007f74070ee7f6 in gr::top_block::start(int) () from /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.4
#7  0x00007f737d78ad91 in ?? () from /usr/lib/python3/dist-packages/gnuradio/gr/gr_python.cpython-310-x86_64-linux-gnu.so
#8  0x00007f737d790d43 in ?? () from /usr/lib/python3/dist-packages/gnuradio/gr/gr_python.cpython-310-x86_64-linux-gnu.so
#9  0x00007f737d71def3 in ?? () from /usr/lib/python3/dist-packages/gnuradio/gr/gr_python.cpython-310-x86_64-linux-gnu.so
#10 0x000055805c40731e in ?? ()
#11 0x000055805c3fde4b in _PyObject_MakeTpCall ()
#12 0x000055805c3f64f2 in _PyEval_EvalFrameDefault ()
#13 0x000055805c4155c1 in ?? ()
#14 0x000055805c3f6152 in _PyEval_EvalFrameDefault ()
#15 0x000055805c407b6c in _PyFunction_Vectorcall ()
#16 0x000055805c3f0675 in _PyEval_EvalFrameDefault ()
#17 0x000055805c3ecde6 in ?? ()
#18 0x000055805c4e2cb6 in PyEval_EvalCode ()
#19 0x000055805c50f748 in ?? ()
#20 0x000055805c50855b in ?? ()
#21 0x000055805c50f495 in ?? ()
#22 0x000055805c50e978 in _PyRun_SimpleFileObject ()
#23 0x000055805c50e673 in _PyRun_AnyFileObject ()
#24 0x000055805c4ffb7e in Py_RunMain ()
#25 0x000055805c4d5bbd in Py_BytesMain ()
#26 0x00007f740d50cd90 in __libc_start_call_main (main=main@entry=0x55805c4d5b80, argc=argc@entry=2, argv=argv@entry=0x7ffda960d2b8) at ../sysdeps/nptl/libc_start_call_main.h:58
#27 0x00007f740d50ce40 in __libc_start_main_impl (main=0x55805c4d5b80, argc=2, argv=0x7ffda960d2b8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffda960d2a8) at ../csu/libc-start.c:392
#28 0x000055805c4d5ab5 in _start ()
(gdb)

daniestevez commented 1 year ago

As expected, libgnuradio-satellites.so is linked against the GNU Radio 3.10.1 libraries:

root@test:/# ldd /usr/lib/x86_64-linux-gnu/libgnuradio-satellites.so
    linux-vdso.so.1 (0x00007fffca7e4000)
    libgnuradio-blocks.so.3.10.1 => /lib/x86_64-linux-gnu/libgnuradio-blocks.so.3.10.1 (0x00007fc477488000)
    libgnuradio-runtime.so.3.10.1 => /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.1 (0x00007fc477315000)
    libgnuradio-pmt.so.3.10.1 => /lib/x86_64-linux-gnu/libgnuradio-pmt.so.3.10.1 (0x00007fc4772ba000)
    libspdlog.so.1 => /lib/x86_64-linux-gnu/libspdlog.so.1 (0x00007fc47723f000)
    libfmt.so.8 => /lib/x86_64-linux-gnu/libfmt.so.8 (0x00007fc47721e000)
    libvolk.so.2.5 => /lib/x86_64-linux-gnu/libvolk.so.2.5 (0x00007fc476f57000)
    libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fc476d2d000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc476c46000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fc476c26000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc4769fe000)
    libboost_thread.so.1.74.0 => /lib/x86_64-linux-gnu/libboost_thread.so.1.74.0 (0x00007fc4769dc000)
    libsndfile.so.1 => /lib/x86_64-linux-gnu/libsndfile.so.1 (0x00007fc47695b000)
    libboost_program_options.so.1.74.0 => /lib/x86_64-linux-gnu/libboost_program_options.so.1.74.0 (0x00007fc476916000)
    libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007fc476894000)
    libunwind.so.8 => /lib/x86_64-linux-gnu/libunwind.so.8 (0x00007fc476879000)
    libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007fc47629f000)
    libthrift-0.16.0.so => /lib/x86_64-linux-gnu/libthrift-0.16.0.so (0x00007fc476206000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fc4776eb000)
    liborc-0.4.so.0 => /lib/x86_64-linux-gnu/liborc-0.4.so.0 (0x00007fc47617f000)
    libFLAC.so.8 => /lib/x86_64-linux-gnu/libFLAC.so.8 (0x00007fc476143000)
    libvorbis.so.0 => /lib/x86_64-linux-gnu/libvorbis.so.0 (0x00007fc476116000)
    libvorbisenc.so.2 => /lib/x86_64-linux-gnu/libvorbisenc.so.2 (0x00007fc47606b000)
    libopus.so.0 => /lib/x86_64-linux-gnu/libopus.so.0 (0x00007fc47600d000)
    libogg.so.0 => /lib/x86_64-linux-gnu/libogg.so.0 (0x00007fc476000000)
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007fc475fd5000)
    libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007fc475fa4000)
    libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fc475f88000)
    libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007fc475ee4000)
    libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007fc475aa0000)
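
A check like the one above can be scripted. The following is a minimal sketch, not part of gr-satellites or GNU Radio: the function name `linked_gnuradio_versions` is hypothetical, and it assumes the usual `name.so.VERSION => path (address)` layout of `ldd` output shown above.

```python
import re

# Matches ldd lines such as
#   libgnuradio-runtime.so.3.10.1 => /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.1 (0x...)
# and captures the library name and the version embedded in the soname.
_GR_LIB = re.compile(r"^\s*(libgnuradio-[\w-]+)\.so\.(\d+(?:\.\d+)*)\s+=>")


def linked_gnuradio_versions(ldd_output):
    """Map each libgnuradio-* library named in ldd output to its soname version.

    Hypothetical helper for illustration; other libraries are ignored.
    """
    versions = {}
    for line in ldd_output.splitlines():
        match = _GR_LIB.match(line)
        if match:
            versions[match.group(1)] = match.group(2)
    return versions


sample = """\
    linux-vdso.so.1 (0x00007fffca7e4000)
    libgnuradio-blocks.so.3.10.1 => /lib/x86_64-linux-gnu/libgnuradio-blocks.so.3.10.1 (0x00007fc477488000)
    libgnuradio-runtime.so.3.10.1 => /lib/x86_64-linux-gnu/libgnuradio-runtime.so.3.10.1 (0x00007fc477315000)
    libvolk.so.2.5 => /lib/x86_64-linux-gnu/libvolk.so.2.5 (0x00007fc476f57000)
"""
print(linked_gnuradio_versions(sample))
```

Comparing the reported versions with the GNU Radio version that is actually installed (3.10.4.0 in Scott's case) makes the mismatch visible without reading the whole listing.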
daniestevez commented 1 year ago

I've built a Debug build of GNU Radio 3.10.4.0, and it also segfaults. This is the backtrace:

grdgb root@test:/# gdb --args python3 test.py
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...
(No debugging symbols found in python3)
(gdb) run
Starting program: /usr/bin/python3 test.py
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7f5c0c7dc640 (LWP 27069)]
[New Thread 0x7f5c0bfdb640 (LWP 27070)]
[New Thread 0x7f5bfb7da640 (LWP 27071)]
[New Thread 0x7f5bf2fd9640 (LWP 27072)]
[New Thread 0x7f5bea7d8640 (LWP 27073)]
[New Thread 0x7f5be1fd7640 (LWP 27074)]
[New Thread 0x7f5bd97d6640 (LWP 27075)]
[New Thread 0x7f5bd0fd5640 (LWP 27076)]
[New Thread 0x7f5bc87d4640 (LWP 27077)]
[New Thread 0x7f5bbffd3640 (LWP 27078)]
[New Thread 0x7f5bbf7d2640 (LWP 27079)]
[New Thread 0x7f5baefd1640 (LWP 27080)]
[New Thread 0x7f5ba67d0640 (LWP 27081)]
[New Thread 0x7f5b9dfcf640 (LWP 27082)]
[New Thread 0x7f5b957ce640 (LWP 27083)]
[New Thread 0x7f5b88d93640 (LWP 27084)]
[New Thread 0x7f5b88592640 (LWP 27085)]
[New Thread 0x7f5b87d91640 (LWP 27086)]
[New Thread 0x7f5b87590640 (LWP 27087)]
[New Thread 0x7f5b86d8f640 (LWP 27088)]
[New Thread 0x7f5b8658e640 (LWP 27089)]
[New Thread 0x7f5b85d8d640 (LWP 27090)]
[New Thread 0x7f5b8558c640 (LWP 27091)]
[New Thread 0x7f5b84d8b640 (LWP 27092)]
[New Thread 0x7f5b8458a640 (LWP 27093)]
[New Thread 0x7f5b831ea640 (LWP 27094)]
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
[New Thread 0x7f5b82912640 (LWP 27095)]
[Thread 0x7f5c0bfdb640 (LWP 27070) exited]
[Thread 0x7f5bbf7d2640 (LWP 27079) exited]
[Thread 0x7f5c0c7dc640 (LWP 27069) exited]
[Thread 0x7f5bfb7da640 (LWP 27071) exited]
[Thread 0x7f5bf2fd9640 (LWP 27072) exited]
[Thread 0x7f5baefd1640 (LWP 27080) exited]
[Thread 0x7f5b957ce640 (LWP 27083) exited]
[Thread 0x7f5b9dfcf640 (LWP 27082) exited]
[Thread 0x7f5ba67d0640 (LWP 27081) exited]
[Thread 0x7f5bbffd3640 (LWP 27078) exited]
[Thread 0x7f5bc87d4640 (LWP 27077) exited]
[Thread 0x7f5bd0fd5640 (LWP 27076) exited]
[Thread 0x7f5bd97d6640 (LWP 27075) exited]
[Thread 0x7f5be1fd7640 (LWP 27074) exited]
[Thread 0x7f5bea7d8640 (LWP 27073) exited]
[Detaching after fork from child process 27096]

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007f5c107fcc15 in __gnu_cxx::__exchange_and_add (__val=-1, __mem=0x9) at /usr/include/c++/11/ext/atomicity.h:66
66    { return __atomic_fetch_add(__mem, __val, __ATOMIC_ACQ_REL); }
(gdb) bt
#0  0x00007f5c107fcc15 in __gnu_cxx::__exchange_and_add (__val=-1, __mem=0x9)
    at /usr/include/c++/11/ext/atomicity.h:66
#1  __gnu_cxx::__exchange_and_add_dispatch (__val=-1, __mem=0x9) at /usr/include/c++/11/ext/atomicity.h:101
#2  std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1)
    at /usr/include/c++/11/bits/shared_ptr_base.h:165
#3  0x00007f5c1053223b in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::operator= (__r=..., __r=..., 
    this=0x55c68ed41a70) at /usr/include/c++/11/bits/shared_ptr_base.h:724
#4  std::__shared_ptr<gr::block_detail, (__gnu_cxx::_Lock_policy)2>::operator= (this=0x55c68ed41a68)
    at /usr/include/c++/11/bits/shared_ptr_base.h:1153
#5  std::shared_ptr<gr::block_detail>::operator= (this=0x55c68ed41a68) at /usr/include/c++/11/bits/shared_ptr.h:359
#6  gr::block::set_detail (detail=..., this=0x55c68ed418a8)
    at /gnuradio-3.10.4.0/gnuradio-runtime/lib/../include/gnuradio/block.h:999
#7  gr::block::allocate_detail (this=0x55c68ed418a8, ninputs=ninputs@entry=1, noutputs=noutputs@entry=1, 
    downstream_max_nitems_vec=std::vector of length 1, capacity 1 = {...}, 
    downstream_lcm_nitems_vec=std::vector of length 1, capacity 1 = {...}, 
    downstream_max_out_mult_vec=std::vector of length 1, capacity 1 = {...})
    at /gnuradio-3.10.4.0/gnuradio-runtime/lib/block.cc:420
#8  0x00007f5c105517d4 in gr::flat_flowgraph::allocate_block_detail (this=0x55c68f2e3040, block=...)
    at /usr/include/c++/11/bits/shared_ptr_base.h:1295
#9  0x00007f5c10551fd3 in gr::flat_flowgraph::setup_connections (this=0x55c68f2e3040)
    at /gnuradio-3.10.4.0/gnuradio-runtime/lib/flat_flowgraph.cc:52
#10 0x00007f5c1059ebb2 in gr::top_block_impl::start (this=0x55c68edd2190, 
    max_noutput_items=max_noutput_items@entry=10000000) at /usr/include/c++/11/bits/shared_ptr_base.h:1295
#11 0x00007f5c1059654f in gr::top_block::start (this=0x55c68e9ce7e0, 
    max_noutput_items=max_noutput_items@entry=10000000) at /usr/include/c++/11/bits/unique_ptr.h:173
#12 0x00007f5b8bd29361 in top_block_start_unlocked (
    r=std::shared_ptr<gr::top_block> (use count 3, weak count 1) = {...}, max_noutput_items=10000000)
    at /usr/include/c++/11/bits/shared_ptr_base.h:1295
#13 0x00007f5b8bd2f11e in pybind11::detail::argument_loader<std::shared_ptr<gr::top_block>, int>::call_impl<void, void (*&)(std::shared_ptr<gr::top_block>, int), 0ul, 1ul, pybind11::detail::void_type>(void (*&)(std::shared_ptr<gr::top_block>, int), std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && (
    f=<optimized out>, f=<optimized out>, this=0x7ffde82977d0) at /usr/include/pybind11/cast.h:1207
#14 pybind11::detail::argument_loader<std::shared_ptr<gr::top_block>, int>::call<void, pybind11::detail::void_type, void (*&)(std::shared_ptr<gr::top_block>, int)>(void (*&)(std::shared_ptr<gr::top_block>, int)) && (
    f=<optimized out>, this=0x7ffde82977d0) at /usr/include/pybind11/cast.h:1184
#15 pybind11::cpp_function::initialize<void (*&)(std::shared_ptr<gr::top_block>, int), void, std::shared_ptr<gr::top_block>, int, pybind11::name, pybind11::scope, pybind11::sibling>(void (*&)(std::shared_ptr<gr::top_block>, int), void (*)(std::shared_ptr<gr::top_block>, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call&) const (__closure=0x0, 
    call=...) at /usr/include/pybind11/pybind11.h:233
#16 pybind11::cpp_function::initialize<void (*&)(std::shared_ptr<gr::top_block>, int), void, std::shared_ptr<gr::top_block>, int, pybind11::name, pybind11::scope, pybind11::sibling>(void (*&)(std::shared_ptr<gr::top_block>, int), void (*)(std::shared_ptr<gr::top_block>, int), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&) ()
    at /usr/include/pybind11/pybind11.h:210
#17 0x00007f5b8bc4fbda in pybind11::cpp_function::dispatcher (self=<optimized out>, args_in=0x7f5b83500640, 
    kwargs_in=0x0) at /usr/include/pybind11/pybind11.h:835
#18 0x000055c68d00731e in ?? ()
#19 0x000055c68cffde4b in _PyObject_MakeTpCall ()
#20 0x000055c68cff64f2 in _PyEval_EvalFrameDefault ()
#21 0x000055c68d0155c1 in ?? ()
#22 0x000055c68cff6152 in _PyEval_EvalFrameDefault ()
#23 0x000055c68d007b6c in _PyFunction_Vectorcall ()
#24 0x000055c68cff0675 in _PyEval_EvalFrameDefault ()
#25 0x000055c68cfecde6 in ?? ()
#26 0x000055c68d0e2cb6 in PyEval_EvalCode ()
#27 0x000055c68d10f748 in ?? ()
#28 0x000055c68d10855b in ?? ()
#29 0x000055c68d10f495 in ?? ()
#30 0x000055c68d10e978 in _PyRun_SimpleFileObject ()
#31 0x000055c68d10e673 in _PyRun_AnyFileObject ()
#32 0x000055c68d0ffb7e in Py_RunMain ()
#33 0x000055c68d0d5bbd in Py_BytesMain ()
#34 0x00007f5c16a47d90 in __libc_start_call_main (main=main@entry=0x55c68d0d5b80, argc=argc@entry=2, 
    argv=argv@entry=0x7ffde8298568) at ../sysdeps/nptl/libc_start_call_main.h:58
#35 0x00007f5c16a47e40 in __libc_start_main_impl (main=0x55c68d0d5b80, argc=2, argv=0x7ffde8298568, 
    init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffde8298558)
    at ../csu/libc-start.c:392
#36 0x000055c68d0d5ab5 in _start ()
(gdb)

ScottTilley commented 1 year ago

Hi Dani, I built gr-satellites from source and the issue is fixed.

s

daniestevez commented 1 year ago

> Hi Dani, I built gr-satellites from source and the issue is fixed.

Great! I'll keep working on this because the problem is bound to hit more people, so I want to understand better why it happens (and then potentially open a bug in GNU Radio).

daniestevez commented 1 year ago

It's getting late here and my C++ ABI knowledge isn't fresh, but I'm seeing the following.

(gdb) up
#7  gr::block::allocate_detail (this=0x55c68ed418a8, ninputs=ninputs@entry=1, noutputs=noutputs@entry=1, 
    downstream_max_nitems_vec=std::vector of length 1, capacity 1 = {...}, 
    downstream_lcm_nitems_vec=std::vector of length 1, capacity 1 = {...}, 
    downstream_max_out_mult_vec=std::vector of length 1, capacity 1 = {...})
    at /gnuradio-3.10.4.0/gnuradio-runtime/lib/block.cc:420
420     set_detail(detail);
(gdb) p d_detail
$18 = <error reading variable: Cannot access memory at address 0x9>
(gdb) p &d_detail
$19 = (gr::block_detail_sptr *) 0x55c68ed41a68
(gdb) x/gx &d_detail
0x55c68ed41a68: 0x000055c68edd6f90
(gdb) p *d_detail
$20 = {threaded = 214, thread = 0, d_tpb = {mutex = {m = {__data = {__lock = 0, __count = 0, __owner = 0, 
          __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
        __size = '\000' <repeats 39 times>, __align = 0}}, input_changed = false, input_cond = {internal_mutex = {
        __data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, 
          __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, cond = {
        __data = {__wseq = {__value64 = 0, __value32 = {__low = 0, __high = 0}}, __g1_start = {__value64 = 0, 
            __value32 = {__low = 0, __high = 0}}, __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 0, 
          __wrefs = 2, __g_signals = {0, 0}}, 
        __size = '\000' <repeats 36 times>, "\002\000\000\000\000\000\000\000\000\000\000", __align = 0}}, 
    output_changed = false, output_cond = {internal_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, 
          __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, 
        __size = '\000' <repeats 39 times>, __align = 0}, cond = {__data = {__wseq = {__value64 = 0, __value32 = {
              __low = 0, __high = 0}}, __g1_start = {__value64 = 0, __value32 = {__low = 0, __high = 0}}, 
          __g_refs = {0, 0}, __g_size = {0, 0}, __g1_orig_size = 0, __wrefs = 2, __g_signals = {0, 0}}, 
        __size = '\000' <repeats 36 times>, "\002\000\000\000\000\000\000\000\000\000\000", __align = 0}}}, 
  d_produce_or = 0, d_logger = std::shared_ptr<gr::logger> (use count 1, weak count 0) = {get() = 0x55c68f2e3360}, 
  d_debug_logger = std::shared_ptr<gr::logger> (use count 1, weak count 0) = {get() = 0x55c68f2e2a10}, 
  d_ninputs = 1, d_noutputs = 1, d_input = std::vector of length 1, capacity 1 = {
    std::shared_ptr<gr::buffer_reader> (empty) = {get() = 0x0}}, d_output = std::vector of length 1, capacity 1 = {
    std::shared_ptr<gr::buffer> (use count 1, weak count 0) = {get() = 0x55c68f2e3b60}}, d_done = false, 
  d_consumed = 32604, d_ins_noutput_items = 0, d_avg_noutput_items = 0, d_var_noutput_items = 0, 
  d_total_noutput_items = 0, d_pc_start_time = 3200108019435, d_pc_last_work_time = 0, d_ins_nproduced = 0, 
  d_avg_nproduced = 0, d_var_nproduced = 0, d_ins_input_buffers_full = std::vector of length 1, capacity 1 = {0}, 
  d_avg_input_buffers_full = std::vector of length 1, capacity 1 = {0}, 
  d_var_input_buffers_full = std::vector of length 1, capacity 1 = {0}, 
  d_ins_output_buffers_full = std::vector of length 1, capacity 1 = {0}, 
  d_avg_output_buffers_full = std::vector of length 1, capacity 1 = {0}, 
  d_var_output_buffers_full = std::vector of length 1, capacity 1 = {0}, d_start_of_work = 0, 
  d_end_of_work = -4845002503558665067, d_ins_work_time = 0, d_avg_work_time = 0, d_var_work_time = 0, 
  d_total_work_time = 0, d_avg_throughput = 0, d_pc_counter = 0}
#1  __gnu_cxx::__exchange_and_add_dispatch (__val=-1, __mem=0x9) at /usr/include/c++/11/ext/atomicity.h:101
101       return __exchange_and_add(__mem, __val);

It's odd that when attempting to print d_detail, gdb complains about an access to address 0x9. This is the same address that gets passed to the exchange and add, and it is the access that causes the segfault: the faulting instruction is lock xadd %eax,0x8(%rdi) with %rdi = 1, which matches this=0x1 in frame #2 of the backtrace.
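
The addresses in the backtrace can be reproduced with a little arithmetic. This is a minimal sketch that models the libstdc++ layout on x86_64 with `ctypes`; the class and function names are made up for illustration, and the field layout (a vtable pointer followed by two 32-bit counters, and a `shared_ptr` made of an element pointer followed by a control block pointer) is an assumption about this particular standard library and platform.

```python
import ctypes


class SpCountedBase(ctypes.Structure):
    """Illustrative model of std::_Sp_counted_base (libstdc++, x86_64)."""
    _fields_ = [
        ("vptr", ctypes.c_uint64),       # vtable pointer, 8 bytes on x86_64
        ("use_count", ctypes.c_int32),   # _M_use_count
        ("weak_count", ctypes.c_int32),  # _M_weak_count
    ]


class SharedPtr(ctypes.Structure):
    """Illustrative model of std::shared_ptr<T> (libstdc++, x86_64)."""
    _fields_ = [
        ("ptr", ctypes.c_uint64),       # pointer to the managed object
        ("refcount", ctypes.c_uint64),  # pointer to the control block
    ]


def use_count_address(control_block_ptr):
    """Address that _M_release() decrements for a given control block pointer."""
    return control_block_ptr + SpCountedBase.use_count.offset


def refcount_slot_address(shared_ptr_address):
    """Address of the control block pointer inside a shared_ptr object."""
    return shared_ptr_address + SharedPtr.refcount.offset


# Control block pointer 0x1 (this=0x1 in frame #2) gives the faulting address.
print(hex(use_count_address(0x1)))                 # 0x9
# &d_detail from gdb gives the __shared_count address seen in frame #3.
print(hex(refcount_slot_address(0x55C68ED41A68)))  # 0x55c68ed41a70
```

Under these assumptions, the element pointer stored in d_detail looks valid, while the control block pointer next to it holds the value 1, which is what one would expect if the two libraries disagree about what is stored at that offset of gr::block.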

daniestevez commented 1 year ago

The conclusion is that, in general, it's not possible to mix different GNU Radio ABIs (3.10.4 vs. 3.10.1 is an ABI change). We need some way to prevent gr-satellites built for one GNU Radio ABI from being installed alongside a GNU Radio with a different ABI. This is also being tracked in gnuradio/gnuradio#6355. I'm not sure yet how this will be done. For the time being, the workaround is to avoid combining gr-satellites from its PPA with GNU Radio from its own (different) PPA.
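
One possible shape for such a safeguard is a fail-fast version comparison. The sketch below is only an illustration and not something gr-satellites or GNU Radio provides: `abi_of` and `require_same_abi` are hypothetical names, and it assumes that the first three components of the version string identify the ABI, as the 3.10.1 and 3.10.4 sonames suggest.

```python
def abi_of(version):
    """Return the first three components of a GNU Radio version string as a tuple."""
    parts = version.split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least three components, got {version!r}")
    return tuple(int(p) for p in parts[:3])


def require_same_abi(built_against, running):
    """Raise RuntimeError if the two version strings differ in their ABI triple."""
    if abi_of(built_against) != abi_of(running):
        raise RuntimeError(
            f"built against GNU Radio {built_against}, "
            f"but running with {running}: ABI mismatch"
        )


require_same_abi("3.10.1.1", "3.10.1.0")      # same ABI triple, passes silently
try:
    require_same_abi("3.10.1.1", "3.10.4.0")  # the combination from this issue
except RuntimeError as err:
    print(err)
```

A check of this kind would turn the segfault into a readable error message, although where the build-time version would be recorded and when the check would run are left open here.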

I think this issue can be closed now. Do you agree, @ScottTilley?

daniestevez commented 1 year ago

Closing, since the cause of the problem and a solution have been identified.