Closed vchrizz closed 2 years ago
Trying to debug this further: running with -d1 again, it runs for some minutes and then:
TC: chg edge entry 193.238.156.128 > 193.238.159.240, cost (1.000/1.000) 1.000
Received signal Segmentation fault - shutting down
Deleting all routes...
RIB: del prefix 78.41.113.2/32 from 78.41.113.2
...
TC: del edge entry 193.238.158.254 > 78.41.113.201, cost (1.000/1.000) 1.000
Closing sockets...
*** glibc detected *** /usr/sbin/olsrd: double free or corruption (out): 0x004a5280 ***
Received signal Aborted - shutting down
^CReceived signal Interrupt - shutting down
And again running in gdb, where it runs for less than a minute:
TC: chg edge entry 78.41.112.70 > 193.238.159.100, cost (1.000/1.000) 1.000
Program received signal SIGPIPE, Broken pipe.
0x77e93d78 in send () from /lib/mipsel-linux-gnu/libc.so.6
(gdb) bt
#0 0x77e93d78 in send () from /lib/mipsel-linux-gnu/libc.so.6
#1 0x77d304e4 in write_data (unused=0x0) at olsrd_info.c:377
#2 0x00459cbc in walk_timers (last_run=0x4a59c0) at src/scheduler.c:711
#3 0x004593a0 in olsr_scheduler () at src/scheduler.c:559
#4 0x0043c9f4 in main (argc=7, argv=0x7fffe914) at src/main.c:775
(gdb)
Typically I use the http/txt/json info plugins on their default ports, open to everybody (0.0.0.0). All of this happens as soon as I use the txtinfo and/or jsoninfo plugins. With only the httpinfo plugin there seem to be no issues; at least it runs for much longer.
Under gdb the issue reproduces within a minute, always with the same message. Without gdb, just using the -d1 switch, I get different messages after it has been running for several minutes.
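A note on the SIGPIPE in the backtrace above: gdb stops on SIGPIPE by default, even when the program itself ignores or handles that signal. The stop inside send() (called from write_data in olsrd_info.c) may therefore only mean that an info-plugin client disconnected before the reply was written, and it may be unrelated to the segfault seen without gdb. The following is a minimal sketch, not olsrd code, of how a send() to a vanished peer can be made to fail with EPIPE instead of raising a signal on Linux. The helper names send_nosigpipe and demo_closed_peer are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of olsrd): send len bytes on fd without
 * ever raising SIGPIPE. MSG_NOSIGNAL makes send() report a closed peer
 * as errno EPIPE instead of delivering the signal.
 * Returns 0 on success, or the errno value of the failing send().
 */
static int send_nosigpipe(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(fd, p, len, MSG_NOSIGNAL);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the same chunk */
            return errno;       /* e.g. EPIPE when the peer is gone */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Demo: create a connected socket pair, close one end, then write to the
 * other. Returns the error reported by send_nosigpipe(), or -1 if the
 * socket pair could not be created.
 */
static int demo_closed_peer(void)
{
    int sv[2];
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    close(sv[1]);                               /* the "client" goes away */
    err = send_nosigpipe(sv[0], "hello\n", 6);  /* write to the dead peer */
    close(sv[0]);
    return err;
}
```

Calling demo_closed_peer() from a small main() shows the write failing with an error code while the process keeps running. To get past the stop inside gdb without changing any code, `handle SIGPIPE nostop noprint pass` tells gdb not to halt on that signal, so the session can continue until the actual crash.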
Maybe it still has something to do with https://github.com/OLSR/olsrd/issues/44?
can you run a debug build with valgrind?
The devices where we run olsrd are Ubiquiti EdgeRouter X series with a mipsel CPU, running Debian (7.11 wheezy; jessie support is in the works but won't be released soon, it seems). So I checked Debian's package repository for wheezy, and there is no valgrind build for mips/mipsel: https://packages.debian.org/wheezy/valgrind
I tried the one from Debian jessie (8.10) instead, but:
dpkg: dependency problems prevent configuration of valgrind:
valgrind depends on libc6 (>= 2.16); however:
Version of libc6:mipsel on system is 2.13-38+deb7u11.
Then I took the recent valgrind source and tried cross-compiling it the way I do with olsrd, but that fails with:
priv/guest_mips_helpers.c: In function ‘mips_dirtyhelper_rdhwr’:
priv/guest_mips_helpers.c:439: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:443: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:447: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:451: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:455: error: expected string literal before ‘)’ token
make[3]: *** [priv/libvex_mips32_linux_a-guest_mips_helpers.o] Error 1
So I would say I can't run valgrind... any suggestions?
Edit: from valgrind's README.mips:
Limitations
-----------
- Some gdb tests will fail when gdb (GDB) older than 7.5 is used and gdb is
not compiled with '--with-expat=yes'.
- You can not compile tests for DSP ASE if you are using gcc (GCC) older
then 4.6.1 due to a bug in the toolchain.
- Older GCC may have issues with some inline assembly blocks. Get a toolchain
based on newer GCC versions, if possible.
On the router running olsrd:
onetrix@test-router:~$ gdb --version
GNU gdb (GDB) 7.4.1-debian
Cross-compile toolchain used:
onetrix@debian7dev:~/valgrind/valgrind-3.13.0$ $CC --version
mipsel-linux-gnu-gcc (Debian 4.4.5-8) 4.4.5
Closing since this issue has not been updated for quite some time. OLSRd is currently running here on many ER-X devices without known issues. If you have a recent report or insight, feel free to reopen.
While running current master, this happens after about 10-20 minutes:
Trying to debug in gdb, it runs for no more than 1-2 minutes: