OLSR / olsrd

OLSR.org main repository - olsrd v1 - maintained by Freifunk Berlin

munmap_chunk(): invalid pointer #50

Closed vchrizz closed 2 years ago

vchrizz commented 6 years ago

while running current master, this happens after about 10-20 minutes:

TC:   chg edge entry 78.41.118.231 > 78.41.119.87, cost (0.497/1.000) 2.008
TC:   chg edge entry 78.41.118.231 > 193.238.156.166, cost (1.000/1.000) 1.000
*** glibc detected *** /usr/sbin/olsrd: munmap_chunk(): invalid pointer: 0x00a9e2f0 ***
Received signal Aborted - shutting down
Deleting all routes...
RIB: del prefix 78.41.112.3/32 from 78.41.112.3
TC: del edge entry 78.41.112.3 > 78.41.112.40, cost (0.815/1.000) 1.226
...
TC: del edge entry 193.238.158.254 > 78.41.113.201, cost (1.000/1.000) 1.000
Closing sockets...
Closing plugins...
Restoring network state
Free all memory...
*** glibc detected *** /usr/sbin/olsrd: munmap_chunk(): invalid pointer: 0x00a9e300 ***
Received signal Aborted - shutting down

trying to debug in gdb, it runs for no more than 1-2 minutes:

       *** olsr.org - pre-0.9.7-git_2158276-hash_25fd74c507b25572b2c38787c39ddb5d ***

--- 10:53:37.276781 ---------------------------------------------------- LINKS

IP address       hyst         LQ       ETX
193.238.159.151  0.000  0.748/1.000    1.335
193.238.158.160  0.000  0.732/1.000    1.363

--- 10:53:37.277110 ------------------------------------------------ NEIGHBORS

     IP address Hyst    LQ      ETX     SYM   MPR   MPRS  will
193.238.158.160 0.000   0.732/1.000     1.363   YES   YES   NO    3
193.238.159.151 0.000   0.748/1.000     1.335   YES   NO    NO    3

--- 10:53:37.277437 ----------------------- TWO-HOP NEIGHBORS

IP addr (2-hop)  IP addr (1-hop)  Total cost
193.238.158.160  193.238.159.151  2.335
78.41.119.97     193.238.158.160  2.363
193.238.159.151  193.238.158.160  2.363

Program received signal SIGPIPE, Broken pipe.
0x77e93d78 in send () from /lib/mipsel-linux-gnu/libc.so.6
(gdb) bt
#0  0x77e93d78 in send () from /lib/mipsel-linux-gnu/libc.so.6
#1  0x77d344f4 in write_data (unused=0x0) at olsrd_info.c:377
#2  0x00459cec in walk_timers (last_run=0x4a59c0) at src/scheduler.c:711
#3  0x004593d0 in olsr_scheduler () at src/scheduler.c:559
#4  0x0043c9f4 in main (argc=7, argv=0x7fffe914) at src/main.c:775
(gdb)

vchrizz commented 6 years ago

trying to debug this further: running with -d1 again, it runs for some minutes and then:

TC:   chg edge entry 193.238.156.128 > 193.238.159.240, cost (1.000/1.000) 1.000
Received signal Segmentation fault - shutting down
Deleting all routes...
RIB: del prefix 78.41.113.2/32 from 78.41.113.2
...
TC: del edge entry 193.238.158.254 > 78.41.113.201, cost (1.000/1.000) 1.000
Closing sockets...
*** glibc detected *** /usr/sbin/olsrd: double free or corruption (out): 0x004a5280 ***
Received signal Aborted - shutting down

^CReceived signal Interrupt - shutting down

and running in gdb again, it runs for less than a minute:

TC:   chg edge entry 78.41.112.70 > 193.238.159.100, cost (1.000/1.000) 1.000

Program received signal SIGPIPE, Broken pipe.
0x77e93d78 in send () from /lib/mipsel-linux-gnu/libc.so.6
(gdb) bt
#0  0x77e93d78 in send () from /lib/mipsel-linux-gnu/libc.so.6
#1  0x77d304e4 in write_data (unused=0x0) at olsrd_info.c:377
#2  0x00459cbc in walk_timers (last_run=0x4a59c0) at src/scheduler.c:711
#3  0x004593a0 in olsr_scheduler () at src/scheduler.c:559
#4  0x0043c9f4 in main (argc=7, argv=0x7fffe914) at src/main.c:775
(gdb)

typically i use the httpinfo/txtinfo/jsoninfo plugins on their default ports, open to everybody (0.0.0.0). all of this happens as soon as i use the txtinfo and/or jsoninfo plugins. using only the httpinfo plugin there seem to be no issues; at least it runs for much longer.

in gdb the issue reproduces within a minute, always with the same message. without gdb, just using the -d1 switch, i get different messages after it has been running for several minutes.
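A note on the gdb runs: gdb stops on SIGPIPE by default, even when the program being debugged ignores that signal, so the `send ()` stops in the backtraces may not be the crash itself. Assuming olsrd ignores SIGPIPE or otherwise handles a failed `send()` (an assumption, not confirmed in this thread), the write to a disconnected info-plugin client would simply fail with `EPIPE` outside gdb. The sketch below is not olsrd code; `send_after_peer_close` is a hypothetical helper that reproduces the situation with a local socket pair.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of olsrd): send on a stream socket whose
 * peer has already closed. Returns the errno set by send(), 0 if send()
 * unexpectedly succeeded, or -1 if the socket pair could not be created.
 */
int send_after_peer_close(void)
{
    int sv[2];
    const char msg[] = "info plugin reply";
    int err = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Assumption: the daemon ignores SIGPIPE, so a broken pipe is
     * reported through errno instead of terminating the process. */
    signal(SIGPIPE, SIG_IGN);

    /* The "client" disconnects before the reply is written. */
    close(sv[1]);

    if (send(sv[0], msg, sizeof(msg), 0) < 0)
        err = errno;

    close(sv[0]);
    return err;
}
```

Calling `send_after_peer_close()` from a small `main` on Linux should return `EPIPE`. To keep gdb running past these stops so that the actual abort or segfault is reached, `handle SIGPIPE nostop noprint pass` can be entered at the gdb prompt before `run`.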

maybe it still has something to do with https://github.com/OLSR/olsrd/issues/44 ?

fhuberts commented 6 years ago

can you run a debug build with valgrind?

vchrizz commented 6 years ago

the devices where we run olsrd are ubiquiti edgerouter x series with a mipsel cpu, running debian (7.11 wheezy; jessie support is in the works, but it seems it won't be released soon). i checked debian's package repository for wheezy and there is no valgrind build for mips/mipsel (https://packages.debian.org/wheezy/valgrind). i tried the one from debian jessie (8.10) instead, but:

dpkg: dependency problems prevent configuration of valgrind:
 valgrind depends on libc6 (>= 2.16); however:
  Version of libc6:mipsel on system is 2.13-38+deb7u11.

then i took the recent valgrind source and tried cross-compiling it like i do with olsrd, but that fails with:

priv/guest_mips_helpers.c: In function ‘mips_dirtyhelper_rdhwr’:
priv/guest_mips_helpers.c:439: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:443: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:447: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:451: error: expected string literal before ‘)’ token
priv/guest_mips_helpers.c:455: error: expected string literal before ‘)’ token
make[3]: *** [priv/libvex_mips32_linux_a-guest_mips_helpers.o] Error 1

so i would say i can't run valgrind... any suggestions?

edit: from valgrind README.mips

Limitations
-----------
- Some gdb tests will fail when gdb (GDB) older than 7.5 is used and gdb is
  not compiled with '--with-expat=yes'.
- You can not compile tests for DSP ASE if you are using gcc (GCC) older
  then 4.6.1 due to a bug in the toolchain.
- Older GCC may have issues with some inline assembly blocks. Get a toolchain
  based on newer GCC versions, if possible.

on router running olsrd:
onetrix@test-router:~$ gdb --version
GNU gdb (GDB) 7.4.1-debian

crosscompile toolchain used:
onetrix@debian7dev:~/valgrind/valgrind-3.13.0$ $CC --version
mipsel-linux-gnu-gcc (Debian 4.4.5-8) 4.4.5

mathiashro commented 2 years ago

Closing since this issue has not been updated for quite some time. OLSRd is currently running here on many ER-X devices without known issues. If you have a recent report or insight, feel free to reopen.