freifunk-berlin / firmware

DEPRECATED: Build system for Berlin firmware. Please use the pinned falter-repos instead
https://berlin.freifunk.net
GNU General Public License v3.0

Olsrd: Frequently segfaults on startup #573

Closed vasyugan closed 5 years ago

vasyugan commented 6 years ago

Since I upgraded my routers to Hedy, olsrd seems to be quite unstable. Often I find no entry at all under OLSR/Neighbours, which is sometimes remedied by going to Services/OLSRv4 and just clicking "save and apply".

Today, this did not help. I get the following console output:


`Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.728061]
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.728061] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 004688e8
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.734991] epc = 00417249 in olsrd[400000+38000]
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.739735] ra  = 0041dc4f in olsrd[400000+38000]
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.744501]
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/rp_filter
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Adding interface wlan0-adhoc-2
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: New main address: 10.22.16.4
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 successfully started
Jun 14 07:59:06 16-4-uferwerk-17a olsrd: /etc/init.d/olsrd: olsrd_setup_smartgw_rules() Notice: Inserting firewall rules for SmartGateway
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 08:00:01 16-4-uferwerk-17a ffp-collect: sleeping 279 seconds before upload...
Jun 14 08:00:03 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.355422]
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.355422] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00468afc
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.362579] epc = 00417249 in olsrd[400000+38000]
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.367095] ra  = 0041dc4f in olsrd[400000+38000]
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.371878]
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter

`Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.728061]
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.728061] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 004688e8
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.734991] epc = 00417249 in olsrd[400000+38000]
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.739735] ra  = 0041dc4f in olsrd[400000+38000]
Jun 14 07:59:02 16-4-uferwerk-17a kernel: [120969.744501]
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/rp_filter
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: Adding interface wlan0-adhoc-2
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: New main address: 10.22.16.4
Jun 14 07:59:04 16-4-uferwerk-17a olsrd[26792]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 successfully started
Jun 14 07:59:06 16-4-uferwerk-17a olsrd: /etc/init.d/olsrd: olsrd_setup_smartgw_rules() Notice: Inserting firewall rules for SmartGateway
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 08:00:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 08:00:01 16-4-uferwerk-17a ffp-collect: sleeping 279 seconds before upload...
Jun 14 08:00:03 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 08:01:04 16-4-uferwerk-17a olsrd[26792]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.355422]
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.355422] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00468afc
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.362579] epc = 00417249 in olsrd[400000+38000]
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.367095] ra  = 0041dc4f in olsrd[400000+38000]
Jun 14 08:01:04 16-4-uferwerk-17a kernel: [121091.371878]
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/rp_filter
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: Adding interface wlan0-adhoc-2
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: New main address: 10.22.16.4
Jun 14 08:01:06 16-4-uferwerk-17a olsrd[27192]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 successfully started
Jun 14 08:01:07 16-4-uferwerk-17a olsrd: /etc/init.d/olsrd: olsrd_setup_smartgw_rules() Notice: Inserting firewall rules for SmartGateway
Jun 14 08:02:10 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 08:04:11 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 08:05:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 08:05:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 08:05:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 08:05:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 08:05:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 08:06:08 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 08:07:30 16-4-uferwerk-17a olsrd[27192]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 08:07:30 16-4-uferwerk-17a olsrd[27192]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 08:07:30 16-4-uferwerk-17a olsrd[27192]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 14 08:07:30 16-4-uferwerk-17a olsrd[27192]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 08:07:30 16-4-uferwerk-17a olsrd[27192]: OLSR: sendto IPv4 Bad file descriptor
Jun 14 08:07:30 16-4-uferwerk-17a kernel: [121478.102401]
Jun 14 08:07:30 16-4-uferwerk-17a kernel: [121478.102401] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00468824
Jun 14 08:07:30 16-4-uferwerk-17a kernel: [121478.109396] epc = 00417249 in olsrd[400000+38000]
Jun 14 08:07:30 16-4-uferwerk-17a kernel: [121478.114069] ra  = 0041dc4f in olsrd[400000+38000]
Jun 14 08:07:30 16-4-uferwerk-17a kernel: [121478.118841]
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Writing '1' (was 1) to /proc/sys/net/ipv4/ip_forward
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/tunl0/rp_filter
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/all/rp_filter
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Writing '0' (was 1) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Writing '0' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/rp_filter
Jun 14 08:07:34 16-4-uferwerk-17a olsrd[28344]: Adding interface wlan0-adhoc-2

SvenRoederer commented 6 years ago

Can you check the log you provided? There are two lines reading "Jun 14 07:59:02 16-4-uferwerk-17a olsrd[2385]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects".

What was happening before "Jun 14 07:59:02" ?

vasyugan commented 6 years ago

These would be the first entries originating from olsrd. Here are the preceding lines (going back 30 minutes):

Jun 14 07:20:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:20:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:20:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:20:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:20:01 16-4-uferwerk-17a ffp-collect: sleeping 279 seconds before upload...
Jun 14 07:20:43 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:22:44 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:24:34 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:24:41 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953300.cff.xml...
Jun 14 07:24:42 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953360.cff.xml...
Jun 14 07:24:43 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953420.cff.xml...
Jun 14 07:24:44 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953480.cff.xml...
Jun 14 07:24:45 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953540.cff.xml...
Jun 14 07:24:46 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953601.cff.xml...
Jun 14 07:24:47 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953660.cff.xml...
Jun 14 07:24:48 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953720.cff.xml...
Jun 14 07:24:49 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953780.cff.xml...
Jun 14 07:24:50 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953840.cff.xml...
Jun 14 07:25:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:25:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:25:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:25:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:25:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:26:38 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:27:25 16-4-uferwerk-17a odhcpd[822]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 14 07:28:41 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:30:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:30:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:30:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:30:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:30:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:30:01 16-4-uferwerk-17a ffp-collect: sleeping 279 seconds before upload...
Jun 14 07:30:37 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:32:36 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:34:27 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:34:41 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953900.cff.xml...
Jun 14 07:34:42 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528953960.cff.xml...
Jun 14 07:34:43 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954020.cff.xml...
Jun 14 07:34:44 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954080.cff.xml...
Jun 14 07:34:45 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954140.cff.xml...
Jun 14 07:34:46 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954201.cff.xml...
Jun 14 07:34:47 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954260.cff.xml...
Jun 14 07:34:48 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954320.cff.xml...
Jun 14 07:34:49 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954380.cff.xml...
Jun 14 07:34:50 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954440.cff.xml...
Jun 14 07:35:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:35:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:35:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:35:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:35:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:36:11 16-4-uferwerk-17a odhcpd[822]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 14 07:36:17 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:38:07 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:40:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:40:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:40:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:40:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:40:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:40:01 16-4-uferwerk-17a ffp-collect: sleeping 279 seconds before upload...
Jun 14 07:40:14 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:42:22 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:44:24 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:44:41 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954500.cff.xml...
Jun 14 07:44:42 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954560.cff.xml...
Jun 14 07:44:43 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954620.cff.xml...
Jun 14 07:44:44 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954680.cff.xml...
Jun 14 07:44:45 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954740.cff.xml...
Jun 14 07:44:46 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954801.cff.xml...
Jun 14 07:44:47 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954860.cff.xml...
Jun 14 07:44:48 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954920.cff.xml...
Jun 14 07:44:49 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528954980.cff.xml...
Jun 14 07:44:50 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955040.cff.xml...
Jun 14 07:45:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:45:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:45:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:45:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:45:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:45:09 16-4-uferwerk-17a odhcpd[822]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 14 07:46:20 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:48:16 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:49:05 16-4-uferwerk-17a odhcpd[822]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 14 07:50:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:50:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:50:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:50:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:50:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:50:01 16-4-uferwerk-17a ffp-collect: sleeping 279 seconds before upload...
Jun 14 07:50:12 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:52:18 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:54:11 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:54:41 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955100.cff.xml...
Jun 14 07:54:42 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955160.cff.xml...
Jun 14 07:54:43 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955220.cff.xml...
Jun 14 07:54:44 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955280.cff.xml...
Jun 14 07:54:45 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955340.cff.xml...
Jun 14 07:54:46 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955401.cff.xml...
Jun 14 07:54:47 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955460.cff.xml...
Jun 14 07:54:48 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955520.cff.xml...
Jun 14 07:54:49 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955580.cff.xml...
Jun 14 07:54:50 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1528955640.cff.xml...
Jun 14 07:55:00 16-4-uferwerk-17a dnsmasq[1491]: read /etc/hosts - 4 addresses
Jun 14 07:55:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/olsr - 3 addresses
Jun 14 07:55:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/odhcpd - 0 addresses
Jun 14 07:55:00 16-4-uferwerk-17a dnsmasq[1491]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 14 07:55:00 16-4-uferwerk-17a dnsmasq-dhcp[1491]: read /etc/ethers - 0 addresses
Jun 14 07:56:07 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:57:30 16-4-uferwerk-17a ffwizard: checking for root-password ...
Jun 14 07:58:05 16-4-uferwerk-17a odhcp6c[1234]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 14 07:58:50 16-4-uferwerk-17a odhcpd[822]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 14 07:58:53 16-4-uferwerk-17a dropbear[26720]: Child connection from 192.168.30.134:40344
Jun 14 07:58:54 16-4-uferwerk-17a dropbear[26720]: Pubkey auth succeeded for 'root' with key md5 32:0a:c6:63:15:22:69:e3:05:81:63:a6:6e:5f:d5:f2 from 192.168.30.134:40344
vasyugan commented 6 years ago

I am seeing this on several of my routers, usually after a fresh reboot. The routers where I am seeing this behaviour are uplinks, providing internet access themselves. We have a roaming network here based on batman, as described at https://wiki.freifunk-potsdam.de/Roaming

vasyugan commented 6 years ago

I never saw this with Kathleen, only with Hedy.

SvenRoederer commented 6 years ago

In your initial post I can see that olsrd was restarted. Can you explain why this happened? Was it caused by you?

There is probably some relation to #405?

vasyugan commented 6 years ago

I don't know. If olsrd got restarted, it is probably because I went to Services/OLSRv4 and just hit "save & apply", since this sometimes magically fixes things. Right now, the box seems to have crashed completely: I saw on another Freifunk node that the ETX to this particular node was at 0.000 even though the SNR is really good. So I checked the Freifunk/Neighbours page of this node and found it empty, with no neighbours listed. Next I tried to log on to it to investigate further and got a "connection refused" message in my browser, and I also found that it had even stopped responding to pings. The last entries I can find on my log server look entirely unsuspicious:

Jun 23 06:30:53 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:32:50 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:33:14 16-4-uferwerk-17a odhcpd[828]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 23 06:34:55 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:34:56 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529727900.cff.xml...
Jun 23 06:34:57 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529727960.cff.xml...
Jun 23 06:34:58 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728020.cff.xml...
Jun 23 06:34:59 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728080.cff.xml...
Jun 23 06:35:00 16-4-uferwerk-17a dnsmasq[2049]: read /etc/hosts - 4 addresses
Jun 23 06:35:00 16-4-uferwerk-17a dnsmasq[2049]: read /tmp/hosts/olsr - 3 addresses
Jun 23 06:35:00 16-4-uferwerk-17a dnsmasq[2049]: read /tmp/hosts/odhcpd - 0 addresses
Jun 23 06:35:00 16-4-uferwerk-17a dnsmasq[2049]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 23 06:35:00 16-4-uferwerk-17a dnsmasq-dhcp[2049]: read /etc/ethers - 0 addresses
Jun 23 06:35:00 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728140.cff.xml...
Jun 23 06:35:01 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728200.cff.xml...
Jun 23 06:35:02 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728260.cff.xml...
Jun 23 06:35:03 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728320.cff.xml...
Jun 23 06:35:04 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728380.cff.xml...
Jun 23 06:35:05 16-4-uferwerk-17a ffp-collect: uploading /tmp/collstat/1529728440.cff.xml...
Jun 23 06:36:56 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:38:52 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:40:00 16-4-uferwerk-17a dnsmasq[2049]: read /etc/hosts - 4 addresses
Jun 23 06:40:00 16-4-uferwerk-17a dnsmasq[2049]: read /tmp/hosts/olsr - 3 addresses
Jun 23 06:40:00 16-4-uferwerk-17a dnsmasq[2049]: read /tmp/hosts/odhcpd - 0 addresses
Jun 23 06:40:00 16-4-uferwerk-17a dnsmasq[2049]: read /tmp/hosts/dhcp.cfg02411c - 2 addresses
Jun 23 06:40:00 16-4-uferwerk-17a dnsmasq-dhcp[2049]: read /etc/ethers - 0 addresses
Jun 23 06:40:00 16-4-uferwerk-17a ffp-collect: sleeping 295 seconds before upload...
Jun 23 06:40:27 16-4-uferwerk-17a odhcpd[828]: Using a RA lifetime of 0 seconds on br-dhcp
Jun 23 06:40:44 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:42:36 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)
Jun 23 06:44:43 16-4-uferwerk-17a odhcp6c[1336]: Failed to send DHCPV6 message to ff02::1:2 (Permission denied)

vasyugan commented 6 years ago

I don't know if it helps but here is a page fault I find in the log prior to the latest crash, which again happened when I was trying to visit the status page:

`

Jun 23 16:25:20 16-5-uferwerk-17b olsrd[3338]: New main address: 10.22.16.5
Jun 23 16:25:20 16-5-uferwerk-17b olsrd[3338]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 successfully started
Jun 23 16:25:21 16-5-uferwerk-17b hostapd: wlan0-2: STA b0:c1:9e:58:b8:f3 IEEE 802.11: authenticated
Jun 23 16:25:21 16-5-uferwerk-17b hostapd: wlan0-2: STA b0:c1:9e:58:b8:f3 IEEE 802.11: associated (aid 2)
Jun 23 16:25:21 16-5-uferwerk-17b hostapd: wlan0-2: AP-STA-CONNECTED b0:c1:9e:58:b8:f3
Jun 23 16:25:21 16-5-uferwerk-17b hostapd: wlan0-2: STA b0:c1:9e:58:b8:f3 RADIUS: starting accounting session 145EECDDC72E4E1B
Jun 23 16:25:21 16-5-uferwerk-17b olsrd: /etc/init.d/olsrd: olsrd_setup_smartgw_rules() Notice: Inserting firewall rules for SmartGateway
Jun 23 16:25:21 16-5-uferwerk-17b olsrd_hotplug: [OK] ifup: 'wireless0' => 'wlan0-adhoc-2'
Jun 23 16:25:21 16-5-uferwerk-17b olsrd_hotplug: [OK] ifup: 'wireless0' => 'wlan0-adhoc-2'
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: WARNING: normally if you use --mssfix and/or --fragment, you should also set --tun-mtu 1500 (currently it is 1300)
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: TCP/UDP: Preserving recently used remote address: [AF_INET]94.16.122.222:1195
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: Socket Buffers: R=[163840->163840] S=[163840->163840]
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: UDP link local: (not bound)
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: UDP link remote: [AF_INET]94.16.122.222:1195
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: TLS: Initial packet from [AF_INET]94.16.122.222:1195, sid=a7265364 7a8e75e6
Jun 23 16:25:22 16-5-uferwerk-17b olsrd[2518]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 23 16:25:22 16-5-uferwerk-17b kernel: [ 70.146472]
Jun 23 16:25:22 16-5-uferwerk-17b kernel: [ 70.146472] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00000000
Jun 23 16:25:22 16-5-uferwerk-17b kernel: [ 70.154938] epc = 77d62638 in libc.so[77d3a000+92000]
Jun 23 16:25:22 16-5-uferwerk-17b kernel: [ 70.160132] ra = 004180ad in olsrd[400000+38000]
Jun 23 16:25:22 16-5-uferwerk-17b kernel: [ 70.164921]
Jun 23 16:25:22 16-5-uferwerk-17b hostapd: wlan0-dhcp-2: STA e4:e4:ab:18:96:aa IEEE 802.11: authenticated
Jun 23 16:25:22 16-5-uferwerk-17b hostapd: wlan0-dhcp-2: STA e4:e4:ab:18:96:aa IEEE 802.11: associated (aid 5)
Jun 23 16:25:22 16-5-uferwerk-17b hostapd: wlan0-dhcp-2: AP-STA-CONNECTED e4:e4:ab:18:96:aa
Jun 23 16:25:22 16-5-uferwerk-17b hostapd: wlan0-dhcp-2: STA e4:e4:ab:18:96:aa RADIUS: starting accounting session 99F0F12009183B8B
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)[2737]: VERIFY OK: depth=1, C=DE, ST=BRB, L=Potsdam, O=Freifunk Potsdam e.V., CN=Freifunk Potsdam e.V. CA, emailAddress=info@freifunk-potsdam.de
Jun 23 16:25:22 16-5-uferwerk-17b openvpn(pdmvpn)

`

vasyugan commented 6 years ago

This morning the same again: I opened Freifunk/Neighbours, found it to contain no entries, and clicked "Administration" to log on. This caused the router to crash, so that it doesn't even respond to pings.

pmelange commented 6 years ago

What device are you using?

Is there a statistics page that we can look at? For example, I am wondering if the router is running out of memory or if the conntrack connection limit is full.

Have you tried disabling local statistics (rrd)? Did you customize the router config in any way?

When pings don't work, are you connected wirelessly or with a cable? Have you tried ipv6 link local pings?

There is also the possibility that the flash chip is corrupt or some other annoying hardware issue. But first I would like to rule out the memory and conntrack ideas before heading down this path.
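Both suspicions can be checked from an SSH session while the router is still reachable. The following is only a minimal sketch: it assumes the conntrack module exposes the usual `nf_conntrack_count` and `nf_conntrack_max` files under `/proc/sys/net/netfilter`, and the helper name `usage_percent` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: integer percentage of used/limit.
usage_percent() {
    used=$1
    limit=$2
    if [ "$limit" -le 0 ]; then
        echo "n/a"
        return 1
    fi
    echo $(( used * 100 / limit ))
}

# Conntrack: current entries vs. configured maximum.
# The paths are an assumption; they only exist when nf_conntrack is loaded.
ct_dir=/proc/sys/net/netfilter
if [ -r "$ct_dir/nf_conntrack_count" ] && [ -r "$ct_dir/nf_conntrack_max" ]; then
    count=$(cat "$ct_dir/nf_conntrack_count")
    max=$(cat "$ct_dir/nf_conntrack_max")
    echo "conntrack: $count of $max ($(usage_percent "$count" "$max")%)"
fi

# Memory: totals as reported by the kernel, in kB.
if [ -r /proc/meminfo ]; then
    grep -E '^(MemTotal|MemFree|MemAvailable):' /proc/meminfo
fi
```

A conntrack percentage close to 100, or very little free memory shortly before the node stops responding, would support one of the two ideas. Neither value explains the crash by itself.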

vasyugan commented 6 years ago

On 24.06.2018 at 13:54, pmelange wrote:

What device are you using?

This is about a GL Inet AR-150, but I am occasionally seeing the same on a TP Link CPE 210.

Is there a statistics page that we can look at? For example, I am wondering if the router is running out of memory or if the conntrack connection limit is full.

For this node, look at https://monitor.freifunk-potsdam.de/grafana/d/000000008/stat-node-overview?var-hostname=16-4-uferwerk-17a

For the entire site, see https://monitor.freifunk-potsdam.de/grafana/d/000000034/loc-uferwerk-werder

Have you tried disabling local statistics (rrd)?

It is disabled because the Potsdam community uses Grafana instead.

Did you customize the router config in any way?

Configured according to

https://wiki.freifunk-potsdam.de/Kathleen (without VPN)
https://wiki.freifunk-potsdam.de/Roaming
https://wiki.freifunk-potsdam.de/StatusUpdates

That is, followed the standard configuration for the Freifunk Potsdam community, configured B.A.T.M.A.N. based roaming, again as done in Potsdam, and set up Grafana data collection for the Freifunk Potsdam monitor.

When pings don't work, are you connected wirelessly or with a cable?

Cable, from the local lan, that provides the uplink.

Have you tried ipv6 link local pings?

There is also the possibility that the flash chip is corrupt or some other annoying hardware issue. But first I would like to rule out the memory and conntrack ideas before heading down this path.

Does the grafana page suffice for that?

-- Auto signature for you   

Johannes Rohr Senior Advisor, Climate Action, Russia

https://www.iwgia.org/en/ E-mail: jr@iwgia.org mailto:jr@iwgia.org Phone: +49-221-7392871 <tel:+492217392871> Skype:johannesrohr  <skype:johannesrohr?call>

https://www.iwgia.org/en/https://www.facebook.com/IWGIA/http://eepurl.com/cC6erXhttps://twitter.com/IWGIA

pmelange commented 6 years ago

Cable, from the local lan, that provides the uplink.

"Local lan" sounds to me like you are on the WAN side of the freifunk router. Is that right?

Have you tried ipv6 link local pings?

Does the grafana page suffice for that?

On the Grafana page, there are no statistics about the number of neighbors and their ETX. Also conntrack is not shown (unless it's the same as "Network Connections").

But it doesn't look like a memory issue.

Would you be able to connect to the serial console of the AR150 and log all the messages which go by? That might provide a bit more detail.

The big question is "What happened between 3:45 and 8:30?"

Just as the router goes offline, there is a spike in the load avg and CPU utilization. Why?

At 8:30 it seems that the router was restarted.

A log from the serial console would help a lot here.

vasyugan commented 6 years ago

I have now temporarily replaced the node with a TP-Link TL-WDR3600 v1. This box doesn't crash, but I see the same weird phenomena:

  1. uhttpd seems to be crashing, but I can still ssh to the box. After /etc/init.d/uhttpd restart, the web interface works again.
  2. The status pages for olsrd are completely empty (no neighbours, no routes, etc.), even though ps shows that olsrd is running. After visiting Services/OLSRv4 and hitting "save and apply" (without any changes), olsrd starts working properly.

So in all likelihood these issues are not related to any particular hardware. I have also witnessed this on TP-Link CPE 210s, all of which act as uplinks and servers in our roaming network. I never saw any of this with Kathleen.

vasyugan commented 6 years ago

On 24.06.2018 at 16:29, pmelange wrote:

Cable, from the local lan, that provides the uplink.

"Local lan" sounds to me like you are on the WAN side of the freifunk router. Is that right?

Yes

Have you tried ipv6 link local pings?

No, can try next time

Does the grafana page suffice for that?

On the Grafana page, there are no statistics about the number of neighbors and their ETX. Also conntrack is not shown (unless it's the same as "Network Connections").

Have to ask about that.

But it doesn't look like a memory issue.

Again, I see the empty olsr table on multiple devices.

Would you be able to connect to the serial console of the AR150 and log all the messages which go by? That might provide a bit more detail.

The big question is "What happened between 3:45 and 8:30?"

3:45 the device was rebooted via cron.

Just as the router goes offline, there is a spike in the load avg and CPU utilization. Why?

At 8:30 it seems that the router was restarted.

Yes, that was me, when it no longer responded to anything.

A log from the serial console would help a lot here.

@pmelange Can you advise how to access the serial console? Is there a way without cracking open the box and thus voiding the warranty?

vasyugan commented 6 years ago

So again, on the TP-Link WDR 4300 I also see olsrd crashing. From the log:

Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Tunnel tnl_0a16100b added, to 10.22.16.11
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Tunnel tnl_0a16100b removed, to -
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Tunnel tnl_0a16feac added, to 10.22.254.172
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Tunnel tnl_0a16feac removed, to -
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Tunnel tnl_0a16ff31 added, to 10.22.255.49
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Tunnel tnl_0a16ff31 removed, to -
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/all/send_redirects
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: Writing '1' (was 0) to /proc/sys/net/ipv4/conf/wlan0-adhoc-2/send_redirects
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: OLSR: sendto IPv4 Bad file descriptor
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: OLSR: sendto IPv4 Bad file descriptor
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.682114]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.682114] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00468858
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.690563] epc = 00417249 in olsrd[400000+38000]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.695410] ra = 0041dc4f in olsrd[400000+38000]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.700196]
Jun 24 17:51:52 16-24-uferwerk-16 procd: /etc/rc.d/S99vnstat

After that I ran /etc/init.d/olsrd restart three times. The first time, the segfault reoccurred. The second time it looked OK, but the status page remained empty. Only after the third time was the status page filled with entries.
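The manual "restart until the status page fills" procedure could be scripted as a stopgap. This is a rough sketch only, not a fix for the segfault. It assumes the olsrd txtinfo plugin is enabled and listening on 127.0.0.1:2006, that `nc` is installed, and that the request string `/link` returns the link table; all three may differ between builds. The function names are made up for illustration.

```shell
#!/bin/sh
# Count link entries in txtinfo output read from stdin.
# Assumption: a data row starts with an IPv4 address, header lines do not.
count_links() {
    grep -c -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+[[:space:]]' || true
}

# Restart olsrd until it reports at least one link, at most max_tries times.
restart_until_links() {
    max_tries=${1:-3}
    try=1
    while [ "$try" -le "$max_tries" ]; do
        /etc/init.d/olsrd restart
        sleep 30   # give olsrd time to hear from its neighbours
        links=$(echo /link | nc 127.0.0.1 2006 2>/dev/null | count_links)
        if [ "$links" -gt 0 ]; then
            echo "olsrd reports $links link(s) after $try restart(s)"
            return 0
        fi
        try=$(( try + 1 ))
    done
    echo "olsrd still reports no links after $max_tries restarts" >&2
    return 1
}
```

Calling `restart_until_links 3` mirrors the three manual restarts described above. The 30 second wait is a guess and may need tuning.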

vasyugan commented 6 years ago

So now I am seeing the issue on one TP Link CPE 210 which is connected to the same uplink router, and what I am seeing in the log is again

Jun 25 08:25:57 16-5-uferwerk-17b kernel: [17729.309320] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 004689c0

running /etc/init.d/olsrd restart makes it run again.
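For anyone trying to locate the crash, the kernel lines contain enough to compute offsets into the binary. In `olsrd[400000+38000]` the first number is the start address of the mapping and the second its size, both in hex. The sketch below does the subtraction for the `epc` and `ra` values from the earlier logs; the helper names are hypothetical.

```shell
#!/bin/sh
# Offset of an address inside a mapping; arguments are hex without the 0x prefix.
map_offset() {
    addr=$1
    base=$2
    printf '0x%x\n' $(( 0x$addr - 0x$base ))
}

# Succeeds if addr lies inside [base, base + size).
in_mapping() {
    addr=$1
    base=$2
    size=$3
    [ $(( 0x$addr )) -ge $(( 0x$base )) ] &&
        [ $(( 0x$addr )) -lt $(( 0x$base + 0x$size )) ]
}

map_offset 00417249 400000   # epc from the log, prints 0x17249
map_offset 0041dc4f 400000   # ra from the log, prints 0x1dc4f
```

With an unstripped olsrd binary of the same build, such offsets could be looked up with a tool like addr2line. Note that the faulting write addresses (for example 004689c0) are not inside the 0x38000-byte range the kernel prints for olsrd, so `in_mapping 004689c0 400000 38000` fails.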

vasyugan commented 6 years ago

Just now, after rebooting two other CPEs which are also uplinks but connected to a different router in another building, I found the same: the olsrd status pages were empty, and only after running /etc/init.d/olsrd restart did they begin to fill.

SvenRoederer commented 6 years ago

Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: OLSR: sendto IPv4 Bad file descriptor
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: OLSR: sendto IPv4 Bad file descriptor
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.682114]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.682114] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00468858
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.690563] epc = 00417249 in olsrd[400000+38000]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.695410] ra = 0041dc4f in olsrd[400000+38000]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.700196]

https://github.com/freifunk-berlin/firmware/issues/513 should demystify your segfault-issue.

But this does not explain why olsrd stops randomly. It is probably fixed in the recent olsrd v0.9.6.x. Have you tried master or SAm0815_experimental-branch?

vasyugan commented 6 years ago

On 25.06.2018 at 19:37, Sven Roederer wrote:

Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: olsr.org - 0.9.0.3-git_788312c-hash_c7f667fe7cf42baa389872842561b6c3 stopped
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: OLSR: sendto IPv4 Bad file descriptor
Jun 24 17:51:52 16-24-uferwerk-16 olsrd[2545]: OLSR: sendto IPv4 Bad file descriptor
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.682114]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.682114] do_page_fault(): sending SIGSEGV to olsrd for invalid write access to 00468858
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.690563] epc = 00417249 in olsrd[400000+38000]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.695410] ra = 0041dc4f in olsrd[400000+38000]
Jun 24 17:51:52 16-24-uferwerk-16 kernel: [ 60.700196]

https://github.com/freifunk-berlin/firmware/issues/513 should demystify your segfault-issue.

But this does not explain why olsrd stops randomly. It is probably fixed in the recent olsrd v0.9.6.x. Have you tried master or SAm0815_experimental-branch?

I have used the stable releases, hedy 1.0.0 and 1.0.1.

pmelange commented 5 years ago

This seems to be solved upstream (try out a development version of the firmware). If this is still an issue, please reopen.