Closed jech closed 10 years ago
OK, I suppose this is due to neighbor entries timing out prematurely. odhcpd internally watches the kernel neighbor cache to populate (and depopulate) its internal state of which client is on which interface. If it doesn't (yet) know the destination interface, it should try to reach the client on every interface except the originating one; maybe something is broken there. I will have a look, but chances are I'm not going to be able to do much about it for the next 2 weeks. Sorry.
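To make the intended fallback concrete, here is a minimal sketch of the interface selection described above. It is not odhcpd's actual code: the function name, the `neighbor_iface` mapping, and the interface list are all hypothetical and only illustrate the logic.

```python
from typing import Dict, List


def pick_output_interfaces(
    dest: str,
    origin: str,
    interfaces: List[str],
    neighbor_iface: Dict[str, str],
) -> List[str]:
    """Return the interfaces on which a relayed packet for `dest` should be tried.

    `neighbor_iface` stands in for the relay's internal view of the kernel
    neighbor cache (address -> interface). All names here are hypothetical.
    """
    known = neighbor_iface.get(dest)
    if known is not None:
        # Destination already located: relay only if it is on another link.
        return [] if known == origin else [known]
    # Destination unknown: try every interface except the originating one.
    return [iface for iface in interfaces if iface != origin]
```

If an entry is dropped from the mapping too early, the second branch is the only thing that lets the relay find the client again, so a bug there would match the symptom.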
Unfortunately, it looks like the outage is stable under load -- we never recover. The router's neighbour table (ip -6 neigh show) looks like this:
client-address dev wlan1 FAILED
client-address dev wlan0 lladdr client-MAC STALE
router-address dev eth1 lladdr router-MAC router REACHABLE
other-address dev wlan0 lladdr client-MAC REACHABLE
client-address dev eth0.1 FAILED
Note that the client has two IPv6 addresses (written client-address and other-address), due to the use of privacy extensions. The MAC-derived address (other-address) is working fine, while the privacy address is marked as STALE.
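For anyone wanting to check their own router for the same pattern, here is a rough sketch that parses the output of ip -6 neigh show and lists addresses that have entries on more than one device. It assumes each line has the form "ADDR dev DEV ... STATE" with the state as the last field, as in the table above; the function names are made up for illustration.

```python
from collections import defaultdict

# The table quoted above, with the same placeholder names.
SAMPLE = """\
client-address dev wlan1 FAILED
client-address dev wlan0 lladdr client-MAC STALE
router-address dev eth1 lladdr router-MAC router REACHABLE
other-address dev wlan0 lladdr client-MAC REACHABLE
client-address dev eth0.1 FAILED
"""


def parse_neigh(text):
    """Map each address to a list of (device, state) pairs."""
    table = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Expect at least: ADDR dev DEV STATE
        if len(fields) < 4 or fields[1] != "dev":
            continue
        table[fields[0]].append((fields[2], fields[-1]))
    return dict(table)


def addresses_on_multiple_devices(table):
    """Keep only addresses that have entries on more than one device."""
    return {
        addr: entries
        for addr, entries in table.items()
        if len({dev for dev, _ in entries}) > 1
    }


if __name__ == "__main__":
    for addr, entries in addresses_on_multiple_devices(parse_neigh(SAMPLE)).items():
        print(addr, entries)
```

On the sample, only client-address is reported, with its FAILED entries on wlan1 and eth0.1 next to the STALE one on wlan0.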
--jch
OK, as you can see from the latest commits, I'm in the middle of rewriting this now. The rewrite is also motivated by the underspecific packet socket draining performance from general forwarding on unrelated interfaces. Hope to have something better ready soon.
Take your time, Steve. I'll be adding to this report as I collect more data, but please don't take it as trying to get you to do stuff.
-- Juliusz
Should be fixed finally.
Nope. Same symptom — launching an IPv6 DHT causes all of my IPv6 connections to hang. Here's the state of the router's neighbour table:
2a01:e34:ec22:84a0:9d39:4cef:e851:89fe dev wlan0 FAILED
2a01:e34:ec22:84a0:9d39:4cef:e851:89fe dev wlan1 lladdr 24:77:03:1a:db:64 STALE
fe80::2677:3ff:fe1a:db64 dev wlan1 lladdr 24:77:03:1a:db:64 STALE
2a01:e34:ec22:84a0::1 dev wlan1 FAILED
fe80::e246:9aff:fe4e:9177 dev eth1 lladdr e0:46:9a:4e:91:77 STALE
2a01:e34:ec22:84a0:9d39:4cef:e851:89fe dev eth0.1 FAILED
2a01:e34:ec22:84a0:9d39:4cef:e851:89fe dev eth1 INCOMPLETE
2a01:e34:ec22:84a0::1 dev eth0.1 FAILED
2a01:e34:ec22:84a0::1 dev eth1 lladdr 00:24:d4:bf:3a:8f router STALE
fe80::224:d4ff:febf:3a8f dev eth1 lladdr 00:24:d4:bf:3a:8f router REACHABLE
This is with version 2014-08-23-24452e1e3e9adfd9d8e183db1aa589f77727f5a7
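As a reading aid for the table above: the fe80:: entries are the MAC-derived (modified EUI-64) link-local addresses of the MACs listed next to them, whereas the interface identifier 9d39:4cef:e851:89fe does not follow that pattern, which is consistent with it being a privacy address. A small sketch of the derivation (the helper name is hypothetical):

```python
import ipaddress


def eui64_link_local(mac: str) -> ipaddress.IPv6Address:
    """Build the fe80::/64 address with a modified EUI-64 interface identifier."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC such as 24:77:03:1a:db:64")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC.
    interface_id = bytes(octets[:3] + [0xFF, 0xFE] + octets[3:])
    prefix = bytes([0xFE, 0x80, 0, 0, 0, 0, 0, 0])
    return ipaddress.IPv6Address(prefix + interface_id)


if __name__ == "__main__":
    print(eui64_link_local("24:77:03:1a:db:64"))
```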
Hello,
Was this ever fixed? I believe I have a similar issue: I have the same config as jech. I'll try to take some captures to see what's going on.
I'm running 2014-08-23-24452e1e3e9adfd9d8e183db1aa589f77727f5a7 on Barrier Breaker.
I'm no longer able to reproduce this on Chaos Calmer. Perhaps it was fixed by the kernel upgrade?
Nope, the issue is back :-/
I'm running odhcpd (version 2014-06-18-82f3096351911d8c4f3b38e7a5bbeaf75938b6b8) on an OpenWRT box with three interfaces in the lan network (unbridged). Since my ISP only gives me a /64, I'm using relay mode.
This usually works, but after there has been no IPv6 traffic for a while, I'm getting timeouts on the order of a dozen seconds or so. It's difficult to reproduce, but the one packet capture that I managed to get seemed to indicate that the OpenWRT box was sending neighbour solicitations over the wan interface rather than the wifi0 one.
I'll provide more info if I can manage to capture a better dump.
--jch