Closed bernhardschmidt closed 10 years ago
Sorry, too fast. 1.8.5 sends from the wrong MAC (the old one) in the L2 header, but has the updated MAC in the RA option. At least in my tests here. So something has changed.
I just tested this case which was my usual problem from the past
It used to be
at this point, which broke all attached VMs horribly. The current behaviour is still ugly, since the box is sending traffic from a MAC address it doesn't own. Theoretically this might trigger security alerts, but in those cases one should probably set a fixed MAC anyway.
Hi Bernhard,
Do you know how to fix it?
On Sun, May 12, 2013 at 12:49 PM, Bernhard Schmidt <notifications@github.com
wrote:
Sorry, too fast. 1.8.5 sends from the wrong MAC (the old one) in the L2 header, but has the updated MAC in the RA option. At least in my tests here. So something has changed.
I just tested this case which was my usual problem from the past
- create bridge vmbr0, it gets the MAC 8a:7c:c4:8a:95:5d and the LL address fe80::887c:c4ff:fe8a:955d
- add tap-vm1 with MAC fe:16:3e:7c:30:aa, MAC of vmbr0 changes as well
- start radvd
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
- add tap-vm2 with MAC fe:16:3e:7f:56:9f
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
- remove tap-vm1 (shutdown of guest), vmbr0 changes MAC to fe:16:3e:7f:56:9f
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7f:56:9f
It used to be
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
at this point, which broke all attached VMs horribly. The current behaviour is still ugly, since the box is sending traffic from a MAC address it doesn't own. Theoretically this might trigger security alerts, but in those cases one should probably set a fixed MAC anyway.
— Reply to this email directly or view it on GitHubhttps://github.com/reubenhwk/radvd/issues/9#issuecomment-17784144 .
Hi Bernhard,
On Sun, May 12, 2013 at 12:49 PM, Bernhard Schmidt <notifications@github.com
wrote:
Sorry, too fast. 1.8.5 sends from the wrong MAC (the old one) in the L2 header, but has the updated MAC in the RA option. At least in my tests here. So something has changed.
I just tested this case which was my usual problem from the past
- create bridge vmbr0, it gets the MAC 8a:7c:c4:8a:95:5d and the LL address fe80::887c:c4ff:fe8a:955d
- add tap-vm1 with MAC fe:16:3e:7c:30:aa, MAC of vmbr0 changes as well
- start radvd
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
- add tap-vm2 with MAC fe:16:3e:7f:56:9f
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
- remove tap-vm1 (shutdown of guest), vmbr0 changes MAC to fe:16:3e:7f:56:9f
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7f:56:9f
It used to be
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
at this point, which broke all attached VMs horribly. The current behaviour is still ugly, since the box is sending traffic from a MAC address it doesn't own. Theoretically this might trigger security alerts, but in those cases one should probably set a fixed MAC anyway.
— Reply to this email directly or view it on GitHubhttps://github.com/reubenhwk/radvd/issues/9#issuecomment-17784144 .
Do you know how to fix it?
On Wed, May 15, 2013 at 6:29 AM, Reuben Hawkins reubenhwk@gmail.com wrote:
Hi Bernhard,
On Sun, May 12, 2013 at 12:49 PM, Bernhard Schmidt < notifications@github.com> wrote:
Sorry, too fast. 1.8.5 sends from the wrong MAC (the old one) in the L2 header, but has the updated MAC in the RA option. At least in my tests here. So something has changed.
I just tested this case which was my usual problem from the past
- create bridge vmbr0, it gets the MAC 8a:7c:c4:8a:95:5d and the LL address fe80::887c:c4ff:fe8a:955d
- add tap-vm1 with MAC fe:16:3e:7c:30:aa, MAC of vmbr0 changes as well
- start radvd
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
- add tap-vm2 with MAC fe:16:3e:7f:56:9f
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
- remove tap-vm1 (shutdown of guest), vmbr0 changes MAC to fe:16:3e:7f:56:9f
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7f:56:9f
It used to be
- RA has L2 source fe:16:3e:7c:30:aa, L3 source fe80::887c:c4ff:fe8a:955d, ICMPv6 option fe:16:3e:7c:30:aa
at this point, which broke all attached VMs horribly. The current behaviour is still ugly, since the box is sending traffic from a MAC address it doesn't own. Theoretically this might trigger security alerts, but in those cases one should probably set a fixed MAC anyway.
— Reply to this email directly or view it on GitHubhttps://github.com/reubenhwk/radvd/issues/9#issuecomment-17784144 .
Sorry to joining the discussion without being asked for, but I follow the project a little since I often work with radvd. Is there any code in radvd that fetches the source link-layer address? Without fetching the current address and comparing it to an previously fetched one, we are not even able to detect this change right now, are we?
I'd be happy to provide a patch, but I have no clue were to start. Can you give me a hint were to look at?
Hi Dan,
Thanks for joining in. In the latest commit on the master branch on github, I renamed setup_device_info to update_device_info and I'm calling it in send.c. I didn't test this (other than just to make sure radvd still works in general), but I'm not sure it'll solve the problem, but that may be a good place to start.
Here's the commit..
https://github.com/reubenhwk/radvd/commit/8784bc4f9411d166ba54291942abfa5cce388576
I'm only guessing, but I suspect the problem may be deeper than this. Can you tell if update_device_info and send.c is updating the L2 or the L3 mac address? From Bernhard's email, it seems like the L3 is updated by the L2 is not.
Let me know what you find.
Thanks, Reuben
On Wed, May 15, 2013 at 10:55 AM, Dan Luedtke notifications@github.comwrote:
Sorry to joining the discussion without being asked for, but I follow the project a little since I often work with radvd. Is there any code in radvd that fetches the source link-layer address? Without fetching the current address and comparing it to an previously fetched one, we are not even able to detect this change right now, are we?
I'd be happy to provide a patch, but I have no clue were to start. Can you give me a hint were to look at?
— Reply to this email directly or view it on GitHubhttps://github.com/reubenhwk/radvd/issues/9#issuecomment-17955004 .
Hi all,
So, here is what I found out. No code yet, since I am on the road using a mobile phone. I'll provide a patch later.
struct Interface gets updated by update_device_info() in device-linux.c (or bsd equivalent) using ioctl() for low level socket access. ioctl() is called with SIOCGIFMTU set. But should also be called with SIOCGIFHWADDR set to update the hwaddr as well. That should solve the problem theoretically.
EDIT: I overlooked line 49, where SIOCGIFHWADDR is already used. I will run a few tests to answer your question.
Cheers
Dan
Hi Dan,
It looks like ioctl is called with SIOCGIIFHWADDR just after the MTU is checked. I wondering is this is where the correct L3 address is picked up. But is this also going to fix the L2 problem?
On Thu, May 16, 2013 at 10:43 PM, Dan Luedtke notifications@github.comwrote:
Hi all,
So, here is what I found out. No code yet, since I am on the road using a mobile phone. I'll provide a patch later.
struct Interface gets updated by update_device_info() in device-linux.c (or bsd equivalent) using ioctl() for low level socket access. ioctl() is called with SIOCGIFMTU set. But should also be called with SIOCGIFHWADDR set to update the hwaddr as well. That should solve the problem theoretically. ioctl() in radvd was last changed about 8 years ago, I guess chaning hwaddresses weren't much of a problem back then. cf67f85https://github.com/reubenhwk/radvd/commit/cf67f85c3cc2d45fa6ddce2abeb4df46f0f446a4
Cheers
Dan
— Reply to this email directly or view it on GitHubhttps://github.com/reubenhwk/radvd/issues/9#issuecomment-18044774 .
I finally got the time to run some tests. Here are the results:
[root@tunafish danrl]# uname -a
Linux tunafish 3.9.0+ #7 SMP Sat May 4 08:59:39 CEST 2013 x86_64 GNU/Linux
[root@tunafish radvd]# cat radvd.conf
interface enp0s25 {
AdvSendAdvert on;
MinRtrAdvInterval 3;
MaxRtrAdvInterval 10;
prefix 2001:db8:1:0::/64 {
AdvOnLink on;
AdvAutonomous on;
AdvRouterAddr off;
};
};
[root@tunafish danrl]# ip link show dev enp0s25
2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
link/ether 00:12:34:56:78:aa brd ff:ff:ff:ff:ff:ff
[root@tunafish radvd]# ./radvd --nodaemon -debug=5 --config=radvd.conf
RESULT: RA is sent from 00:12:34:56:78:aa with correct LLAddr
[root@tunafish danrl]# ip link set address 00:12:34:56:78:bb dev enp0s25
[root@tunafish danrl]# ip link show dev enp0s25
2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000
link/ether 00:12:34:56:78:bb brd ff:ff:ff:ff:ff:ff
RESULT: RA is sent from 00:12:34:56:78:bb with correct LLAddr
Note: The source link-local address did not change! This is correct, since the OS did no auto(re)configuration of the link-local address.
I vote for "closed".
PS: While digging through the code got me thinking how https://github.com/reubenhwk/radvd/blob/2e9ac5930b0d300cc7c76658dc1bdaf4ecdd441a/TODO#L27 could be implemented. It looks like a lot of work to me. Are there any ideas yet?
On Sat, May 18, 2013 at 1:50 PM, Dan Luedtke notifications@github.comwrote:
I finally got the time to run some tests. Here are the results:
[root@tunafish danrl]# uname -a Linux tunafish 3.9.0+ #7 SMP Sat May 4 08:59:39 CEST 2013 x86_64 GNU/Linux [root@tunafish radvd]# cat radvd.conf interface enp0s25 { AdvSendAdvert on; MinRtrAdvInterval 3; MaxRtrAdvInterval 10; prefix 2001:db8:1:0::/64 { AdvOnLink on; AdvAutonomous on; AdvRouterAddr off; }; }; [root@tunafish danrl]# ip link show dev enp0s25 2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000 link/ether 00:12:34:56:78:aa brd ff:ff:ff:ff:ff:ff
[root@tunafish radvd]# ./radvd --nodaemon -debug=5 --config=radvd.conf
RESULT: RA is sent from 00:12:34:56:78:aa with correct LLAddr
[root@tunafish danrl]# ip link set address 00:12:34:56:78:bb dev enp0s25 [root@tunafish danrl]# ip link show dev enp0s25 2: enp0s25: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT qlen 1000 link/ether 00:12:34:56:78:bb brd ff:ff:ff:ff:ff:ff
RESULT: RA is sent from 00:12:34:56:78:bb with correct LLAddr
Note: The source link-local address did not change! This is correct, since the OS did no auto(re)configuration of the link-local address.
I vote for "closed".
Bernhard, can you verify?
PS: While digging through the code got me thinking how https://github.com/reubenhwk/radvd/blob/2e9ac5930b0d300cc7c76658dc1bdaf4ecdd441a/TODO#L27could be implemented. It looks like a lot of work to me. Are there any ideas yet?
I've been making attempts in bursts spaced out by long delays to organize the code better. For example, send use to (or maybe still does) call into reload config. This is a bit overkill and is doing way more than really needed when a single interface comes up. I think before getting to items in the TODO list, the code should be scrubbed, polished, made more efficient and reduce what it's actually doing to only what it needs to do.
— Reply to this email directly or view it on GitHubhttps://github.com/reubenhwk/radvd/issues/9#issuecomment-18107771 .
I could not reproduce with Kernel 3.8 (Debian sid) and either git HEAD or even git tag release_1_8_5 . I can see that issue with Kernel 3.2 and radvd 1.8.5-1 from a Debian package. I'm going to either temporarily downgrade my workstation or get the source on my server that originally showed this problem.
Okay, I feel a bit smarter (and a lot dumber for making such a noise) after putting some work into it.
First of all, the originally reported bug (radvd keeps sending the wrong LLaddr in L3) was present in release_1_7 and gone in release_1_8. I will tag the Debian bug accordingly.
The other bug (the wrong L2 source address) is there, but a lot less significant than I originally thought. Even current HEAD sends a lot of unsolicited RAs when the MAC address changes.
If I counted correctly, if radvd is run as root it sends 11 RAs, three with the old MAC in both L2 and L3, and eight with the new MAC in both L2 and L3. However, if you run with privsep (-u radvd), as Debian does it, it sends 15 RAs: Two with the old MAC in L2 and L3, one with the old MAC in L2 and the new MAC in L3, and 12 RAs with the new MAC in L2 and L3. See http://pastebin.com/GVmn4jXQ . I somehow focused on that one "wrong" packet in between.
Originally reported to Debian at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=508466
Interfaces can occasionally change MAC addresses in-flght. The most common example are Linux bridges, that are created using a random MAC and then change the MAC address to the first interface added to the bridge.
If radvd is started before the first interface is added, it learns the old MAC address of the bridge and continues to advertise it in the LL addr option, overwriting the entry in the client's neighbor cache at every advertisement. This creates blackhole periods of about 10-20 seconds each time.
The bug is at least present in 1.8.5. I'm going to test 1.9 later, but since there is nothing in the changes suggesting a fix I assume it is still affected.