TritonDataCenter / illumos-kvm-cmd

qemu-kvm for illumos-kvm

Multicast is broken when using vnics #2

Closed aszeszo closed 12 years ago

aszeszo commented 13 years ago

Hi Joyent!

Multicast seems to be broken when using vnics. I can see multicast packets on the vnic interface on the host OS, but they don't appear on the interface inside the guest. Among other things, this prevents IPv6 autoconfiguration from working.

Andrzej

rmustacc commented 13 years ago

Hi, can you please include the QEMU options passed and the link properties for the VNICs? We haven't looked at IPv6 support yet, so there are going to be some kinks to work out.

aszeszo commented 13 years ago

I configure networking like this:

-net nic,vlan=0,name=net0,model=e1000,macaddr=$MAC \
-net vnic,vlan=0,name=net0,ifname=$VNIC,macaddr=$MAC \

or this:

-net nic,vlan=0,name=net0,model=virtio,macaddr=$MAC \
-net vnic,vlan=0,name=net0,ifname=$VNIC,macaddr=$MAC \

(same behaviour with either the e1000 or the virtio driver)

I am using defaults for vnics, for example:

dladm show-linkprop ubuntu0
LINK         PROPERTY            PERM VALUE        DEFAULT      POSSIBLE
ubuntu0      autopush            rw   --           --           --
ubuntu0      zone                rw   --           --           --
ubuntu0      state               r-   unknown      up           up,down
ubuntu0      mtu                 rw   1500         1500         1500
ubuntu0      maxbw               rw   --           --           --
ubuntu0      cpus                rw   --           --           --
ubuntu0      cpus-effective      r-   5-6          --           --
ubuntu0      pool                rw   --           --           --
ubuntu0      pool-effective      r-   --           --           --
ubuntu0      priority            rw   high         high         low,medium,high
ubuntu0      tagmode             rw   vlanonly     vlanonly     normal,vlanonly
ubuntu0      protection          rw   --           --           mac-nospoof, restricted, ip-nospoof, dhcp-nospoof
ubuntu0      allowed-ips         rw   --           --           --
ubuntu0      allowed-dhcp-cids   rw   --           --           --
ubuntu0      rxrings             rw   --           --           --
ubuntu0      rxrings-effective   r-   --           --           --
ubuntu0      txrings             rw   --           --           --
ubuntu0      txrings-effective   r-   --           --           --
ubuntu0      txrings-available   r-   0            --           --
ubuntu0      rxrings-available   r-   0            --           --
ubuntu0      rxhwclnt-available  r-   0            --           --
ubuntu0      txhwclnt-available  r-   0            --           --

tcpdump watching the vnic in the GZ (global zone) shows multicast traffic.

I have noticed that there are receive_filter() functions in both e1000.c and virtio-net.c. I am wondering whether qemu is intentionally dropping multicast traffic, so that what I see is actually a feature?
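For readers unfamiliar with what a NIC receive filter does, here is a minimal sketch of the usual decision logic. It is not the code from e1000.c or virtio-net.c; the `rx_filter` structure and `rx_filter_accept()` are hypothetical names made up for illustration. The only hard fact it relies on is that an Ethernet destination is multicast when the low bit of its first octet is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETHER_ADDR_LEN 6

/* Hypothetical, simplified filter state; not QEMU's actual structures. */
struct rx_filter {
    uint8_t mac[ETHER_ADDR_LEN];             /* station (unicast) address */
    bool promisc;                            /* accept every frame */
    bool allmulti;                           /* accept every multicast frame */
    const uint8_t (*mcast)[ETHER_ADDR_LEN];  /* subscribed multicast groups */
    size_t n_mcast;                          /* number of entries in mcast */
};

static bool is_broadcast(const uint8_t *dst)
{
    static const uint8_t bcast[ETHER_ADDR_LEN] =
        { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
    return memcmp(dst, bcast, ETHER_ADDR_LEN) == 0;
}

/* A destination is multicast when the low bit of the first octet is set. */
static bool is_multicast(const uint8_t *dst)
{
    return (dst[0] & 0x01) != 0;
}

/* Return true if a frame addressed to dst should be handed to the guest. */
bool rx_filter_accept(const struct rx_filter *f, const uint8_t *dst)
{
    assert(f != NULL && dst != NULL);

    if (f->promisc || is_broadcast(dst))
        return true;

    if (is_multicast(dst)) {
        if (f->allmulti)
            return true;
        /* Otherwise only groups the guest subscribed to get through. */
        for (size_t i = 0; i < f->n_mcast; i++) {
            if (memcmp(f->mcast[i], dst, ETHER_ADDR_LEN) == 0)
                return true;
        }
        return false;
    }

    /* Plain unicast: must match the station address. */
    return memcmp(dst, f->mac, ETHER_ADDR_LEN) == 0;
}
```

A filter of this shape only drops multicast frames that the guest has not asked for, so it would explain the symptom only if the guest's subscriptions never reached the device model. Whether that is the case here is not established at this point.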

aszeszo commented 13 years ago

I wonder whether the qemu issue is similar to the VirtualBox one here:

https://www.virtualbox.org/ticket/9532
https://www.virtualbox.org/changeset/38640

trisk commented 12 years ago

I'm also affected by this one.

I have some vnics on rge0 (external) and one on rge1 (admin). Both rge0 and rge1 are on the same network segment.

[root@mameshiba ~]# dladm show-vnic
LINK         OVER       SPEED MACADDRESS        MACADDRTYPE VID  ZONE
net0         rge1       0     a2:e2:c4:4a:d:7   fixed       0    2db5ca4c-1295-4c0f-aab1-bdb8d5dadfb2
net0         rge0       0     b2:4:f0:f9:ad:bd  fixed       0    0085b669-fade-4569-a273-937fa6f3ea39
net0         rge0       0     a2:11:1e:dd:af:97 fixed       0    f4763d2b-d656-4e4e-a3ae-0a16900bc9c6

I can snoop the vnic in 2db5ca4c-1295-4c0f-aab1-bdb8d5dadfb2 which is joyent brand and see the multicast frames there, although plumbing net0 and restarting ndp does not appear to be sufficient to get autoconfiguration to work.

Snooping in a KVM guest (I cannot snoop the vnic directly in a kvm branded zone) shows outgoing but not incoming multicast frames.

qemu -net options for f4763d2b-d656-4e4e-a3ae-0a16900bc9c6 (generated by vmadmd):

argv[27]: -net
argv[28]: nic,macaddr=a2:11:1e:dd:af:97,vlan=0,name=net0,model=rtl8139
argv[29]: -net
argv[30]: vnic,name=net0,vlan=0,ifname=net0,ip=192.168.4.202,netmask=255.255.255.0,gateway_ip=192.168.4.254,hostname=f4763d2b-d656-4e4e-a3ae-0a16900bc9c6,dns_ip0=192.168.4.4

Here's an example incoming multicast frame which is not visible in the guest:

[root@mameshiba ~]# snoop -v -d rge0 ip6
Using device rge0 (promiscuous mode)
ETHER:  ----- Ether Header -----
ETHER:  
ETHER:  Packet 1 arrived at 6:28:46.95386
ETHER:  Packet size = 118 bytes
ETHER:  Destination = 33:33:0:0:0:1, (multicast)
ETHER:  Source      = d8:5d:4c:a4:e5:d4, 
ETHER:  Ethertype = 86DD (IPv6)
ETHER:  
IPv6:   ----- IPv6 Header -----
IPv6:   
IPv6:   Version = 6
IPv6:   Traffic Class = 0
IPv6:   Flow label = 0x0
IPv6:   Payload length = 64
IPv6:   Next Header = 58 (ICMPv6)
IPv6:   Hop Limit = 255
IPv6:   Source address = fe80::a4e2:64ff:feee:3336
IPv6:   Destination address = ff02::1
IPv6:   
ICMPv6:  ----- ICMPv6 Header -----
ICMPv6:  
ICMPv6:  Type = 134 (Router advertisement)
ICMPv6:  Code = 0
ICMPv6:  Checksum = d572
ICMPv6:  Max hops= 64, Router lifetime= 30
ICMPv6:  Managed addr conf flag: NOT SET, Other conf flag: NOT SET
ICMPv6:  Reachable time: 0, Reachable retrans time 0
ICMPv6:  
ICMPv6:  +++ ICMPv6 Prefix option +++
ICMPv6:  Prefix length = 64 
ICMPv6:  Onlink flag: SET, Autonomous addr conf flag: SET
ICMPv6:  Valid Lifetime 604800, Preferred Lifetime 86400
ICMPv6:  Prefix 2001:470:1f07:60::
ICMPv6:  
ICMPv6:  +++ ICMPv6 MTU option +++
ICMPv6:  MTU = 1480 
ICMPv6:  
ICMPv6:  +++ ICMPv6 Source LL Addr option +++
ICMPv6:  Link Layer address: d8:5d:4c:a4:e5:d4
ICMPv6:  

ETHER:  ----- Ether Header -----
ETHER:  
ETHER:  Packet 2 arrived at 6:28:55.97380
ETHER:  Packet size = 118 bytes
ETHER:  Destination = 33:33:0:0:0:1, (multicast)
ETHER:  Source      = d8:5d:4c:a4:e5:d4, 
ETHER:  Ethertype = 86DD (IPv6)
ETHER:  
IPv6:   ----- IPv6 Header -----
IPv6:   
IPv6:   Version = 6
IPv6:   Traffic Class = 0
IPv6:   Flow label = 0x0
IPv6:   Payload length = 64
IPv6:   Next Header = 58 (ICMPv6)
IPv6:   Hop Limit = 255
IPv6:   Source address = fe80::a4e2:64ff:feee:3336
IPv6:   Destination address = ff02::1
IPv6:   
ICMPv6:  ----- ICMPv6 Header -----
ICMPv6:  
ICMPv6:  Type = 134 (Router advertisement)
ICMPv6:  Code = 0
ICMPv6:  Checksum = d572
ICMPv6:  Max hops= 64, Router lifetime= 30
ICMPv6:  Managed addr conf flag: NOT SET, Other conf flag: NOT SET
ICMPv6:  Reachable time: 0, Reachable retrans time 0
ICMPv6:  
ICMPv6:  +++ ICMPv6 Prefix option +++
ICMPv6:  Prefix length = 64 
ICMPv6:  Onlink flag: SET, Autonomous addr conf flag: SET
ICMPv6:  Valid Lifetime 604800, Preferred Lifetime 86400
ICMPv6:  Prefix 2001:470:1f07:60::
ICMPv6:  
ICMPv6:  +++ ICMPv6 MTU option +++
ICMPv6:  MTU = 1480 
ICMPv6:  
ICMPv6:  +++ ICMPv6 Source LL Addr option +++
ICMPv6:  Link Layer address: d8:5d:4c:a4:e5:d4
ICMPv6:  

Likewise, outgoing neighbour advertisements and DHCPv6 requests visible in the guest are not visible when snooping rge0 in the global zone.
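For context, the destination MAC 33:33:0:0:0:1 in the capture above is the standard Ethernet mapping of the IPv6 all-nodes address ff02::1 (RFC 2464, section 7): the bytes 33:33 followed by the low 32 bits of the IPv6 multicast address. Router advertisements are sent to this group, so a guest that never receives these frames cannot autoconfigure. A small sketch of the mapping follows; the function name `ipv6_mcast_to_ether` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Map an IPv6 multicast address (16 bytes, network order) to the
 * Ethernet multicast address it is sent to: 33:33 followed by the
 * last four bytes of the IPv6 address (RFC 2464, section 7).
 */
void ipv6_mcast_to_ether(const uint8_t ip6[16], uint8_t mac[6])
{
    assert(ip6 != NULL && mac != NULL);
    assert(ip6[0] == 0xff);        /* ff00::/8 is the multicast range */

    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(&mac[2], &ip6[12], 4);  /* low 32 bits of the group address */
}
```

Applied to ff02::1 this gives 33:33:00:00:00:01, which matches the Ether header shown by snoop.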

aszeszo commented 12 years ago

I have looked at how libpcap sets promiscuous mode on interfaces on Solaris, and after making the change below I am able to receive multicast traffic inside qemu. I will test it tomorrow in a proper IPv6-enabled environment to see whether it also makes IPv6 work inside VMs.

diff --git a/net/vnic.c b/net/vnic.c
index 8813de0..ad3b212 100644
--- a/net/vnic.c
+++ b/net/vnic.c
@@ -314,7 +314,9 @@ net_init_vnic(QemuOpts *opts, Monitor *mon, const char *name, VLANState *vlan)
                }
        }

-       if (dlpi_promiscon(vsp->vns_hdl, DL_PROMISC_SAP) != DLPI_SUCCESS) {
+       if ((dlpi_promiscon(vsp->vns_hdl, DL_PROMISC_PHYS) != DLPI_SUCCESS) ||
+               (dlpi_promiscon(vsp->vns_hdl, DL_PROMISC_SAP) != DLPI_SUCCESS) ||
+               (dlpi_promiscon(vsp->vns_hdl, DL_PROMISC_MULTI) != DLPI_SUCCESS)) {
                error_report("vnic: failed to be promiscous with interface %s",
                    ifname);
                return (-1);

aszeszo commented 12 years ago

Just tested it and stateless IPv6 autoconfiguration is working as expected now.
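A note on the patch for anyone adapting it: it asks for three DLPI promiscuous levels (DL_PROMISC_PHYS for all frames on the link, DL_PROMISC_SAP for all SAPs, DL_PROMISC_MULTI for all multicast addresses), but the single error message cannot say which request failed. Below is a rough sketch of a table-driven variant that reports the failing level. Everything in it is hypothetical: `enable_promisc_levels`, the `promisc_on_fn` callback type, and the `fake_link` stand-in are invented here so the logic can be tried without a DLPI link. On illumos the callback would wrap `dlpi_promiscon(handle, level)` and compare the result against `DLPI_SUCCESS`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical callback: enable one promiscuous level on a link and
 * return 0 on success. On illumos this would wrap dlpi_promiscon().
 */
typedef int (*promisc_on_fn)(void *handle, unsigned int level);

struct promisc_level {
    unsigned int level;  /* value handed to the callback */
    const char *name;    /* label for error messages */
};

/*
 * Enable each level in order. Returns NULL if every request succeeded,
 * otherwise the name of the first level that failed.
 */
const char *enable_promisc_levels(void *handle, promisc_on_fn on,
    const struct promisc_level *levels, size_t n)
{
    assert(on != NULL && levels != NULL);

    for (size_t i = 0; i < n; i++) {
        if (on(handle, levels[i].level) != 0)
            return levels[i].name;
    }
    return NULL;
}

/* Stand-in backend for trying the helper: rejects one chosen level. */
struct fake_link {
    unsigned int fail_level;  /* level to reject, 0 for none */
    unsigned int enabled;     /* how many levels were accepted */
};

int fake_promisc_on(void *handle, unsigned int level)
{
    struct fake_link *l = handle;

    if (level == l->fail_level)
        return -1;
    l->enabled++;
    return 0;
}
```

The caller would build the table from the real `DL_PROMISC_*` constants in `<sys/dlpi.h>` and pass the returned name to `error_report()` alongside the interface name.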

aszeszo commented 12 years ago

Actually, this workaround is much nicer, and it would allow a SmartOS host running a SmartOS VM with zones on it (and individual vnics per zone) to work:

http://www.listbox.com/member/archive/182179/2012/01/sort/time_rev/page/12/entry/1:310/20120104092815:55555E98-36E0-11E1-90DE-B29A8203A5CD/

rmustacc commented 12 years ago

Sorry for the delay on this one, folks. It's there now. The question of how to deal with nested zones, e.g. multiple MAC addresses, is something that we're working on, but it will be tracked by a separate ticket on here.