mingshuoqiu opened 2 months ago
cc @starbops @rrajendran17
Summary of the debugging from the NV team:
## Debug steps for all types of VMs:
1. Log in to the VM's network namespace with the `nsenter -t <pid> -n` command and check the interfaces.
2. Run `tcpdump -i <interface> -nnvvSe` in the VM's network namespace to start monitoring packets on a specific interface.
3. Log in to the VM console and send egress or ingress traffic.
4. After step 3, check the captured packets in the VM's network namespace (a command sketch follows below).
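A minimal sketch of these steps, assuming `crictl` is available on the node and the container ID of the VM's virt-launcher (compute) container is known; the container ID and interface name below are placeholders:

```bash
# Resolve the PID of the virt-launcher compute container (container ID is a placeholder).
PID=$(crictl inspect --output go-template --template '{{.info.pid}}' <container-id>)

# Step 1: list the interfaces inside the VM pod's network namespace.
nsenter -t "$PID" -n ip link show

# Step 2: capture packets on a specific interface (eth0 used here as an example).
nsenter -t "$PID" -n tcpdump -i eth0 -nnvvSe

# Steps 3-4: log in to the VM console, generate ingress/egress traffic, and observe
# which interface and which source/destination MACs the captured packets use.
```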
### Result for a masquerade-type VM (its network is the management network):
1. NeuVector can learn the network policy.
2. There are eth0/k6t-eth0/tap0 interfaces in the VM's network namespace, and traffic goes in/out on the eth0 interface.
In the output of `tcpdump`, the MAC address used for in/out traffic is eth0's MAC address.
### Result for a bridge-type VM (its network is the management network):
1. NeuVector can NOT learn the network policy.
2. There are eth0/eth0-nic/k6t-eth0/tap0 interfaces in the VM's network namespace.
3. In the output of `tcpdump`, traffic goes in/out on the eth0-nic interface.
But the MAC address used for in/out traffic is NOT eth0-nic's MAC address, and it is not the MAC address of the eth0/k6t-eth0/tap0 interfaces either.
The MAC address used for in/out traffic is enp1s0's MAC address (this interface is displayed in the VM console).
(We think this may be the root cause of why NeuVector cannot learn the network policy.)
### Result for a bridge-type VM (its network is a VM network):
1. NeuVector can NOT learn the network policy.
2. There are eth0/37a8eec1ce1-nic/k6t-37a8eec1ce1/tap37a8eec1ce1/pod37a8eec1ce1 interfaces in the VM's network namespace.
3. In the output of `tcpdump`, traffic goes in/out on the 37a8eec1ce1-nic interface.
But the MAC address used for in/out traffic is NOT 37a8eec1ce1-nic's MAC address; it is pod37a8eec1ce1's MAC address.
Just confirmed that there is no ARP cache at all when all NICs of the VM are created in bridge mode, even though we can see ARP requests/replies from the pod network. So we can't use the ARP cache (`ip neigh`) to find out the MAC address of the NIC on the VM, like we do for a VM with masquerade-mode NICs. We need the fdb entries of the netns the VM belongs to in order to find out the source MAC address of a particular traffic flow from/to the VM.
To find the matching MAC address of the bridge-mode NICs on the VM from the VM traffic, I did the following experiment: run `ip netns exec xxx ip link show` to find the interfaces created by multus-cni for the pod network. It should match the topology in the Harvester Network Deep Dive: https://docs.harvesterhci.io/assets/images/topology-92ab59d983544bad738764a2105c9a06.png Then we can use this information for NeuVector to 1-to-1 map the pod's NIC to the VM's NIC.
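A rough sketch of that experiment (the netns name is a placeholder; the bridge name follows the bridge-mode VM network example above):

```bash
# List network namespaces on the node and pick the one that belongs to the VM pod.
ip netns list

# Inside that namespace, list the interfaces created by multus-cni for the pod network.
ip netns exec <pod-netns> ip link show

# Check the forwarding database of the in-pod bridge (the k6t-* interface); the
# dynamically learned entries reveal the MAC the VM guest actually uses on the wire.
ip netns exec <pod-netns> bridge fdb show br k6t-37a8eec1ce1
```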
1. When you say "NeuVector cannot learn network policy for VM traffic", are you checking the output of a particular command? Can you post the command you are checking?
2. Do we have the output of `bridge fdb show` and `bridge vlan show` from the time of the issue?
3. Is there any VLAN-tagged traffic configured for the VLAN VM networks?
The `bridge vlan show` output has the port information of the pod network, and `bridge fdb show` will also show permanent entries for the mcast address and port address.

@mingshuoqiu Trying to get more understanding of the issue:
1. Is communication/ping to the external environment working and there is only a discrepancy in the source address shown in the UI, or is the ping to external failing?
2. My understanding is that there will be a veth pair connected to the VM interface (created by the bridge), and we need to look at the source MAC learned on the veth interface corresponding to the VM interface in the bridge/VLAN network in `bridge fdb show`. Please correct me if I am wrong here. I do not understand in which cases we need to check bridge entries under the k6t and tap interfaces.
NV team, could you share some feedback on what we suggest? Does the `bridge fdb show` command help solve this problem or not?
If there is traffic to/from the VM, we can see the MAC address in `bridge fdb show`. But this MAC address entry is not permanent.
When a pod is deployed, the enforcer goes through all the interfaces of the pod and gets the MAC address as well as the IP address of each interface. We create a socket and bind to the interface to sniff packets; based on the packet's MAC address we decide whether a packet is to/from the pod. The fdb is time sensitive and packet triggered, so it is unreliable to use the fdb to get the MAC address for an interface. Is it possible that the Harvester team can attach the real MAC address to the interface? Thanks.
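As a rough illustration of that MAC-based matching (not the enforcer's actual implementation; the PID and MAC values below are placeholders):

```bash
# Capture only frames whose source or destination MAC matches the pod NIC's MAC,
# inside the pod's network namespace. PID and MAC are example values.
POD_PID=12345
POD_MAC="c2:f0:c8:37:e4:f4"
nsenter -t "$POD_PID" -n tcpdump -i eth0 -nne ether host "$POD_MAC"
```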
@rrajendran17 any better idea?
I think the fdb entry has its own lifecycle, and it should be long enough to record the mapping inside NV when traffic flows. The mapping could exist until the user does a VM migration. Or could you point out a better option that you'd like Harvester to offer?
We need to have a correct MAC address attached to the pod interface(s). This MAC address should be used as the SRC/DST Ethernet MAC address; we rely on it to decide whether a packet is for/from this pod. We only monitor a pod's interfaces for a period of time after the pod is brought up or the NeuVector enforcer is deployed. In Harvester's bridge-mode case, the MAC is only available once traffic is generated, which would require infinite monitoring of every pod and is resource consuming, and it is only available through `bridge fdb show`, so it is not feasible for us to constantly monitor the bridge fdb to search for a MAC. The best way is to have a correct MAC address attached to one of the pod's interfaces, thanks.
@starbops do we have the NIC's MAC address of the VM, which we can get from the VMI or other information?
The multus annotation on the launcher pod records the MAC address, for example:
```yaml
k8s.v1.cni.cncf.io/network-status: |-
  [{
      "name": "k8s-pod-network",
      "ips": [
          "10.52.2.87"
      ],
      "default": true,
      "dns": {}
  },{
      "name": "default/workload",
      "interface": "pod37a8eec1ce1",
      "mac": "c2:f0:c8:37:e4:f4",
      "dns": {}
  }]
k8s.v1.cni.cncf.io/networks: '[{"name":"workload","namespace":"default","mac":"c2:f0:c8:37:e4:f4","interface":"pod37a8eec1ce1"}]'
```
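For reference, a sketch of pulling the MAC out of that annotation (the launcher pod name and namespace are placeholders, and `jq` is assumed to be available):

```bash
# Read the multus network-status annotation from the virt-launcher pod and print
# "interface mac" pairs for the non-default networks.
kubectl -n default get pod <virt-launcher-pod> \
  -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}' \
  | jq -r '.[] | select(.default != true) | "\(.interface) \(.mac)"'
```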
The associated VirtualMachine and VirtualMachineInstance objects will also contain info about the MAC addresses.
With a multus bridge-type network, we can figure out the MAC address even through the `ip address` command; one of the DOWN interfaces has the MAC address. But another bridge-type network just uses the default management network, and in that case the MAC address is not available.
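For example, a quick way to see those interfaces and their MACs from the node (the PID is a placeholder for the virt-launcher pod's PID):

```bash
# List interfaces (name, state, MAC) inside the VM pod's network namespace; with a
# multus bridge-type network, one of the DOWN interfaces carries the VM NIC's MAC.
POD_PID=12345
nsenter -t "$POD_PID" -n ip -br link show
```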
@ibrokethecloud the case George mentioned is the NIC created by the following method.
Can we have the MAC address information for this case?
@gfsuse @mingshuoqiu The traffic from the VM will use the MAC address configured on that interface in the VM guest OS (e.g., the enp1s0 interface MAC). The MAC address on an interface in the VM guest OS is copied from a specific interface of the VM pod, depending on the network type (masquerade, bridge):
Case 1 (VM in the mgmt network, type masquerade): the eth0 MAC from the VM pod is copied to the enp1s0 interface of the VM guest OS.
Case 2 (VM in the mgmt network, type bridge): the eth0 MAC from the VM pod is copied to the enp1s0 interface of the VM guest OS.
Case 3 (VM in a VLAN VM network with a single NIC): the pod interface MAC from the VM pod is copied to the enp1s0 interface of the VM guest OS.
Case 4 (VM with nic-1 in the mgmt network and nic-2 in a VLAN VM network): the eth0 MAC from the VM pod is copied to the enp1s0 interface of the VM guest OS, and the pod interface MAC from the VM pod is copied to the enp2s0 interface of the VM guest OS.
Note: the number of pod interfaces on a VM pod will be equal to the number of bridge-type VLAN VM networks created for the VM.
@gfsuse When you scan for MACs on the VM pod after it is deployed, can you scan the eth0 interface plus all pod interfaces created on the VM pod? This way, when traffic comes from a VM, you can map it to a particular pod (for both mgmt and bridge-type interfaces). One drawback is that, with this method, you can only get MAC addresses if the VM interface is bridge type.
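A possible sketch of such a scan (run from the node, with the virt-launcher pod's PID as a placeholder):

```bash
# Print "interface MAC" for eth0 and every pod* interface inside the VM pod's
# network namespace (each pod* interface maps to one bridge-type VLAN VM network).
POD_PID=12345
nsenter -t "$POD_PID" -n ip -br link show | awk '$1 ~ /^(eth0|pod)/ {print $1, $3}'
```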
If you also want to get the IP addresses of the VM's interfaces (secondary interfaces), then you could use the following commands:
a. kubectl get vmis
b. kubectl get vmi
Example output of step b (I created two interfaces in the VM, one in mgmt and the other in bridge): interfaces:
I feel the second method "kubectl get vmi
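As a sketch of that second method (the VMI name `my-vm` and the `default` namespace are placeholders, not values from this thread), the interface name, MAC, and IP can be read from the VMI status:

```bash
# Print "name mac ipAddress" for each interface reported in the VMI status.
kubectl -n default get vmi my-vm \
  -o jsonpath='{range .status.interfaces[*]}{.name}{" "}{.mac}{" "}{.ipAddress}{"\n"}{end}'
```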
@gfsuse @esther-suse any update for the bridge mode on the management network?
Currently we don't have a function/API to get Kubernetes VMI resources in our enforcer; we need to explore the Kubernetes API to see how to get the VMI resource. The enforcer can only scan pods/containers through the runtime on each worker node. The best way is still for Harvester to reflect its VM's interface/MAC address correctly on the corresponding pod.
Do you have a plan to achieve this?
## Describe the bug
VM traffic can be learned when the VM's NIC is in masquerade mode, but cannot be learned when the NIC is in bridge mode.
## To Reproduce
Steps to reproduce the behavior:
Ingress test: NeuVector cannot learn a network policy and cannot create a conversation -- failed
For the masquerade-type network:
Egress test: NeuVector can learn a network policy and create a conversation -- pass. There are two network rules: one is from the VM group to external, and another is from workload:ip to the VM group (the workload IP is the VM's internal IP, such as 10.0.2.2); not sure whether the second rule is necessary or not. Need to confirm with the developer.
Ingress test: NeuVector cannot learn a network policy, but a conversation can be generated (action is open) -- failed (NeuVector should have learned a network policy from nodes -> VM group).