apache / cloudstack

Apache CloudStack is an open-source Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

The VPC Redundant router "Virtual routers" can not work as expected #7838

Closed xuanyuanaosheng closed 1 year ago

xuanyuanaosheng commented 1 year ago
ISSUE TYPE
COMPONENT NAME
CLOUDSTACK VERSION

CloudStack 4.18.0.0

OS / ENVIRONMENT

OS: Oracle Linux 8

CONFIGURATION

The network test is:

# From kvm001

# ping -I cloudbr0 10.26.128.254
PING 10.26.128.254 (10.26.128.254) from 10.26.128.25 cloudbr0: 56(84) bytes of data.
64 bytes from 10.26.128.254: icmp_seq=1 ttl=255 time=1.08 ms
64 bytes from 10.26.128.254: icmp_seq=2 ttl=255 time=1.05 ms
^C
--- 10.26.128.254 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 1.046/1.062/1.079/0.036 ms

# ping -I cloudbr1 10.71.231.42
PING 10.71.231.42 (10.71.231.42) from 10.71.231.41 cloudbr1: 56(84) bytes of data.
64 bytes from 10.71.231.42: icmp_seq=1 ttl=64 time=0.191 ms
64 bytes from 10.71.231.42: icmp_seq=2 ttl=64 time=0.177 ms
64 bytes from 10.71.231.42: icmp_seq=3 ttl=64 time=0.181 ms
^C
--- 10.71.231.42 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2086ms
rtt min/avg/max/mdev = 0.177/0.183/0.191/0.005 ms

The virtual routers are r-30912-VM and r-30911-VM, and the IP is 10.71.227.33.

root@r-30912-VM:/opt/cloud/bin# sh -x checkrouter.sh 
+ STATUS=UNKNOWN
+ systemctl is-active keepalived
+ [ active != active ]
+ sed -e s/[,\"]//g
+ awk {print $2;}
+ grep type
+ cat /etc/cloudstack/cmdline.json
+ ROUTER_TYPE=vpcrouter
+ [ vpcrouter = router ]
+ awk {print $9;}
+ grep state
+ ip -4 addr show dev eth1
+ ROUTER_STATE=UP
+ [ UP = UP ]
+ STATUS=PRIMARY
+ echo Status: PRIMARY
Status: PRIMARY

root@r-30911-VM:/opt/cloud/bin# sh -x checkrouter.sh 
+ STATUS=UNKNOWN
+ systemctl is-active keepalived
+ [ active != active ]
+ sed -e s/[,\"]//g
+ awk {print $2;}
+ grep type
+ cat /etc/cloudstack/cmdline.json
+ ROUTER_TYPE=vpcrouter
+ [ vpcrouter = router ]
+ awk {print $9;}
+ grep state
+ ip -4 addr show dev eth1
+ ROUTER_STATE=UP
+ [ UP = UP ]
+ STATUS=PRIMARY
+ echo Status: PRIMARY
Status: PRIMARY

The Result:

I have restarted the VPC, but it still does not work. I have also restarted the virtual routers, with the same result.

The cloud.log (r-30912-VM) is

cloud.log

EXPECTED RESULTS

The "Virtual routers" work fine.

Are there any commands for further debugging, and how can this problem be solved?

weizhouapache commented 1 year ago

@xuanyuanaosheng From what you described, it is a networking configuration issue. The VMs (including user VMs and VRs) should be able to connect to other VMs on different hosts.

xuanyuanaosheng commented 1 year ago

@weizhouapache Yes, as you said: the VMs (including user VMs and VRs) should be able to connect to other VMs on different hosts.

Could you please give some advice on how to debug the networking further?

kvm001 has two network interfaces:

# nmcli c
NAME          UUID                                  TYPE      DEVICE       
cloudbr0      d4b789ba-7321-548d-dabd-5c4150da0266  bridge    cloudbr0     
cloud0        116fd927-1767-425c-8728-ed69db59f3cc  bridge    cloud0       
cloudbr1      6df8d4a7-7ee1-528a-dcf9-20d35734f675  bridge    cloudbr1     
eno49         0650d63c-0244-4852-b0aa-ca5d8a64d8cb  ethernet  eno49        
eno50         46da1a8f-615e-4649-be64-fc8e1c7dd264  ethernet  eno50        
vnet26        14a80703-a524-4890-a9fd-c3e77e7bd0d7  tun       vnet26       
vnet27        ba5349ea-7bc6-4821-bed5-f39a5dba0f10  tun       vnet27       
vnet28        bf04b42c-3e88-4746-8f41-374e0912442a  tun       vnet28       
vnet29        519f1e5e-d748-4b5d-af3c-c26337791e46  tun       vnet29       
vnet30        9e1f3e0b-9d29-4229-b948-db7b9e1a3f41  tun       vnet30       
vnet31        42236339-c028-4aef-8a3b-62ae2cee3a93  tun       vnet31       
vnet32        e740caf8-499c-41fa-945e-31f7e1bcb161  tun       vnet32       
vnet33        a4f0100f-0584-4204-853b-dda5228b5c6d  tun       vnet33       
vnet38        fbe65a8c-1a64-456f-9349-0a163e50e10d  tun       vnet38       
vnet39        2a3c273e-d524-434a-b978-f91ced137b5f  tun       vnet39       
vnet40        ee67179f-db52-402f-ba70-2e102bd9d39f  tun       vnet40       
vnet41        6e6b3475-b37c-4296-8628-bed5e3f85901  tun       vnet41       
breno49-2227  77b7a84a-9021-409f-bda0-2bf0d881eaad  bridge    breno49-2227 
brvx-2827     d69728b3-522e-48b7-80f0-1f137adb0272  bridge    brvx-2827    
brvx-2851     ee9bc698-df7c-4db0-9916-ce783b22f8a1  bridge    brvx-2851    
brvx-2891     2bda6291-dc27-49be-9d11-34a79c054fa3  bridge    brvx-2891    
brvx-2897     a0c9d46a-937a-4dbd-a916-6edc29fd311c  bridge    brvx-2897    
eno49.2128    ecb15010-4518-9d8b-5024-e5d1fe34559e  vlan      eno49.2128   
eno49.2227    f6b7c06b-bf4c-46a3-b6b4-e5d0bc78d12f  vlan      eno49.2227   
eno50.2230    599d2adf-ec36-333d-2894-68b19f0336ea  vlan      eno50.2230   
vxlan2827     a004f024-e0b2-4e5a-8085-8188c8d5b976  vxlan     vxlan2827    
vxlan2851     e33e1552-4a38-46ac-8a35-4b92468f9b79  vxlan     vxlan2851    
vxlan2891     be12f97c-3d1b-4fbe-b482-84a2e359d7c6  vxlan     vxlan2891    
vxlan2897     38df8496-49cd-4801-95ee-01c09d95ea51  vxlan     vxlan2897

kvm002 has four network interfaces:

# nmcli c
NAME          UUID                                  TYPE      DEVICE       
cloudbr0      d4b789ba-7321-548d-dabd-5c4150da0266  bridge    cloudbr0     
cloud0        a20f7bd6-e232-46ad-81af-e777cec5b923  bridge    cloud0       
cloudbr1      6df8d4a7-7ee1-528a-dcf9-20d35734f675  bridge    cloudbr1     
bond0         ad33d8b0-1f7b-cab9-9447-ba07f855b143  bond      bond0        
bond1         92306dc1-4142-23de-097b-b1464cfab5ee  bond      bond1        
vnet0         d1e91d5e-c78b-476a-a822-222a5700e126  tun       vnet0        
vnet1         4bbf0285-68ed-4d73-82c7-a3420157fa80  tun       vnet1        
vnet2         dee76d9a-e3b9-4eab-b87d-425565b0f4ca  tun       vnet2        
vnet3         f4ad91d5-0a70-400e-a9f2-145b02bb2b6b  tun       vnet3        
vnet32        ff1f2bd9-cdd1-4418-b4a9-9db7b97c7195  tun       vnet32       
vnet33        e7d51110-85c6-448f-b362-b7f78790e669  tun       vnet33       
vnet37        edf0af2e-12b5-4324-bc25-73efa72aa702  tun       vnet37       
vnet39        1606dc4b-9cc7-4643-877c-06e2d4ac8aea  tun       vnet39       
vnet4         6a19466e-409e-4854-ac1f-2588bcf8e079  tun       vnet4        
vnet44        3263802b-a949-406e-90f2-fe06711af79c  tun       vnet44       
vnet45        0c75f25f-94c2-4d78-bcee-7c4bbc91bb02  tun       vnet45       
vnet46        e46a9697-e346-4e57-9b2e-77962e3e0c3b  tun       vnet46       
vnet47        be28ae76-8345-4421-93ce-6de4cf6d32c6  tun       vnet47       
vnet5         e7f45169-168f-486c-9275-70035aa81bc1  tun       vnet5        
bond0.2128    7e50ddd7-3e4d-1638-2cf0-96a9c9cc16fb  vlan      bond0.2128   
bond0.2227    e2bae12d-193c-45cd-aa88-5b9d6cdacaf0  vlan      bond0.2227   
bond0-slave   a1420bd0-2cbe-45b4-b92e-7ba22aa148ef  ethernet  eno1         
bond0-slave   d8d48df8-95f5-43af-afc5-433fc81f322e  ethernet  eno2         
bond1.2230    f8f49649-1de5-ce03-2ba0-986c75b61b87  vlan      bond1.2230   
bond1-slave   c46fbae9-2d35-4871-88d9-112b0897d776  ethernet  eno4         
bond1-slave   e07e383b-8870-4d6d-b994-92f1657ebcb0  ethernet  eno3         
brbond0-2227  62e1b7d0-6b7b-4d65-8b03-b91e5f787db3  bridge    brbond0-2227 
brvx-2827     4425383b-a1a9-4fc0-8757-4b0a3ad58b7d  bridge    brvx-2827    
brvx-2851     d1fb9ed8-2e95-421a-a788-2a8fb021e9a9  bridge    brvx-2851    
brvx-2891     5632b810-be0c-4f45-802d-9fce33bb4846  bridge    brvx-2891    
brvx-2897     10ae0b69-0643-4a54-b647-2f56d273c50a  bridge    brvx-2897    
vxlan2827     8c44e0fe-c7b6-4968-9ba4-23318e1d5f10  vxlan     vxlan2827    
vxlan2851     fed18d5d-6292-4207-b9e8-cb6ed3a0ce8d  vxlan     vxlan2851    
vxlan2891     21e0b696-310c-46e8-9a34-08d40ea40cd5  vxlan     vxlan2891    
vxlan2897     9be0e9b5-2a80-4107-bc20-9b90d88538e4  vxlan     vxlan2897

I have checked /opt/cloud/bin/checkrouter.sh on the VRs: it uses ip -4 addr show dev eth1 | grep state. Both VRs report state UP, whereas in theory one should be UP and the other DOWN:

root@r-30916-VM:/opt/cloud/bin# ip -4 addr show dev eth1 | grep state
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000

root@r-30915-VM:~# ip -4 addr show dev eth1 | grep state
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
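
The sh -x traces earlier in this issue suggest that checkrouter.sh takes the 9th whitespace-separated field of the "state" line and reports PRIMARY when it is UP. Here is a minimal Python sketch of that check, for illustration only. The function names are hypothetical, and the DOWN -> BACKUP branch is an assumption, since the traces only show the UP branch.

```python
def field9_state(ip_line: str) -> str:
    """Return the 9th whitespace-separated field, like awk '{print $9;}'."""
    fields = ip_line.split()
    return fields[8] if len(fields) >= 9 else ""


def status_from_state(state: str) -> str:
    """Map the interface state to a router status."""
    if state == "UP":
        # This branch is visible in the traces: [ UP = UP ] -> STATUS=PRIMARY
        return "PRIMARY"
    if state == "DOWN":
        # Assumption: not shown in the traces in this thread.
        return "BACKUP"
    # The traces start with STATUS=UNKNOWN, so keep that as the fallback.
    return "UNKNOWN"


line = ("3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 "
        "qdisc pfifo_fast state UP group default qlen 1000")
print(status_from_state(field9_state(line)))
```

With the eth1 line from either VR this yields PRIMARY, which matches what both routers report. The check only looks at the local interface state, so it cannot tell on its own whether the other router is also PRIMARY.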

How do these two VRs determine their PRIMARY/BACKUP status?

weizhouapache commented 1 year ago

@xuanyuanaosheng The redundant VRs periodically send VRRP packets to each other. If that communication fails, both VRs become PRIMARY. This is how keepalived works.

You need to configure your network to support VXLAN tunnels, for example using EVPN/BGP.
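
To illustrate why lost VRRP packets lead to two PRIMARY routers, here is a simplified Python model of the backup-side timer from VRRPv2 (RFC 3768). It is a sketch, not keepalived's code. The advertisement interval of 1 second and the priority of 100 are hypothetical values, because the keepalived settings of the VRs are not shown in this thread.

```python
def master_down_interval(advert_int: float, priority: int) -> float:
    """VRRPv2 (RFC 3768): 3 * Advertisement_Interval + Skew_Time."""
    skew_time = (256 - priority) / 256.0
    return 3 * advert_int + skew_time


def backup_next_state(seconds_since_advert: float,
                      advert_int: float = 1.0,   # hypothetical value
                      priority: int = 100) -> str:  # hypothetical value
    """A backup promotes itself once it stops hearing advertisements."""
    if seconds_since_advert > master_down_interval(advert_int, priority):
        return "PRIMARY"
    return "BACKUP"


# Advertisements arrive: the backup stays BACKUP.
print(backup_next_state(1.0))
# Advertisements are lost between hosts: the backup becomes PRIMARY
# while the original primary keeps running, so there are two PRIMARY routers.
print(backup_next_state(10.0))
```

In this model nothing is wrong with either router; the second PRIMARY appears purely because the advertisements never arrive.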

xuanyuanaosheng commented 1 year ago

@weizhouapache I created a new VPC named test and added a tier: 10.28.29.0/24. The redundant routers are r-30932-VM and r-30933-VM.
The eth2 IP of r-30932-VM is 10.28.29.42 and the eth2 IP of r-30933-VM is 10.28.29.229. At first the two virtual routers can ping each other over eth2, but after a few minutes they can no longer do so.

After that, the eth2 addresses on r-30932-VM become:

4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:0f:b9:00:02 brd ff:ff:ff:ff:ff:ff
    altname enp0s9
    altname ens9
    inet 10.28.29.42/24 brd 10.28.29.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet 10.28.29.254/24 brd 10.28.29.255 scope global secondary eth2
       valid_lft forever preferred_lft forever

and the eth2 addresses on r-30933-VM become:

4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:00:1c:39:00:03 brd ff:ff:ff:ff:ff:ff
    altname enp0s9
    altname ens9
    inet 10.28.29.229/24 brd 10.28.29.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet 10.28.29.254/24 brd 10.28.29.255 scope global secondary eth2
       valid_lft forever preferred_lft forever

Could you please give some advice?
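
The two ip outputs above can be compared mechanically: an address that is configured on both routers at the same time points to a split-brain. Here is a minimal Python sketch using the inet lines from this comment; the function name is hypothetical.

```python
import re


def inet_addresses(ip_addr_output: str) -> set:
    """Collect the IPv4 addresses from the 'inet a.b.c.d/len' lines."""
    return set(re.findall(r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+",
                          ip_addr_output, flags=re.MULTILINE))


# inet lines copied from the eth2 output of each router in this comment.
r30932 = """
    inet 10.28.29.42/24 brd 10.28.29.255 scope global eth2
    inet 10.28.29.254/24 brd 10.28.29.255 scope global secondary eth2
"""
r30933 = """
    inet 10.28.29.229/24 brd 10.28.29.255 scope global eth2
    inet 10.28.29.254/24 brd 10.28.29.255 scope global secondary eth2
"""

shared = inet_addresses(r30932) & inet_addresses(r30933)
print(sorted(shared))
```

Here the shared address is 10.28.29.254, the secondary address that both routers hold at once.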

weizhouapache commented 1 year ago

@xuanyuanaosheng It seems both VRs become PRIMARY, right? If you migrate the VRs to the same host, everything will work as expected.

I have experienced a similar issue before, which was caused by MLAG settings on upstream routers. How many routers do you use?

xuanyuanaosheng commented 1 year ago

@weizhouapache Yes, as you said, both VRs become PRIMARY.

The VRs cannot be migrated to the same host. The error message is:

Failed to load hosts.: Error: Request failed with status code 431

When I migrate the VMs to the same host, they can ping each other.


Our env:

Two blade servers uplinked to two Cisco Nexus 7700 core switches, with no MLAG/LACP/vPC between them.

Any ideas?

weizhouapache commented 1 year ago

@xuanyuanaosheng Sorry, I cannot advise you on how to configure the routers correctly to support MLAG/VRRP. It would be good to ask network specialists to check the configurations.

xuanyuanaosheng commented 1 year ago

@weizhouapache

I captured traffic on one blade server using tcpdump -i cloudbr1 port 8472 -nn -s0 -e -vvv.

The details are:

13:50:05.839405 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 140: (tos 0x0, ttl 10, id 58443, offset 0, flags [none], proto UDP (17), length 126)
    10.71.231.43.60487 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 90: (tos 0x0, ttl 64, id 53547, offset 0, flags [DF], proto UDP (17), length 76)
    10.28.21.93.37196 > 10.25.28.26.123: [udp sum ok] NTPv4, length 48
    Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 7 (128s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
      Reference Timestamp:  0.000000000
      Originator Timestamp: 0.000000000
      Receive Timestamp:    0.000000000
      Transmit Timestamp:   2349871944.725348397 (1974/06/19 22:12:24)
        Originator - Receive Timestamp:  0.000000000
        Originator - Transmit Timestamp: 2349871944.725348397 (1974/06/19 22:12:24)
13:50:05.840273 20:67:7c:19:67:78 > ac:16:2d:ab:e3:e4, ethertype IPv4 (0x0800), length 140: (tos 0x0, ttl 10, id 53841, offset 0, flags [none], proto UDP (17), length 126)
    10.71.231.42.54654 > 10.71.231.43.8472: [bad udp cksum 0xe35f -> 0xae97!] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:1e:2d:00:02 > 02:00:46:bf:00:03, ethertype IPv4 (0x0800), length 90: (tos 0xc0, ttl 60, id 36401, offset 0, flags [DF], proto UDP (17), length 76)
    10.25.28.26.123 > 10.28.21.93.37196: [udp sum ok] NTPv4, length 48
    Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 7 (128s), precision -23
    Root Delay: 0.082061, Root dispersion: 0.046310, Reference-ID: 58.176.194.96
      Reference Timestamp:  3901930443.431726703 (2023/08/25 13:34:03)
      Originator Timestamp: 2349871944.725348397 (1974/06/19 22:12:24)
      Receive Timestamp:    3901931405.840823480 (2023/08/25 13:50:05)
      Transmit Timestamp:   3901931405.840973115 (2023/08/25 13:50:05)
        Originator - Receive Timestamp:  +1552059461.115475082
        Originator - Transmit Timestamp: +1552059461.115624717
13:50:06.672705 20:67:7c:19:67:78 > 01:00:5e:00:0a:f6, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 10, id 38002, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.42.41486 > 239.0.10.246.8472: [bad udp cksum 0xebb3 -> 0x2505!] OTV, flags [I] (0x08), overlay 0, instance 2806
02:00:6e:3c:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.17.254 tell 10.28.17.30, length 28
13:50:07.696709 20:67:7c:19:67:78 > 01:00:5e:00:0a:f6, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 10, id 38718, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.42.41486 > 239.0.10.246.8472: [bad udp cksum 0xebb3 -> 0x2505!] OTV, flags [I] (0x08), overlay 0, instance 2806
02:00:6e:3c:00:01 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.17.254 tell 10.28.17.30, length 28
13:50:07.919105 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 232: (tos 0x0, ttl 10, id 58519, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.43.45383 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 182: (tos 0x0, ttl 64, id 3640, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.21.93.50574 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 140
    Facility daemon (3), Severity info (6)
    Msg: Aug 25 13:50:07 debian231 telegraf[857]: 2023-08-25T05:50:07Z W! [outputs.influxdb] Metric buffer overflow; 31 metrics have been dropped
    0x0000:  3c33 303e 4175 6720 3235 2031 333a 3530
    0x0010:  3a30 3720 6465 6269 616e 3233 3120 7465
    0x0020:  6c65 6772 6166 5b38 3537 5d3a 2032 3032
    0x0030:  332d 3038 2d32 3554 3035 3a35 303a 3037
    0x0040:  5a20 5721 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 4d65 7472 6963 2062
    0x0060:  7566 6665 7220 6f76 6572 666c 6f77 3b20
    0x0070:  3331 206d 6574 7269 6373 2068 6176 6520
    0x0080:  6265 656e 2064 726f 7070 6564
13:50:07.919144 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 232: (tos 0x0, ttl 10, id 58520, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.43.59843 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 182: (tos 0x0, ttl 64, id 9111, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.21.93.51379 > 10.26.0.17.51554: [udp sum ok] UDP, length 140
13:50:07.919256 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 339: (tos 0x0, ttl 10, id 58521, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.43.45383 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 289: (tos 0x0, ttl 64, id 3641, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.21.93.50574 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 247
    Facility daemon (3), Severity info (6)
    Msg: Aug 25 13:50:07 debian231 telegraf[857]: 2023-08-25T05:50:07Z E! [outputs.influxdb] When writing to [http://localhost:8086]: failed doing req: Post "http://localhost:8086/write?db=telegraf": dial tcp 127.0.0.1:8086: connect: connection refused
    0x0000:  3c33 303e 4175 6720 3235 2031 333a 3530
    0x0010:  3a30 3720 6465 6269 616e 3233 3120 7465
    0x0020:  6c65 6772 6166 5b38 3537 5d3a 2032 3032
    0x0030:  332d 3038 2d32 3554 3035 3a35 303a 3037
    0x0040:  5a20 4521 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 5768 656e 2077 7269
    0x0060:  7469 6e67 2074 6f20 5b68 7474 703a 2f2f
    0x0070:  6c6f 6361 6c68 6f73 743a 3830 3836 5d3a
    0x0080:  2066 6169 6c65 6420 646f 696e 6720 7265
    0x0090:  713a 2050 6f73 7420 2268 7474 703a 2f2f
    0x00a0:  6c6f 6361 6c68 6f73 743a 3830 3836 2f77
    0x00b0:  7269 7465 3f64 623d 7465 6c65 6772 6166
    0x00c0:  223a 2064 6961 6c20 7463 7020 3132 372e
    0x00d0:  302e 302e 313a 3830 3836 3a20 636f 6e6e
    0x00e0:  6563 743a 2063 6f6e 6e65 6374 696f 6e20
    0x00f0:  7265 6675 7365 64
13:50:07.919305 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 231: (tos 0x0, ttl 10, id 58522, offset 0, flags [none], proto UDP (17), length 217)
    10.71.231.43.45383 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 181: (tos 0x0, ttl 64, id 3642, offset 0, flags [DF], proto UDP (17), length 167)
    10.28.21.93.50574 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 139
    Facility daemon (3), Severity info (6)
    Msg: Aug 25 13:50:07 debian231 telegraf[857]: 2023-08-25T05:50:07Z E! [agent] Error writing to outputs.influxdb: could not write any address
    0x0000:  3c33 303e 4175 6720 3235 2031 333a 3530
    0x0010:  3a30 3720 6465 6269 616e 3233 3120 7465
    0x0020:  6c65 6772 6166 5b38 3537 5d3a 2032 3032
    0x0030:  332d 3038 2d32 3554 3035 3a35 303a 3037
    0x0040:  5a20 4521 205b 6167 656e 745d 2045 7272
    0x0050:  6f72 2077 7269 7469 6e67 2074 6f20 6f75
    0x0060:  7470 7574 732e 696e 666c 7578 6462 3a20
    0x0070:  636f 756c 6420 6e6f 7420 7772 6974 6520
    0x0080:  616e 7920 6164 6472 6573 73
13:50:07.919331 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 339: (tos 0x0, ttl 10, id 58523, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.43.59843 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 289: (tos 0x0, ttl 64, id 9112, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.21.93.51379 > 10.26.0.17.51554: [udp sum ok] UDP, length 247
13:50:07.919349 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 231: (tos 0x0, ttl 10, id 58524, offset 0, flags [none], proto UDP (17), length 217)
    10.71.231.43.59843 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 181: (tos 0x0, ttl 64, id 9113, offset 0, flags [DF], proto UDP (17), length 167)
    10.28.21.93.51379 > 10.26.0.17.51554: [udp sum ok] UDP, length 139
13:50:11.311207 ac:16:2d:ab:e3:e4 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 10, id 60371, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.43.36573 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:46:bf:00:03 > 02:00:1e:2d:00:02, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.21.254 tell 10.28.21.93, length 28
13:50:11.311483 20:67:7c:19:67:78 > ac:16:2d:ab:e3:e4, ethertype IPv4 (0x0800), length 92: (tos 0x0, ttl 10, id 58172, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.42.41486 > 10.71.231.43.8472: [bad udp cksum 0xe32f -> 0xe5e0!] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:1e:2d:00:02 > 02:00:46:bf:00:03, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 10.28.21.254 is-at 02:00:1e:2d:00:02, length 28
13:50:13.070991 f4:03:43:9b:a4:c8 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 232: (tos 0x0, ttl 10, id 61949, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.41.43183 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:3c:f3:00:04 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 182: (tos 0x0, ttl 64, id 7503, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.21.129.57996 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 140
    Facility daemon (3), Severity info (6)
    Msg: Aug 25 13:50:12 debian251 telegraf[872]: 2023-08-25T05:50:12Z W! [outputs.influxdb] Metric buffer overflow; 31 metrics have been dropped
    0x0000:  3c33 303e 4175 6720 3235 2031 333a 3530
    0x0010:  3a31 3220 6465 6269 616e 3235 3120 7465
    0x0020:  6c65 6772 6166 5b38 3732 5d3a 2032 3032
    0x0030:  332d 3038 2d32 3554 3035 3a35 303a 3132
    0x0040:  5a20 5721 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 4d65 7472 6963 2062
    0x0060:  7566 6665 7220 6f76 6572 666c 6f77 3b20
    0x0070:  3331 206d 6574 7269 6373 2068 6176 6520
    0x0080:  6265 656e 2064 726f 7070 6564
13:50:13.071006 f4:03:43:9b:a4:c8 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 232: (tos 0x0, ttl 10, id 61951, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.41.33434 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:3c:f3:00:04 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 182: (tos 0x0, ttl 64, id 16426, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.21.129.45903 > 10.26.0.17.51554: [udp sum ok] UDP, length 140
13:50:13.070991 f4:03:43:9b:a4:c8 > 20:67:7c:19:67:78, ethertype IPv4 (0x0800), length 339: (tos 0x0, ttl 10, id 61950, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.41.43183 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:3c:f3:00:04 > 02:00:1e:2d:00:02, ethertype IPv4 (0x0800), length 289: (tos 0x0, ttl 64, id 7504, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.21.129.57996 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 247
    Facility daemon (3), Severity info (6)

I think this is the key error:

    10.71.231.42.41486 > 10.71.231.43.8472: [bad udp cksum 0xe32f -> 0xe5e0!] OTV, flags [I] (0x08), overlay 0, instance 2884
02:00:1e:2d:00:02 > 02:00:46:bf:00:03, ethertype ARP (0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Reply 10.28.21.254 is-at 02:00:1e:2d:00:02, length 28

I only find some info on

Any ideas?
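
One way to judge whether the "bad udp cksum" lines matter is to count which sender they come from. Here is a minimal Python sketch that does this for tcpdump -vvv text; the function name is hypothetical and the sample lines are copied from the capture above. If every flagged packet is sent by the capturing host itself, a common explanation is transmit checksum offload (the checksum is filled in by the NIC after tcpdump has seen the packet). That is an assumption about this host, not something confirmed in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   10.71.231.42.54654 > 10.71.231.43.8472: [bad udp cksum 0xe35f -> 0xae97!] ...
BAD_CKSUM = re.compile(r"^\s*(\d+\.\d+\.\d+\.\d+)\.\d+ > \S+ \[bad udp cksum")


def bad_checksum_senders(capture_text: str) -> Counter:
    """Count 'bad udp cksum' packets per source IP address."""
    counts = Counter()
    for line in capture_text.splitlines():
        match = BAD_CKSUM.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """
    10.71.231.43.60487 > 10.71.231.42.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2884
    10.71.231.42.54654 > 10.71.231.43.8472: [bad udp cksum 0xe35f -> 0xae97!] OTV, flags [I] (0x08), overlay 0, instance 2884
    10.71.231.42.41486 > 10.71.231.43.8472: [bad udp cksum 0xe32f -> 0xe5e0!] OTV, flags [I] (0x08), overlay 0, instance 2884
"""
print(bad_checksum_senders(sample))
```

In the capture above, all flagged packets have source address 10.71.231.42 and source MAC 20:67:7c:19:67:78, while packets from 10.71.231.43 and 10.71.231.41 show "udp sum ok".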

xuanyuanaosheng commented 1 year ago

@weizhouapache: using vxlan2864 as an example:

# ip -d link show cloudbr0
7: cloudbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 20:67:7c:19:67:70 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.20:67:7c:19:67:70 designated_root 8000.20:67:7c:19:67:70 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer   65.71 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

# ip -d link show cloudbr1
6: cloudbr1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 20:67:7c:19:67:78 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.20:67:7c:19:67:78 designated_root 8000.20:67:7c:19:67:78 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer   60.14 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

# ip -d link show vxlan2864 
76: vxlan2864: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master brvx-2864 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether f6:01:07:83:28:a9 brd ff:ff:ff:ff:ff:ff promiscuity 1 minmtu 68 maxmtu 65535 
    vxlan id 2864 group 239.0.11.48 dev cloudbr1 srcport 0 0 dstport 8472 ttl 10 ageing 300 udpcsum noudp6zerocsumtx noudp6zerocsumrx 
    bridge_slave state forwarding priority 32 cost 100 hairpin off guard off root_block off fastleave off learning on flood on port_id 0x8001 port_no 0x1 designated_port 32769 designated_cost 0 designated_bridge 8000.f6:1:7:83:28:a9 designated_root 8000.f6:1:7:83:28:a9 hold_timer    0.00 message_age_timer    0.00 forward_delay_timer    0.00 topology_change_ack 0 config_pending 0 proxy_arp off proxy_arp_wifi off mcast_router 1 mcast_fast_leave off mcast_flood on bcast_flood on mcast_to_unicast off neigh_suppress off group_fwd_mask 0 group_fwd_mask_str 0x0 vlan_tunnel off isolated off addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535

# ip -d link show brvx-2864
77: brvx-2864: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether f6:01:07:83:28:a9 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 68 maxmtu 65535 
    bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.f6:1:7:83:28:a9 designated_root 8000.f6:1:7:83:28:a9 root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer    0.00 tcn_timer    0.00 topology_change_timer    0.00 gc_timer  200.93 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535    
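
The multicast groups that appear in this thread (vxlan id 2864 with group 239.0.11.48, and instance 2806 sent to 239.0.10.246 / 01:00:5e:00:0a:f6 in the earlier capture) are consistent with the VXLAN ID being packed into the last three octets of a 239.x.y.z address. Here is a minimal Python sketch of that pattern. It is inferred from the values in this thread, so treat the mapping as an assumption; the function names are hypothetical.

```python
def vni_to_group(vni: int) -> str:
    """Pack a VXLAN ID into 239.x.y.z (pattern inferred from this thread)."""
    return f"239.{(vni >> 16) & 0xFF}.{(vni >> 8) & 0xFF}.{vni & 0xFF}"


def group_to_mac(group: str) -> str:
    """Standard IPv4 multicast MAC: 01:00:5e plus the low 23 bits of the group."""
    octets = [int(part) for part in group.split(".")]
    return "01:00:5e:%02x:%02x:%02x" % (octets[1] & 0x7F, octets[2], octets[3])


for vni in (2864, 2806):
    group = vni_to_group(vni)
    print(vni, group, group_to_mac(group))
```

Knowing the expected group and destination MAC for a VXLAN ID makes it easier to filter a capture on the underlay bridge for one specific network.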

The vm on host: image

The network data path: VM (mtu 1500) --> vnet38 (mtu 1450) --> vxlan2864 (mtu 1450) --> brvx-2864 (mtu 1450) --> cloudbr1 (mtu 1500)

I captured traffic with tcpdump -i any -s0 -nn -vvv port 8472 -w test3.pacp and opened it in Wireshark.

I cannot find the VXLAN stream on cloudbr1.

CloudStack uses VXLAN in advanced networking mode. Is it necessary to configure the MTU values?

weizhouapache commented 1 year ago

@xuanyuanaosheng yes, please refer to https://docs.cloudstack.apache.org/en/latest/plugins/vxlan.html#important-note-on-mtu-size

xuanyuanaosheng commented 1 year ago

@weizhouapache Thanks for your reply. Does the MTU have to be configured on both the switches and the physical servers?

Our env: two blade servers uplinked to two Cisco Nexus 7700 core switches, with no MLAG/LACP/vPC between them. However, other blades are also connected to the core switches and share the uplink, so it is hard to change the MTU on the switches. That is why I need to check this.

weizhouapache commented 1 year ago

Ideally yes. But if you just want to test it, you can change the MTU of the interfaces in the VMs (e.g. VRs) to 1450 or even 1400.
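
The 1450 figure follows from the VXLAN encapsulation overhead over IPv4. Here is a small Python sketch of the arithmetic; the constant and function names are hypothetical. The same 50 bytes show up in the tcpdump output earlier in this thread, where an outer IP length of 126 wraps an inner IP length of 76.

```python
# Bytes added in front of the inner IP packet when VXLAN runs over IPv4:
# outer IPv4 header + outer UDP header + VXLAN header + inner Ethernet header.
VXLAN_OVERHEAD_IPV4 = 20 + 8 + 8 + 14  # = 50


def inner_mtu(underlay_mtu: int) -> int:
    """Largest guest MTU that fits without changing the underlay MTU."""
    return underlay_mtu - VXLAN_OVERHEAD_IPV4


def required_underlay_mtu(guest_mtu: int) -> int:
    """Underlay MTU needed to carry a given guest MTU unfragmented."""
    return guest_mtu + VXLAN_OVERHEAD_IPV4


print(inner_mtu(1500))              # guest MTU when the underlay stays at 1500
print(required_underlay_mtu(1500))  # underlay MTU needed for 1500-byte guests
```

So either the guests drop to 1450 (or lower) or the physical path, switches included, has to carry at least 1550 bytes.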

xuanyuanaosheng commented 1 year ago

@weizhouapache I have test it.

Now the vm on one blade enclosures can ping each other, But the vm on different blade enclosures can not ping each other.

image

image

I do not know where the problem is.

kiwiflyer commented 1 year ago

Linux native VXLAN uses multicast by default. You need to have an IP address on each of your host VXLAN subinterfaces that is within the same network.

Also, by default, Linux only configures enough IGMP memberships for 20 VXLAN networks.

Run this - echo 100 >/proc/sys/net/ipv4/igmp_max_memberships

You can make that change permanent by adding this to your sysctl.conf -

net.ipv4.igmp_max_memberships = 100

-Si
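
A rough way to check whether the limit is the bottleneck is to compare it with the number of VXLAN devices on the host. This is only a sketch: the helper names are invented for this example, it assumes one multicast group per VXLAN device, and the host joins a few other groups as well, so treat the comparison as approximate.

```shell
#!/bin/sh
# Sketch: compare the IGMP membership limit with the number of VXLAN
# devices currently defined on this host.

# memberships_sufficient LIMIT NEEDED -> exit status 0 if LIMIT covers NEEDED.
# (hypothetical helper name, used only for this example)
memberships_sufficient() {
    [ "$1" -ge "$2" ]
}

# report_igmp_headroom reads live values; it needs a Linux host with iproute2.
report_igmp_headroom() {
    limit=$(cat /proc/sys/net/ipv4/igmp_max_memberships)
    vxlans=$(ip -o link show type vxlan | wc -l)
    if memberships_sufficient "$limit" "$vxlans"; then
        echo "ok: limit=$limit vxlan_devices=$vxlans"
    else
        echo "too low: limit=$limit vxlan_devices=$vxlans"
    fi
}

# Call report_igmp_headroom on each KVM host to see the live numbers.
```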

xuanyuanaosheng commented 1 year ago

> Linux native VXLAN uses multicast by default. You need to have an IP address on each of your host VXLAN subinterfaces that is within the same network.

@kiwiflyer Yes, like: https://users.cloudstack.apache.narkive.com/0IkEWVAi/information-on-vxlan-implementations-and-other-guest-isolation-methods

I have configured the IP address for the host VXLAN subinterfaces on cloudbr1.

Using kvm003 as an example, the host network config is:

# cat ifcfg-eno49
TYPE=Ethernet
BOOTPROTO=none
NAME=eno49
UUID=0650d63c-0244-4852-b0aa-ca5d8a64d8cb
DEVICE=eno49
ONBOOT=yes

# cat ifcfg-eno49.2128
NAME=eno49.2128
DEVICE=eno49.2128
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
VLAN=yes
BRIDGE=cloudbr0

# cat ifcfg-cloudbr0
NAME=cloudbr0
DEVICE=cloudbr0
TYPE=Bridge
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.26.128.25
GATEWAY=10.26.128.254
NETMASK=255.255.255.0
HOTPLUG=no
DELAY=5
STP=no

# cat ifcfg-eno50
TYPE=Ethernet
BOOTPROTO=none
NAME=eno50
UUID=46da1a8f-615e-4649-be64-fc8e1c7dd264
DEVICE=eno50
ONBOOT=yes

# cat ifcfg-eno50.2230
NAME=eno50.2230
DEVICE=eno50.2230
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
VLAN=yes
BRIDGE=cloudbr1

# cat ifcfg-cloudbr1
NAME=cloudbr1
DEVICE=cloudbr1
TYPE=BRIDGE
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.71.231.41
NETMASK=255.255.255.0
IPV6INIT=no
IPV6_AUTOCONF=no
HOTPLUG=no
DELAY=5
STP=no


The hosts can ping each other using cloudbr1.

[root@kvm001 ~]# ping -I cloudbr1 10.71.231.41
PING 10.71.231.41 (10.71.231.41) from 10.71.231.42 cloudbr1: 56(84) bytes of data.
64 bytes from 10.71.231.41: icmp_seq=1 ttl=64 time=0.153 ms
64 bytes from 10.71.231.41: icmp_seq=2 ttl=64 time=0.147 ms
64 bytes from 10.71.231.41: icmp_seq=3 ttl=64 time=0.161 ms
^C
--- 10.71.231.41 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2048ms
rtt min/avg/max/mdev = 0.147/0.153/0.161/0.015 ms

[root@kvm002 ~]# ping -I cloudbr1 10.71.231.42
PING 10.71.231.42 (10.71.231.42) from 10.71.231.43 cloudbr1: 56(84) bytes of data.
64 bytes from 10.71.231.42: icmp_seq=1 ttl=64 time=0.245 ms
64 bytes from 10.71.231.42: icmp_seq=2 ttl=64 time=0.237 ms
64 bytes from 10.71.231.42: icmp_seq=3 ttl=64 time=0.244 ms
^C
--- 10.71.231.42 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2083ms
rtt min/avg/max/mdev = 0.237/0.242/0.245/0.003 ms


Is this network config correct?

----

> Also, by default, Linux only configures enough IGMP memberships for 20 VXLAN networks.
> 
> Run this - echo 100 >/proc/sys/net/ipv4/igmp_max_memberships
> 
> You can make that change permanent by adding this to your sysctl.conf -
> 
> net.ipv4.igmp_max_memberships = 100

@kiwiflyer This has already been done in our environment.

kiwiflyer commented 1 year ago

@xuanyuanaosheng Move your IPs to the VLAN interface that is encapsulating the VXLAN VIFs - eno50.2230

Also, make sure you have UDP 8472 open in iptables (i.e. iptables -I INPUT -p udp -m udp --dport 8472 -j ACCEPT) and that you're also allowing multicast across iptables.

xuanyuanaosheng commented 1 year ago

@kiwiflyer Thanks for your reply. I have opened UDP 8472 and allowed multicast in iptables:

iptables -I INPUT -p udp -m udp --dport 8472 -j ACCEPT

iptables -A INPUT   -s 224.0.0.0/4 -j ACCEPT
iptables -A FORWARD -s 224.0.0.0/4 -d 224.0.0.0/4 -j ACCEPT
iptables -A OUTPUT  -d 224.0.0.0/4 -j ACCEPT

iptables-save > /etc/sysconfig/iptables

The iptables rules on kvm001 are now:

# cat /etc/sysconfig/iptables
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*filter
:INPUT ACCEPT [296573:1195510000]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300735:351478871]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWX - [0:0]
-A INPUT -p udp -m udp --dport 8472 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49216 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
-A INPUT -j LIBVIRT_INP
-A INPUT -p tcp -m tcp --dport 16514 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 16509 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -s 224.0.0.0/4 -j ACCEPT
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A FORWARD -s 224.0.0.0/4 -d 224.0.0.0/4 -j ACCEPT
-A OUTPUT -j LIBVIRT_OUT
-A OUTPUT -d 224.0.0.0/4 -j ACCEPT
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*security
:INPUT ACCEPT [368423:2248850292]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300741:351481031]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*raw
:PREROUTING ACCEPT [371554:2249164792]
:OUTPUT ACCEPT [300741:351481031]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*mangle
:PREROUTING ACCEPT [371554:2249164792]
:INPUT ACCEPT [368423:2248850292]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300741:351481031]
:POSTROUTING ACCEPT [300780:351485545]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023

---------------------------------------------------------

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     udp  --  anywhere             anywhere             udp dpt:otv
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:49152:49216
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:rfb:synchronet-db
LIBVIRT_INP  all  --  anywhere             anywhere            
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16514
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16509
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh
ACCEPT     all  --  base-address.mcast.net/4  anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_FWX  all  --  anywhere             anywhere            
LIBVIRT_FWI  all  --  anywhere             anywhere            
LIBVIRT_FWO  all  --  anywhere             anywhere            
ACCEPT     all  --  base-address.mcast.net/4  base-address.mcast.net/4 

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_OUT  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             base-address.mcast.net/4 

Chain LIBVIRT_INP (1 references)
target     prot opt source               destination         

Chain LIBVIRT_OUT (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWO (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWI (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWX (1 references)
target     prot opt source               destination

The iptables rules on kvm002:

# cat /etc/sysconfig/iptables
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:30:39 2023
*filter
:INPUT ACCEPT [28:2372]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [126:14231]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWX - [0:0]
-A INPUT -p udp -m udp --dport 8472 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49216 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 16514 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 16509 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -j LIBVIRT_INP
-A INPUT -s 224.0.0.0/4 -j ACCEPT
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A FORWARD -s 224.0.0.0/4 -d 224.0.0.0/4 -j ACCEPT
-A OUTPUT -j LIBVIRT_OUT
-A OUTPUT -d 224.0.0.0/4 -j ACCEPT
COMMIT
# Completed on Mon Sep  4 12:30:39 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:30:39 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
COMMIT
# Completed on Mon Sep  4 12:30:39 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:30:39 2023
*mangle
:PREROUTING ACCEPT [424:32076]
:INPUT ACCEPT [165:11104]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [126:14231]
:POSTROUTING ACCEPT [126:14231]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
COMMIT
# Completed on Mon Sep  4 12:30:39 2023

--------------------------------------------------

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     udp  --  anywhere             anywhere             udp dpt:otv
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:49152:49216
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:rfb:synchronet-db
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16514
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16509
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh
LIBVIRT_INP  all  --  anywhere             anywhere            
ACCEPT     all  --  base-address.mcast.net/4  anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_FWX  all  --  anywhere             anywhere            
LIBVIRT_FWI  all  --  anywhere             anywhere            
LIBVIRT_FWO  all  --  anywhere             anywhere            
ACCEPT     all  --  base-address.mcast.net/4  base-address.mcast.net/4 

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_OUT  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             base-address.mcast.net/4 

Chain LIBVIRT_INP (1 references)
target     prot opt source               destination         

Chain LIBVIRT_OUT (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWO (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWI (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWX (1 references)
target     prot opt source               destination

The iptables rules on kvm003:

# cat /etc/sysconfig/iptables
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:20:39 2023
*filter
:INPUT ACCEPT [585300:1902558523]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [646602:2058025419]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWX - [0:0]
-A INPUT -p udp -m udp --dport 8472 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49216 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 16514 -j ACCEPT
-A INPUT -j LIBVIRT_INP
-A INPUT -p tcp -m tcp --dport 16509 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -s 224.0.0.0/4 -j ACCEPT
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A FORWARD -s 224.0.0.0/4 -d 224.0.0.0/4 -j ACCEPT
-A OUTPUT -j LIBVIRT_OUT
-A OUTPUT -d 224.0.0.0/4 -j ACCEPT
COMMIT
# Completed on Mon Sep  4 12:20:39 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:20:39 2023
*security
:INPUT ACCEPT [587544:1902749897]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [646615:2058026885]
COMMIT
# Completed on Mon Sep  4 12:20:39 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:20:39 2023
*raw
:PREROUTING ACCEPT [596122:1903462439]
:OUTPUT ACCEPT [646615:2058026885]
COMMIT
# Completed on Mon Sep  4 12:20:39 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:20:39 2023
*mangle
:PREROUTING ACCEPT [596122:1903462439]
:INPUT ACCEPT [587544:1902749897]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [646615:2058026885]
:POSTROUTING ACCEPT [646654:2058031388]
COMMIT
# Completed on Mon Sep  4 12:20:39 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:20:39 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:LIBVIRT_PRT - [0:0]
-A POSTROUTING -j LIBVIRT_PRT
COMMIT
# Completed on Mon Sep  4 12:20:39 2023

----------------------------------------------------------------
# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     udp  --  anywhere             anywhere             udp dpt:otv
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:49152:49216
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:rfb:synchronet-db
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16514
LIBVIRT_INP  all  --  anywhere             anywhere            
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16509
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh
ACCEPT     all  --  base-address.mcast.net/4  anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_FWX  all  --  anywhere             anywhere            
LIBVIRT_FWI  all  --  anywhere             anywhere            
LIBVIRT_FWO  all  --  anywhere             anywhere            
ACCEPT     all  --  base-address.mcast.net/4  base-address.mcast.net/4 

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_OUT  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             base-address.mcast.net/4 

Chain LIBVIRT_INP (1 references)
target     prot opt source               destination         

Chain LIBVIRT_OUT (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWO (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWI (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWX (1 references)
target     prot opt source               destination

> Move your IPs to the VLAN interface that is encapsulating the VXLAN VIFs - eno50.2230

Does this mean the following?

# cat ifcfg-eno50
TYPE=Ethernet
BOOTPROTO=none
NAME=eno50
UUID=46da1a8f-615e-4649-be64-fc8e1c7dd264
DEVICE=eno50
ONBOOT=yes

# cat ifcfg-eno50.2230 
NAME=eno50.2230
DEVICE=eno50.2230
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=static
IPADDR=10.71.231.41
NETMASK=255.255.255.0
VLAN=yes
BRIDGE=cloudbr1

# cat ifcfg-cloudbr1
NAME=cloudbr1
DEVICE=cloudbr1
TYPE=BRIDGE
ONBOOT=yes
BOOTPROTO=none
IPV6INIT=no
IPV6_AUTOCONF=no
HOTPLUG=no
DELAY=5
STP=no

I tested this config, but eno50.2230 does not get the IP address 10.71.231.41. Have I misunderstood?

xuanyuanaosheng commented 1 year ago

I found some logs in /var/log/messages:

Sep  4 12:43:11 kvm002 java[3170]: DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) (logid:6eec626a) nic=[Nic:Guest-10.28.22.112-vxlan://2841]
Sep  4 12:43:11 kvm002 java[3170]: DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) (logid:6eec626a) creating a vNet dev and bridge for guest traffic per traffic label cloudbr1
Sep  4 12:43:11 kvm002 java[3170]: DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) (logid:6eec626a) Executing: /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvxlan.sh -v 2841 -p cloudbr1 -b brvx-2841 -o add
Sep  4 12:43:11 kvm002 java[3170]: DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) (logid:6eec626a) Executing while with timeout : 1800000
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0250] manager: (vxlan2841): new Vxlan device (/org/freedesktop/NetworkManager/Devices/20)
Sep  4 12:43:11 kvm002 systemd-udevd[8034]: Using default interface naming scheme 'rhel-8.0'.
Sep  4 12:43:11 kvm002 systemd-udevd[8034]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep  4 12:43:11 kvm002 systemd-udevd[8034]: Could not generate persistent MAC address for vxlan2841: No such file or directory
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0312] manager: (brvx-2841): new Bridge device (/org/freedesktop/NetworkManager/Devices/21)
Sep  4 12:43:11 kvm002 systemd-udevd[8039]: Using default interface naming scheme 'rhel-8.0'.
Sep  4 12:43:11 kvm002 systemd-udevd[8039]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep  4 12:43:11 kvm002 systemd-udevd[8039]: Could not generate persistent MAC address for brvx-2841: No such file or directory
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 1(vxlan2841) entered blocking state
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 1(vxlan2841) entered disabled state
Sep  4 12:43:11 kvm002 kernel: device vxlan2841 entered promiscuous mode
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 1(vxlan2841) entered blocking state
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 1(vxlan2841) entered forwarding state
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0389] device (vxlan2841): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0395] device (brvx-2841): carrier: link connected
Sep  4 12:43:11 kvm002 java[3170]: DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) (logid:6eec626a) Execution is successful.
Sep  4 12:43:11 kvm002 java[3170]: DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-3:) (logid:6eec626a) multicast 239.0.11.25 for VNI 2841 on cloudbr1
Sep  4 12:43:11 kvm002 java[3170]: vxlan: destination port not specified
Sep  4 12:43:11 kvm002 java[3170]: Will use Linux kernel default (non-standard value)
Sep  4 12:43:11 kvm002 java[3170]: Use 'dstport 4789' to get the IANA assigned value
Sep  4 12:43:11 kvm002 java[3170]: Use 'dstport 0' to get default and quiet this message
Sep  4 12:43:11 kvm002 java[3170]: RTNETLINK answers: File exists
Sep  4 12:43:11 kvm002 java[3170]: DEBUG [resource.wrapper.LibvirtStartCommandWrapper] (agentRequest-Handler-3:) (logid:6eec626a) starting i-2-30985-VM: <domain type='kvm'>
Sep  4 12:43:11 kvm002 java[3170]: <name>i-2-30985-VM</name>
......
Sep  4 12:43:11 kvm002 java[3170]: libvirt: QEMU Driver error : Domain not found: no domain with matching name 'i-2-30985-VM'
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0407] device (brvx-2841): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0412] device (brvx-2841): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0417] device (brvx-2841): Activation: starting connection 'brvx-2841' (a09e56a4-685d-472a-abd1-8263356fa79f)
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0422] device (vxlan2841): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0425] device (brvx-2841): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0427] device (brvx-2841): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0429] device (brvx-2841): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0437] device (vxlan2841): Activation: starting connection 'vxlan2841' (fa8c5c20-c4b7-4dad-8ca3-c81e570e6820)
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0438] device (brvx-2841): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 dbus-daemon[2088]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service' requested by ':1.11' (uid=0 pid=2183 comm="/usr/sbin/NetworkManager --no-daemon ")
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0443] device (vxlan2841): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0446] device (vxlan2841): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0448] device (vxlan2841): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0449] device (brvx-2841): bridge port vxlan2841 was attached
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0449] device (vxlan2841): Activation: connection 'vxlan2841' enslaved, continuing activation
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0450] device (vxlan2841): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 systemd[1]: Starting Network Manager Script Dispatcher Service...
Sep  4 12:43:11 kvm002 dbus-daemon[2088]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Sep  4 12:43:11 kvm002 systemd[1]: Started Network Manager Script Dispatcher Service.
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0546] device (brvx-2841): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0548] device (brvx-2841): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0552] device (brvx-2841): Activation: successful, device activated.
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0554] device (vxlan2841): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0556] device (vxlan2841): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.0559] device (vxlan2841): Activation: successful, device activated.
Sep  4 12:43:11 kvm002 systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Sep  4 12:43:11 kvm002 systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 2(vnet3) entered blocking state
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 2(vnet3) entered disabled state
Sep  4 12:43:11 kvm002 kernel: device vnet3 entered promiscuous mode
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 2(vnet3) entered blocking state
Sep  4 12:43:11 kvm002 kernel: brvx-2841: port 2(vnet3) entered forwarding state
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1090] manager: (vnet3): new Tun device (/org/freedesktop/NetworkManager/Devices/22)
Sep  4 12:43:11 kvm002 systemd-udevd[8085]: Using default interface naming scheme 'rhel-8.0'.
Sep  4 12:43:11 kvm002 systemd-udevd[8085]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1160] device (vnet3): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1164] device (vnet3): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1168] device (vnet3): Activation: starting connection 'vnet3' (bbc77cc5-4d7f-4232-9a4f-3b2072c486c8)
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1170] device (vnet3): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1172] device (vnet3): state change: prepare -> config (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1173] device (vnet3): state change: config -> ip-config (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1174] device (brvx-2841): bridge port vnet3 was attached
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1174] device (vnet3): Activation: connection 'vnet3' enslaved, continuing activation
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1175] device (vnet3): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1182] device (vnet3): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1183] device (vnet3): state change: secondaries -> activated (reason 'none', sys-iface-state: 'external')
Sep  4 12:43:11 kvm002 NetworkManager[2183]: <info>  [1693802591.1186] device (vnet3): Activation: successful, device activated.
Sep  4 12:43:11 kvm002 systemd[1]: iscsi.service: Unit cannot be reloaded because it is inactive.
Sep  4 12:43:11 kvm002 journal[3041]: Domain id=4 name='i-2-30985-VM' uuid=d951b297-6fbf-441f-a0aa-57ba8470cf73 is tainted: high-privileges
Sep  4 12:43:11 kvm002 systemd-machined[2063]: New machine qemu-4-i-2-30985-VM.
Sep  4 12:43:11 kvm002 systemd[1]: Started Virtual Machine qemu-4-i-2-30985-VM.
Sep  4 12:43:11 kvm002 kvm[8154]: 3 guests now active
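
The log pairs VNI 2841 with multicast group 239.0.11.25. That pairing is what you get by writing the low 24 bits of the VNI into the last three octets of 239.x.y.z (2841 = 11 * 256 + 25). The sketch below reproduces the mapping so the expected group for another VNI can be worked out by hand. The function name is invented for this example, and whether modifyvxlan.sh uses exactly this formula is an assumption based only on the log line.

```shell
#!/bin/sh
# Sketch: derive a multicast group from a VXLAN network identifier (VNI)
# by placing the low 24 bits of the VNI into the last three octets.
# This matches the pairing seen in the log (VNI 2841 -> 239.0.11.25);
# the exact formula used by the agent script is an assumption.
vni_to_mcast_group() {
    echo "239.$(( ($1 >> 16) % 256 )).$(( ($1 >> 8) % 256 )).$(( $1 % 256 ))"
}

vni_to_mcast_group 2841   # prints 239.0.11.25
```

Every host carrying the same guest network has to join the same group on the interface that holds the underlay IP, otherwise broadcast and unknown-unicast frames such as ARP requests never reach the peer.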

@weizhouapache @kiwiflyer Could you please take a look?

kiwiflyer commented 1 year ago

@xuanyuanaosheng Can you assign the vlan directly to your physical ethernet interface, or to a bond?

Remove the bridge you have configured and give that a go.

xuanyuanaosheng commented 1 year ago

@kiwiflyer

Thanks for your reply.

I have followed your guidance and modified the network configuration to:

# cat ifcfg-eno1
TYPE=Ethernet
BOOTPROTO=none
NAME=eno1
UUID=a1420bd0-2cbe-45b4-b92e-7ba22aa148ef
DEVICE=eno1
ONBOOT=yes

# cat ifcfg-eno1.2128 
NAME=eno1.2128
DEVICE=eno1.2128
ONBOOT=yes
HOTPLUG=no
BOOTPROTO=none
VLAN=yes
BRIDGE=cloudbr0

# cat ifcfg-cloudbr0 
NAME=cloudbr0
DEVICE=cloudbr0
TYPE=BRIDGE
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.26.128.22
GATEWAY=10.26.128.254
NETMASK=255.255.255.0
HOTPLUG=no
DELAY=5
STP=no

-------------------------------------------------------------------------------------
# cat ifcfg-eno2
TYPE=Ethernet
BOOTPROTO=none
NAME=eno2
UUID=d8d48df8-95f5-43af-afc5-433fc81f322e
DEVICE=eno2
ONBOOT=yes

# cat ifcfg-eno2.2230 
NAME=eno2.2230
DEVICE=eno2.2230
ONBOOT=yes
HOTPLUG=no
VLAN=yes
BOOTPROTO=static
IPADDR=10.71.231.42
NETMASK=255.255.255.0

The hosts' network config is now:

                 |---------------- cloudbr0:  10.26.128.22 ( VLAN 2128)
    kvm001 ------
                 |---------------- eno2.2230:  10.71.231.42 ( VLAN 2230)

                 |---------------- cloudbr0:  10.26.128.23 ( VLAN 2128)
    kvm002 ------
                 |---------------- eno2.2230:  10.71.231.43 ( VLAN 2230)  

                 |---------------- cloudbr0:  10.26.128.25 ( VLAN 2128)
    kvm003 ------
                 |---------------- eno2.2230:  10.71.231.41 ( VLAN 2230)                   

The guest VXLAN network is now bound to a VLAN NIC, eno2.2230, and that VLAN interface is assigned a private IP (10.71.231.42, 10.71.231.41, 10.71.231.43) for multicast with the peer hosts.

The zone settings are: image

The hosts can ping each other over eno2.2230:

[root@kvm001 ~]# ping -I eno2.2230 10.71.231.41
PING 10.71.231.41 (10.71.231.41) from 10.71.231.42 eno2.2230: 56(84) bytes of data.
64 bytes from 10.71.231.41: icmp_seq=1 ttl=64 time=0.161 ms
64 bytes from 10.71.231.41: icmp_seq=2 ttl=64 time=0.177 ms
64 bytes from 10.71.231.41: icmp_seq=3 ttl=64 time=0.178 ms
^C
--- 10.71.231.41 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2027ms
rtt min/avg/max/mdev = 0.161/0.172/0.178/0.007 ms
[root@kvm001 ~]# ping -I eno2.2230 10.71.231.43
PING 10.71.231.43 (10.71.231.43) from 10.71.231.42 eno2.2230: 56(84) bytes of data.
64 bytes from 10.71.231.43: icmp_seq=1 ttl=64 time=0.239 ms
64 bytes from 10.71.231.43: icmp_seq=2 ttl=64 time=0.221 ms
^C
--- 10.71.231.43 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1025ms
rtt min/avg/max/mdev = 0.221/0.230/0.239/0.009 ms

The host iptables rules on kvm001 are now:

# cat /etc/sysconfig/iptables
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*filter
:INPUT ACCEPT [296573:1195510000]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300735:351478871]
:LIBVIRT_INP - [0:0]
:LIBVIRT_OUT - [0:0]
:LIBVIRT_FWO - [0:0]
:LIBVIRT_FWI - [0:0]
:LIBVIRT_FWX - [0:0]
-A INPUT -p udp -m udp --dport 8472 -j ACCEPT
-A INPUT -p udp -m udp --dport 4789 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 49152:49216 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 5900:6100 -j ACCEPT
-A INPUT -j LIBVIRT_INP
-A INPUT -p tcp -m tcp --dport 16514 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 16509 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -p tcp -m tcp --dport 1798 -j ACCEPT
-A INPUT -s 224.0.0.0/4 -j ACCEPT
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A FORWARD -s 224.0.0.0/4 -d 224.0.0.0/4 -j ACCEPT
-A OUTPUT -j LIBVIRT_OUT
-A OUTPUT -d 224.0.0.0/4 -j ACCEPT
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*security
:INPUT ACCEPT [368423:2248850292]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300741:351481031]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*raw
:PREROUTING ACCEPT [371554:2249164792]
:OUTPUT ACCEPT [300741:351481031]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*mangle
:PREROUTING ACCEPT [371554:2249164792]
:INPUT ACCEPT [368423:2248850292]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [300741:351481031]
:POSTROUTING ACCEPT [300780:351485545]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023
# Generated by iptables-save v1.8.4 on Mon Sep  4 12:19:22 2023
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
COMMIT
# Completed on Mon Sep  4 12:19:22 2023

-------------------------------------------------------------------------------------
# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:etp
ACCEPT     udp  --  anywhere             anywhere             udp dpt:vxlan
ACCEPT     udp  --  anywhere             anywhere             udp dpt:otv
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:49152:49216
ACCEPT     tcp  --  anywhere             anywhere             tcp dpts:rfb:synchronet-db
LIBVIRT_INP  all  --  anywhere             anywhere            
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16514
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16509
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh
ACCEPT     all  --  base-address.mcast.net/4  anywhere            

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_FWX  all  --  anywhere             anywhere            
LIBVIRT_FWI  all  --  anywhere             anywhere            
LIBVIRT_FWO  all  --  anywhere             anywhere            
ACCEPT     all  --  base-address.mcast.net/4  base-address.mcast.net/4 

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
LIBVIRT_OUT  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             base-address.mcast.net/4 

Chain LIBVIRT_INP (1 references)
target     prot opt source               destination         

Chain LIBVIRT_OUT (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWO (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWI (1 references)
target     prot opt source               destination         

Chain LIBVIRT_FWX (1 references)
target     prot opt source               destination

With this configuration, the situation is still the same as before.
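
A note for reading the `iptables -L` listing above: it prints service names instead of port numbers (`iptables -L -n` prints the numbers directly). The sketch below is not part of the original report. It translates the names that appear in the listing using a small hand-written table of IANA assignments; the table and the function name `resolve_dpt` are illustrative assumptions, and the table assumes the host's `/etc/services` matches IANA.

```python
import re

# Hand-copied subset of IANA service names that appear in the listing above.
# Assumption: the host's /etc/services agrees with these assignments.
SERVICE_PORTS = {
    "ssh": 22,
    "etp": 1798,
    "vxlan": 4789,
    "rfb": 5900,
    "synchronet-db": 6100,
    "otv": 8472,
}

def resolve_dpt(rule_line):
    """Return (proto, first_port, last_port) for an `iptables -L` rule line.

    Returns None when the line carries no `dpt:` / `dpts:` match.
    Raises KeyError for a service name missing from SERVICE_PORTS.
    """
    m = re.search(r"\b(tcp|udp) dpts?:(\S+)", rule_line)
    if m is None:
        return None
    proto, spec = m.group(1), m.group(2)
    ports = [int(p) if p.isdigit() else SERVICE_PORTS[p] for p in spec.split(":")]
    return proto, ports[0], ports[-1]

line = "ACCEPT     udp  --  anywhere             anywhere             udp dpt:otv"
print(resolve_dpt(line))  # ('udp', 8472, 8472)
```

Read this way, `udp dpt:otv` is UDP port 8472, which is the `dstport 8472` used by the vxlan2841 device described later in this comment.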


The network path between the two hosts:

kvm001 (eno2.2230, 10.71.231.42)   <=======>  kvm002 (eno2.2230, 10.71.231.43) 

The VM network data stream:

VM: ubuntu221 (10.28.22.112, i-2-30985-VM, 02:00:25:45:00:01)
--> vnet6 (fe:00:25:45:00:01)
--> brvx-2841 (86:a4:a4:8b:23:91)
--> vxlan2841 (vxlan://2841, mtu 1450, vxlan id 2841 group 239.0.11.25 dev eno2.2230 srcport 0 0 dstport 8472 ttl 10 ageing 300 udpcsum)
--> eno2.2230 (20:67:7c:19:67:78)
--> eno2
<---------------------------->
eno2
--> eno2.2230 (ac:16:2d:ab:e3:e4)
--> vxlan2841 (vxlan://2841, mtu 1450, vxlan id 2841 group 239.0.11.25 dev eno2.2230 srcport 0 0 dstport 8472 ttl 10 ageing 300 udpcsum)
--> brvx-2841 (36:ef:97:61:d5:c5)
--> vnet8 (fe:00:5c:64:00:03)
--> VM: ubuntu231 (10.28.22.19, i-2-30986-VM, 02:00:5c:64:00:03)
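
The `mtu 1450` on vxlan2841 is consistent with the usual VXLAN-over-IPv4 overhead on a 1500-byte underlay. A small worked check, assuming eno2.2230 has an MTU of 1500 (this is not stated in the report):

```python
# Per-packet overhead that VXLAN adds when carried over IPv4 without IP options.
OUTER_IPV4 = 20   # outer IPv4 header
OUTER_UDP = 8     # outer UDP header
VXLAN_HDR = 8     # VXLAN header carrying the 24-bit VNI
INNER_ETH = 14    # Ethernet header of the encapsulated tenant frame

def vxlan_mtu(underlay_mtu):
    """Largest inner IP packet that still fits in one underlay packet."""
    return underlay_mtu - (OUTER_IPV4 + OUTER_UDP + VXLAN_HDR + INNER_ETH)

print(vxlan_mtu(1500))  # 1450
```

ARP frames are far below this limit, so the MTU on its own would not explain ARP requests going unanswered.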

I ran tcpdump on each interface along this path on kvm001. The results are:

# tcpdump -i vnet6  -s0  -nn -vvv 
dropped privs to tcpdump
tcpdump: listening on vnet6, link-type EN10MB (Ethernet), capture size 262144 bytes
15:55:54.266144 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:55:55.319656 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:55:56.343753 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:04.266472 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:05.303760 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:06.327631 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel

--------------------------------------------------------------------------------------------

# tcpdump -i brvx-2841  -s0  -nn -vvv 
dropped privs to tcpdump
tcpdump: listening on brvx-2841, link-type EN10MB (Ethernet), capture size 262144 bytes
15:56:24.266160 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:25.271682 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:26.295698 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:34.266367 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:35.319565 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:36.343746 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:44.266254 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:45.303644 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:46.327559 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:54.266462 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:55.287691 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:56:56.311641 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
^C
12 packets captured
12 packets received by filter
0 packets dropped by kernel

-----------------------------------------------------------------------------------------------

# tcpdump -i vxlan2841  -s0  -nn -vvv 
dropped privs to tcpdump
tcpdump: listening on vxlan2841, link-type EN10MB (Ethernet), capture size 262144 bytes
15:58:04.266350 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:05.303652 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:06.327539 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:14.266350 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:15.287671 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:16.311607 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:24.266230 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:25.271531 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
15:58:26.295662 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
^C
9 packets captured
9 packets received by filter
0 packets dropped by kernel

-----------------------------------------------------------------------------------------------------

# tcpdump -i eno2.2230  -s0  -nn -vvv 
dropped privs to tcpdump
tcpdump: listening on eno2.2230, link-type EN10MB (Ethernet), capture size 262144 bytes
15:58:47.000764 IP (tos 0x0, ttl 10, id 50852, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe83!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 32142, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b6a!] UDP, length 8
15:58:47.020424 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.252.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
15:58:47.104169 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 18b6.00:de:fb:bb:23:41.8089, length 42
    message-age 0.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
    root-id 18b6.00:de:fb:bb:23:41, root-pathcost 0, port-role Designated
15:58:47.239307 IP (tos 0x0, ttl 10, id 51088, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0x7d25!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18808, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x14978): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:47.803784 f4:03:43:00:c3:2d > 01:14:c2:44:1e:cc SNAP, oui Unknown (0x0014c2), pid Unknown (0x0001), length 58: 
    0x0000:  aaaa 0300 14c2 0001 0000 0000 0000 0000  ................
    0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0040:  0000                                     ..
15:58:48.000925 IP (tos 0x0, ttl 10, id 51614, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe82!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 32264, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b69!] UDP, length 8
15:58:48.239568 IP (tos 0x0, ttl 10, id 51758, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0x8f98!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18809, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x14979): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:49.001019 IP (tos 0x0, ttl 10, id 52061, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe81!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 32503, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b68!] UDP, length 8
15:58:49.031369 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.253.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
15:58:49.111227 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 18b6.00:de:fb:bb:23:41.8089, length 42
    message-age 0.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
    root-id 18b6.00:de:fb:bb:23:41, root-pathcost 0, port-role Designated
15:58:49.239808 IP (tos 0x0, ttl 10, id 52161, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0xf7f0!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18810, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x1497a): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:49.934281 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.252.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
15:58:50.001091 IP (tos 0x0, ttl 10, id 52180, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe80!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 32739, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b67!] UDP, length 8
15:58:50.240065 IP (tos 0x0, ttl 10, id 52266, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0xc4c7!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18811, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x1497b): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:50.725440 IP (tos 0x0, ttl 10, id 52715, offset 0, flags [none], proto UDP (17), length 130)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd041 -> 0x0c23!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 32762, offset 0, flags [DF], proto UDP (17), length 80)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd7f -> 0x3961!] UDP, length 52
15:58:51.157149 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 18b6.00:de:fb:bb:23:41.8089, length 42
    message-age 0.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
    root-id 18b6.00:de:fb:bb:23:41, root-pathcost 0, port-role Designated
15:58:51.240308 IP (tos 0x0, ttl 10, id 53076, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0xa036!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18812, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x1497c): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:51.725634 IP (tos 0x0, ttl 10, id 53537, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe7e!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 32975, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b65!] UDP, length 8
15:58:51.963509 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.253.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
15:58:52.240553 IP (tos 0x0, ttl 10, id 53582, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0xc2b9!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18813, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x1497d): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:52.725596 IP (tos 0x0, ttl 10, id 54044, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe7d!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 33041, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b64!] UDP, length 8
15:58:52.803983 f4:03:43:00:c3:2d > 01:14:c2:44:1e:cc SNAP, oui Unknown (0x0014c2), pid Unknown (0x0001), length 58: 
    0x0000:  aaaa 0300 14c2 0001 0000 0000 0000 0000  ................
    0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0040:  0000                                     ..
15:58:52.863116 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.252.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
15:58:53.134722 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 18b6.00:de:fb:bb:23:41.8089, length 42
    message-age 0.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
    root-id 18b6.00:de:fb:bb:23:41, root-pathcost 0, port-role Designated
15:58:53.240808 IP (tos 0x0, ttl 10, id 54163, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0x155a!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18814, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x1497e): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:53.725750 IP (tos 0x0, ttl 10, id 54511, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe7c!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 33070, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.228.55670 > 225.0.0.50.3780: [bad udp cksum 0xfd53 -> 0x2b63!] UDP, length 8
15:58:54.218217 IP (tos 0x0, ttl 10, id 10739, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.42.54808 > 239.0.11.6.8472: [bad udp cksum 0xebc3 -> 0xc614!] OTV, flags [I] (0x08), overlay 0, instance 2822
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.21.254 tell 10.28.21.136, length 28
15:58:54.241047 IP (tos 0x0, ttl 10, id 54529, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.42.34893 > 239.0.10.246.8472: [bad udp cksum 0xebd7 -> 0x8552!] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 18815, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.228 > 224.0.0.18: AH(spi=0x0a1c11e4,sumlen=16,seq=0x1497f): vrrp 10.28.17.228 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
15:58:54.266456 IP (tos 0x0, ttl 10, id 39369, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.42.54808 > 239.0.11.25.8472: [bad udp cksum 0xebd6 -> 0x5574!] OTV, flags [I] (0x08), overlay 0, instance 2841
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.112, length 28
^C
29 packets captured
29 packets received by filter
0 packets dropped by kernel
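
A note on reading the eno2.2230 capture above: tcpdump decodes UDP port 8472 as OTV, and the `instance` value it prints corresponds to the VXLAN ID (2841 here, matching `vxlan id 2841`). The sketch below is not part of the original report. It groups the outer destination addresses by that value so they can be compared with the `group` configured on the vxlan device; the function name and regular expression are illustrative assumptions.

```python
import re

# Matches the outer header line tcpdump prints for UDP/8472 traffic, e.g.
# "10.71.231.42.54808 > 239.0.11.25.8472: [...] OTV, flags [I] (0x08), overlay 0, instance 2841"
OUTER = re.compile(
    r"(\d+\.\d+\.\d+\.\d+)\.\d+ > (\d+\.\d+\.\d+\.\d+)\.8472: .*instance (\d+)"
)

def outer_destinations(capture_text):
    """Map each VXLAN ID ("instance") to the set of outer destination IPs seen."""
    result = {}
    for line in capture_text.splitlines():
        m = OUTER.search(line)
        if m:
            _src, dst, vni = m.groups()
            result.setdefault(int(vni), set()).add(dst)
    return result

# Three outer header lines copied from the capture above.
sample = """\
    10.71.231.42.39295 > 239.0.10.246.8472: [bad udp cksum 0xd06d -> 0xfe83!] OTV, flags [I] (0x08), overlay 0, instance 2806
    10.71.231.42.54808 > 239.0.11.6.8472: [bad udp cksum 0xebc3 -> 0xc614!] OTV, flags [I] (0x08), overlay 0, instance 2822
    10.71.231.42.54808 > 239.0.11.25.8472: [bad udp cksum 0xebd6 -> 0x5574!] OTV, flags [I] (0x08), overlay 0, instance 2841
"""
print(outer_destinations(sample))
# {2806: {'239.0.10.246'}, 2822: {'239.0.11.6'}, 2841: {'239.0.11.25'}}
```

For VXLAN ID 2841 the ARP request leaves kvm001 addressed to 239.0.11.25, the group configured on vxlan2841. This only shows the frame leaving kvm001; it does not show whether the underlay delivers that group to kvm002.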

And the tcpdump results on the kvm002 interfaces:

# tcpdump -i vnet8 -s0  -nn -vvv
dropped privs to tcpdump
tcpdump: listening on vnet8, link-type EN10MB (Ethernet), capture size 262144 bytes
16:34:16.483522 IP (tos 0x0, ttl 63, id 59045, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.19: ICMP echo request, id 10, seq 1, length 64
16:34:16.483856 IP (tos 0x0, ttl 64, id 12666, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.19 > 10.28.21.93: ICMP echo reply, id 10, seq 1, length 64
16:34:17.485097 IP (tos 0x0, ttl 63, id 59455, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.19: ICMP echo request, id 10, seq 2, length 64
16:34:17.485414 IP (tos 0x0, ttl 64, id 12886, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.19 > 10.28.21.93: ICMP echo reply, id 10, seq 2, length 64
16:34:18.486798 IP (tos 0x0, ttl 63, id 60058, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.19: ICMP echo request, id 10, seq 3, length 64
16:34:18.487142 IP (tos 0x0, ttl 64, id 12935, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.19 > 10.28.21.93: ICMP echo reply, id 10, seq 3, length 64
^C
6 packets captured
6 packets received by filter
0 packets dropped by kernel
[root@whdckvm023 ~]# tcpdump -i vnet8 -s0  -nn -vvv
dropped privs to tcpdump
tcpdump: listening on vnet8, link-type EN10MB (Ethernet), capture size 262144 bytes
16:35:18.138149 IP (tos 0x0, ttl 64, id 10888, offset 0, flags [DF], proto UDP (17), length 76)
    10.28.22.19.43632 > 10.25.28.25.123: [bad udp cksum 0x46aa -> 0xedc7!] NTPv4, length 48
    Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 7 (128s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
      Reference Timestamp:  0.000000000
      Originator Timestamp: 0.000000000
      Receive Timestamp:    0.000000000
      Transmit Timestamp:   302219027.959372496 (2045/09/05 12:12:03)
        Originator - Receive Timestamp:  0.000000000
        Originator - Transmit Timestamp: 302219027.959372496 (2045/09/05 12:12:03)
16:35:18.139203 IP (tos 0xc0, ttl 60, id 16152, offset 0, flags [DF], proto UDP (17), length 76)
    10.25.28.25.123 > 10.28.22.19.43632: [udp sum ok] NTPv4, length 48
    Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 7 (128s), precision -24
    Root Delay: 0.077178, Root dispersion: 0.041107, Reference-ID: 58.176.194.96
      Reference Timestamp:  3902977303.263736875 (2023/09/06 16:21:43)
      Originator Timestamp: 302219027.959372496 (2045/09/05 12:12:03)
      Receive Timestamp:    3902978118.136953431 (2023/09/06 16:35:18)
      Transmit Timestamp:   3902978118.137003673 (2023/09/06 16:35:18)
        Originator - Receive Timestamp:  -694208205.822419065
        Originator - Transmit Timestamp: -694208205.822368822
16:35:23.342579 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.19 tell 10.28.22.254, length 28
16:35:23.342876 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.19 is-at 02:00:5c:64:00:03, length 28
16:35:23.598980 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.19, length 28
16:35:23.599291 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.254 is-at 02:00:48:ad:00:0d, length 28
16:35:53.372623 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:54.382540 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:55.406539 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:56.438136 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:57.454568 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:58.478553 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:59.510178 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:36:00.526579 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
^C
14 packets captured
14 packets received by filter
0 packets dropped by kernel
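
To summarise captures like the one above, the following sketch (not part of the original report) lists the `who-has` targets that never receive a `Reply` within the same capture. The function name and regular expressions are illustrative assumptions about tcpdump's `-vvv` ARP output as shown here.

```python
import re

ARP_REQ = re.compile(r"Request who-has (\S+) tell (\S+),")
ARP_REP = re.compile(r"Reply (\S+) is-at (\S+),")

def unanswered_arp(capture_text):
    """Return {target_ip: request_count} for who-has targets with no Reply in the capture."""
    requests = {}
    answered = set()
    for line in capture_text.splitlines():
        req = ARP_REQ.search(line)
        if req:
            target = req.group(1)
            requests[target] = requests.get(target, 0) + 1
            continue
        rep = ARP_REP.search(line)
        if rep:
            answered.add(rep.group(1))
    return {ip: n for ip, n in requests.items() if ip not in answered}

# Four lines copied from the vnet8 capture above.
sample = """\
16:35:23.342579 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.19 tell 10.28.22.254, length 28
16:35:23.342876 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.19 is-at 02:00:5c:64:00:03, length 28
16:35:53.372623 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:35:54.382540 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
"""
print(unanswered_arp(sample))  # {'10.28.22.112': 2}
```

On kvm002 the router (10.28.22.254) resolves 10.28.22.19 but gets no answer for 10.28.22.112, the VM on kvm001. ARP replies are unicast, so this check is only meaningful on an interface that lies on the path between the two endpoints.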

# tcpdump -i brvx-2841 -s0  -nn -vvv
dropped privs to tcpdump
tcpdump: listening on brvx-2841, link-type EN10MB (Ethernet), capture size 262144 bytes
16:37:53.889721 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:37:54.894437 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:37:55.918430 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.112 tell 10.28.22.254, length 28
16:38:19.595033 IP (tos 0x0, ttl 64, id 32230, offset 0, flags [DF], proto UDP (17), length 76)
    10.28.22.162.52808 > 10.25.28.25.123: [udp sum ok] NTPv4, length 48
    Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 8 (256s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
      Reference Timestamp:  0.000000000
      Originator Timestamp: 0.000000000
      Receive Timestamp:    0.000000000
      Transmit Timestamp:   211238355.308955056 (2042/10/18 11:47:31)
        Originator - Receive Timestamp:  0.000000000
        Originator - Transmit Timestamp: 211238355.308955056 (2042/10/18 11:47:31)
16:38:19.596123 IP (tos 0xc0, ttl 60, id 35783, offset 0, flags [DF], proto UDP (17), length 76)
    10.25.28.25.123 > 10.28.22.162.52808: [udp sum ok] NTPv4, length 48
    Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 8 (256s), precision -24
    Root Delay: 0.077178, Root dispersion: 0.043823, Reference-ID: 58.176.194.96
      Reference Timestamp:  3902977303.263736875 (2023/09/06 16:21:43)
      Originator Timestamp: 211238355.308955056 (2042/10/18 11:47:31)
      Receive Timestamp:    3902978299.593946819 (2023/09/06 16:38:19)
      Transmit Timestamp:   3902978299.594040768 (2023/09/06 16:38:19)
        Originator - Receive Timestamp:  -603227351.715008236
        Originator - Transmit Timestamp: -603227351.714914287
16:38:22.277076 IP (tos 0x0, ttl 63, id 54611, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.162: ICMP echo request, id 21, seq 1, length 64
16:38:22.277566 IP (tos 0x0, ttl 64, id 1593, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.162 > 10.28.21.93: ICMP echo reply, id 21, seq 1, length 64
16:38:22.579158 IP (tos 0x0, ttl 64, id 50569, offset 0, flags [DF], proto UDP (17), length 76)
    10.28.22.162.45475 > 10.25.28.26.123: [udp sum ok] NTPv4, length 48
    Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 9 (512s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
      Reference Timestamp:  0.000000000
      Originator Timestamp: 0.000000000
      Receive Timestamp:    0.000000000
      Transmit Timestamp:   1713798433.225742136 (2090/05/30 05:35:29)
        Originator - Receive Timestamp:  0.000000000
        Originator - Transmit Timestamp: 1713798433.225742136 (2090/05/30 05:35:29)
16:38:22.580535 IP (tos 0xc0, ttl 60, id 56414, offset 0, flags [DF], proto UDP (17), length 76)
    10.25.28.26.123 > 10.28.22.162.45475: [udp sum ok] NTPv4, length 48
    Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 9 (512s), precision -23
    Root Delay: 0.081512, Root dispersion: 0.101760, Reference-ID: 58.176.194.96
      Reference Timestamp:  3902973479.155017065 (2023/09/06 15:17:59)
      Originator Timestamp: 1713798433.225742136 (2090/05/30 05:35:29)
      Receive Timestamp:    3902978302.582223885 (2023/09/06 16:38:22)
      Transmit Timestamp:   3902978302.582322704 (2023/09/06 16:38:22)
        Originator - Receive Timestamp:  -2105787426.643518251
        Originator - Transmit Timestamp: -2105787426.643419431
16:38:23.279047 IP (tos 0x0, ttl 63, id 54628, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.162: ICMP echo request, id 21, seq 2, length 64
16:38:23.279552 IP (tos 0x0, ttl 64, id 2395, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.162 > 10.28.21.93: ICMP echo reply, id 21, seq 2, length 64
16:38:24.280965 IP (tos 0x0, ttl 63, id 54688, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.162: ICMP echo request, id 21, seq 3, length 64
16:38:24.281499 IP (tos 0x0, ttl 64, id 2906, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.162 > 10.28.21.93: ICMP echo reply, id 21, seq 3, length 64
16:38:24.846398 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.162 tell 10.28.22.254, length 28
16:38:24.846920 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.162 is-at 02:00:16:c7:00:04, length 28
16:38:25.057006 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.162, length 28
16:38:25.057425 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.254 is-at 02:00:48:ad:00:0d, length 28
16:38:25.282982 IP (tos 0x0, ttl 63, id 54729, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.162: ICMP echo request, id 21, seq 4, length 64
16:38:25.283430 IP (tos 0x0, ttl 64, id 3879, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.162 > 10.28.21.93: ICMP echo reply, id 21, seq 4, length 64
16:38:26.284118 IP (tos 0x0, ttl 63, id 54963, offset 0, flags [DF], proto ICMP (1), length 84)
    10.28.21.93 > 10.28.22.162: ICMP echo request, id 21, seq 5, length 64
16:38:26.284581 IP (tos 0x0, ttl 64, id 4790, offset 0, flags [none], proto ICMP (1), length 84)
    10.28.22.162 > 10.28.21.93: ICMP echo reply, id 21, seq 5, length 64
16:39:36.557935 IP (tos 0x0, ttl 64, id 33653, offset 0, flags [DF], proto UDP (17), length 76)
    10.28.22.19.39931 > 10.25.28.25.123: [bad udp cksum 0x46aa -> 0x47aa!] NTPv4, length 48
    Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 7 (128s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
      Reference Timestamp:  0.000000000
      Originator Timestamp: 0.000000000
      Receive Timestamp:    0.000000000
      Transmit Timestamp:   2098731277.972322213 (2102/08/11 11:22:53)
        Originator - Receive Timestamp:  0.000000000
        Originator - Transmit Timestamp: 2098731277.972322213 (2102/08/11 11:22:53)
16:39:36.558870 IP (tos 0xc0, ttl 60, id 16030, offset 0, flags [DF], proto UDP (17), length 76)
    10.25.28.25.123 > 10.28.22.19.39931: [udp sum ok] NTPv4, length 48
    Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 7 (128s), precision -24
    Root Delay: 0.077178, Root dispersion: 0.044982, Reference-ID: 58.176.194.96
      Reference Timestamp:  3902977303.263736875 (2023/09/06 16:21:43)
      Originator Timestamp: 2098731277.972322213 (2102/08/11 11:22:53)
      Receive Timestamp:    3902978376.556784587 (2023/09/06 16:39:36)
      Transmit Timestamp:   3902978376.556845303 (2023/09/06 16:39:36)
        Originator - Receive Timestamp:  +1804247098.584462374
        Originator - Transmit Timestamp: +1804247098.584523089
16:39:41.646316 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.19 tell 10.28.22.254, length 28
16:39:41.646617 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.19 is-at 02:00:5c:64:00:03, length 28
16:39:41.647085 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.19, length 28
16:39:41.647286 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.254 is-at 02:00:48:ad:00:0d, length 28

# tcpdump -i vxlan2841 -s0  -nn -vvv
dropped privs to tcpdump
tcpdump: listening on vxlan2841, link-type EN10MB (Ethernet), capture size 262144 bytes
16:42:36.464219 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.254 is-at 02:00:48:ad:00:0d, length 28
16:42:37.096209 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.254 is-at 02:00:48:ad:00:0d, length 28
16:42:39.905279 IP (tos 0x0, ttl 64, id 14670, offset 0, flags [DF], proto UDP (17), length 76)
    10.28.22.162.51744 > 10.25.28.25.123: [udp sum ok] NTPv4, length 48
    Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 8 (256s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
      Reference Timestamp:  0.000000000
      Originator Timestamp: 0.000000000
      Receive Timestamp:    0.000000000
      Transmit Timestamp:   2444026421.230117447 (1977/06/13 16:13:41)
        Originator - Receive Timestamp:  0.000000000
        Originator - Transmit Timestamp: 2444026421.230117447 (1977/06/13 16:13:41)
16:42:39.906768 IP (tos 0xc0, ttl 60, id 40370, offset 0, flags [DF], proto UDP (17), length 76)
    10.25.28.25.123 > 10.28.22.162.51744: [udp sum ok] NTPv4, length 48
    Server, Leap indicator:  (0), Stratum 2 (secondary reference), poll 8 (256s), precision -24
    Root Delay: 0.077178, Root dispersion: 0.047744, Reference-ID: 58.176.194.96
      Reference Timestamp:  3902977303.263736875 (2023/09/06 16:21:43)
      Originator Timestamp: 2444026421.230117447 (1977/06/13 16:13:41)
      Receive Timestamp:    3902978559.904500232 (2023/09/06 16:42:39)
      Transmit Timestamp:   3902978559.904608152 (2023/09/06 16:42:39)
        Originator - Receive Timestamp:  +1458952138.674382785
        Originator - Transmit Timestamp: +1458952138.674490704
16:42:44.942246 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.162 tell 10.28.22.254, length 28
16:42:44.943162 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.162 is-at 02:00:16:c7:00:04, length 28
16:42:45.152915 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.22.254 tell 10.28.22.162, length 28
16:42:45.153300 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.22.254 is-at 02:00:48:ad:00:0d, length 28

# tcpdump -i eno2.2230 -s0  -nn -vvv
dropped privs to tcpdump
tcpdump: listening on eno2.2230, link-type EN10MB (Ethernet), capture size 262144 bytes
16:44:05.401408 IP (tos 0x0, ttl 10, id 58202, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.41.42218 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2822
IP (tos 0x0, ttl 64, id 36442, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.21.129.54588 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 140
    Facility daemon (3), Severity info (6)
    Msg: Sep  6 16:44:05 debian251 telegraf[876]: 2023-09-06T08:44:05Z W! [outputs.influxdb] Metric buffer overflow; 31 metrics have been dropped
    0x0000:  3c33 303e 5365 7020 2036 2031 363a 3434
    0x0010:  3a30 3520 6465 6269 616e 3235 3120 7465
    0x0020:  6c65 6772 6166 5b38 3736 5d3a 2032 3032
    0x0030:  332d 3039 2d30 3654 3038 3a34 343a 3035
    0x0040:  5a20 5721 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 4d65 7472 6963 2062
    0x0060:  7566 6665 7220 6f76 6572 666c 6f77 3b20
    0x0070:  3331 206d 6574 7269 6373 2068 6176 6520
    0x0080:  6265 656e 2064 726f 7070 6564
16:44:05.401463 IP (tos 0x0, ttl 10, id 58203, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.41.58684 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2822
IP (tos 0x0, ttl 64, id 29245, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.21.129.37321 > 10.26.0.17.51554: [udp sum ok] UDP, length 140
16:44:05.401473 IP (tos 0x0, ttl 10, id 58204, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.41.42218 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2822
IP (tos 0x0, ttl 64, id 36443, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.21.129.54588 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 247
    Facility daemon (3), Severity info (6)
    Msg: Sep  6 16:44:05 debian251 telegraf[876]: 2023-09-06T08:44:05Z E! [outputs.influxdb] When writing to [http://localhost:8086]: failed doing req: Post "http://localhost:8086/write?db=telegraf": dial tcp 127.0.0.1:8086: connect: connection refused
    0x0000:  3c33 303e 5365 7020 2036 2031 363a 3434
    0x0010:  3a30 3520 6465 6269 616e 3235 3120 7465
    0x0020:  6c65 6772 6166 5b38 3736 5d3a 2032 3032
    0x0030:  332d 3039 2d30 3654 3038 3a34 343a 3035
    0x0040:  5a20 4521 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 5768 656e 2077 7269
    0x0060:  7469 6e67 2074 6f20 5b68 7474 703a 2f2f
    0x0070:  6c6f 6361 6c68 6f73 743a 3830 3836 5d3a
    0x0080:  2066 6169 6c65 6420 646f 696e 6720 7265
    0x0090:  713a 2050 6f73 7420 2268 7474 703a 2f2f
    0x00a0:  6c6f 6361 6c68 6f73 743a 3830 3836 2f77
    0x00b0:  7269 7465 3f64 623d 7465 6c65 6772 6166
    0x00c0:  223a 2064 6961 6c20 7463 7020 3132 372e
    0x00d0:  302e 302e 313a 3830 3836 3a20 636f 6e6e
    0x00e0:  6563 743a 2063 6f6e 6e65 6374 696f 6e20
    0x00f0:  7265 6675 7365 64
16:44:05.401480 IP (tos 0x0, ttl 10, id 58205, offset 0, flags [none], proto UDP (17), length 217)
    10.71.231.41.42218 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2822
IP (tos 0x0, ttl 64, id 36444, offset 0, flags [DF], proto UDP (17), length 167)
    10.28.21.129.54588 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 139
    Facility daemon (3), Severity info (6)
    Msg: Sep  6 16:44:05 debian251 telegraf[876]: 2023-09-06T08:44:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
    0x0000:  3c33 303e 5365 7020 2036 2031 363a 3434
    0x0010:  3a30 3520 6465 6269 616e 3235 3120 7465
    0x0020:  6c65 6772 6166 5b38 3736 5d3a 2032 3032
    0x0030:  332d 3039 2d30 3654 3038 3a34 343a 3035
    0x0040:  5a20 4521 205b 6167 656e 745d 2045 7272
    0x0050:  6f72 2077 7269 7469 6e67 2074 6f20 6f75
    0x0060:  7470 7574 732e 696e 666c 7578 6462 3a20
    0x0070:  636f 756c 6420 6e6f 7420 7772 6974 6520
    0x0080:  616e 7920 6164 6472 6573 73
16:44:05.401487 IP (tos 0x0, ttl 10, id 58206, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.41.58684 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2822
IP (tos 0x0, ttl 64, id 29246, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.21.129.37321 > 10.26.0.17.51554: [udp sum ok] UDP, length 247
16:44:05.401494 IP (tos 0x0, ttl 10, id 58207, offset 0, flags [none], proto UDP (17), length 217)
    10.71.231.41.58684 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2822
IP (tos 0x0, ttl 64, id 29247, offset 0, flags [DF], proto UDP (17), length 167)
    10.28.21.129.37321 > 10.26.0.17.51554: [udp sum ok] UDP, length 139
16:44:05.441905 IP (tos 0x0, ttl 10, id 52761, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.43.59462 > 10.71.231.41.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 64, id 27703, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.17.112.47487 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 140
    Facility daemon (3), Severity info (6)
    Msg: Sep  6 16:44:05 tecent231 telegraf[865]: 2023-09-06T08:44:05Z W! [outputs.influxdb] Metric buffer overflow; 31 metrics have been dropped
    0x0000:  3c33 303e 5365 7020 2036 2031 363a 3434
    0x0010:  3a30 3520 7465 6365 6e74 3233 3120 7465
    0x0020:  6c65 6772 6166 5b38 3635 5d3a 2032 3032
    0x0030:  332d 3039 2d30 3654 3038 3a34 343a 3035
    0x0040:  5a20 5721 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 4d65 7472 6963 2062
    0x0060:  7566 6665 7220 6f76 6572 666c 6f77 3b20
    0x0070:  3331 206d 6574 7269 6373 2068 6176 6520
    0x0080:  6265 656e 2064 726f 7070 6564
16:44:05.441990 IP (tos 0x0, ttl 10, id 52762, offset 0, flags [none], proto UDP (17), length 218)
    10.71.231.43.57866 > 10.71.231.41.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 64, id 64966, offset 0, flags [DF], proto UDP (17), length 168)
    10.28.17.112.51029 > 10.26.0.17.51554: [udp sum ok] UDP, length 140
16:44:05.442093 IP (tos 0x0, ttl 10, id 52763, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.43.59462 > 10.71.231.41.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 64, id 27704, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.17.112.47487 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 247
    Facility daemon (3), Severity info (6)
    Msg: Sep  6 16:44:05 tecent231 telegraf[865]: 2023-09-06T08:44:05Z E! [outputs.influxdb] When writing to [http://localhost:8086]: failed doing req: Post "http://localhost:8086/write?db=telegraf": dial tcp 127.0.0.1:8086: connect: connection refused
    0x0000:  3c33 303e 5365 7020 2036 2031 363a 3434
    0x0010:  3a30 3520 7465 6365 6e74 3233 3120 7465
    0x0020:  6c65 6772 6166 5b38 3635 5d3a 2032 3032
    0x0030:  332d 3039 2d30 3654 3038 3a34 343a 3035
    0x0040:  5a20 4521 205b 6f75 7470 7574 732e 696e
    0x0050:  666c 7578 6462 5d20 5768 656e 2077 7269
    0x0060:  7469 6e67 2074 6f20 5b68 7474 703a 2f2f
    0x0070:  6c6f 6361 6c68 6f73 743a 3830 3836 5d3a
    0x0080:  2066 6169 6c65 6420 646f 696e 6720 7265
    0x0090:  713a 2050 6f73 7420 2268 7474 703a 2f2f
    0x00a0:  6c6f 6361 6c68 6f73 743a 3830 3836 2f77
    0x00b0:  7269 7465 3f64 623d 7465 6c65 6772 6166
    0x00c0:  223a 2064 6961 6c20 7463 7020 3132 372e
    0x00d0:  302e 302e 313a 3830 3836 3a20 636f 6e6e
    0x00e0:  6563 743a 2063 6f6e 6e65 6374 696f 6e20
    0x00f0:  7265 6675 7365 64
16:44:05.442117 IP (tos 0x0, ttl 10, id 52764, offset 0, flags [none], proto UDP (17), length 217)
    10.71.231.43.59462 > 10.71.231.41.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 64, id 27705, offset 0, flags [DF], proto UDP (17), length 167)
    10.28.17.112.47487 > 10.25.9.129.514: [udp sum ok] SYSLOG, length: 139
    Facility daemon (3), Severity info (6)
    Msg: Sep  6 16:44:05 tecent231 telegraf[865]: 2023-09-06T08:44:05Z E! [agent] Error writing to outputs.influxdb: could not write any address
    0x0000:  3c33 303e 5365 7020 2036 2031 363a 3434
    0x0010:  3a30 3520 7465 6365 6e74 3233 3120 7465
    0x0020:  6c65 6772 6166 5b38 3635 5d3a 2032 3032
    0x0030:  332d 3039 2d30 3654 3038 3a34 343a 3035
    0x0040:  5a20 4521 205b 6167 656e 745d 2045 7272
    0x0050:  6f72 2077 7269 7469 6e67 2074 6f20 6f75
    0x0060:  7470 7574 732e 696e 666c 7578 6462 3a20
    0x0070:  636f 756c 6420 6e6f 7420 7772 6974 6520
    0x0080:  616e 7920 6164 6472 6573 73
16:44:05.442154 IP (tos 0x0, ttl 10, id 52765, offset 0, flags [none], proto UDP (17), length 325)
    10.71.231.43.57866 > 10.71.231.41.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 64, id 64967, offset 0, flags [DF], proto UDP (17), length 275)
    10.28.17.112.51029 > 10.26.0.17.51554: [udp sum ok] UDP, length 247
16:44:05.442169 IP (tos 0x0, ttl 10, id 52766, offset 0, flags [none], proto UDP (17), length 217)
    10.71.231.43.57866 > 10.71.231.41.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 64, id 64968, offset 0, flags [DF], proto UDP (17), length 167)
    10.28.17.112.51029 > 10.26.0.17.51554: [udp sum ok] UDP, length 139
16:44:05.706511 IP (tos 0x0, ttl 10, id 19970, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.41.39171 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 21784, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.43 > 224.0.0.18: AH(spi=0x0a1c112b,sumlen=16,seq=0x15518): vrrp 10.28.17.43 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
16:44:05.986196 2c:27:d7:16:38:59 > 01:14:c2:44:1e:cc SNAP, oui Unknown (0x0014c2), pid Unknown (0x0001), length 58: 
    0x0000:  aaaa 0300 14c2 0001 0000 0000 0000 0000  ................
    0x0010:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0020:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0030:  0000 0000 0000 0000 0000 0000 0000 0000  ................
    0x0040:  0000                                     ..
16:44:06.123473 IP (tos 0x0, ttl 10, id 20101, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.41.46358 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 43471, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.43.48787 > 225.0.0.50.3780: [udp sum ok] UDP, length 8
16:44:06.328271 IP (tos 0x0, ttl 10, id 59058, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.41.50771 > 10.71.231.43.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.28.17.112 tell 10.28.17.43, length 28
16:44:06.328705 IP (tos 0x0, ttl 10, id 53202, offset 0, flags [none], proto UDP (17), length 78)
    10.71.231.43.34259 > 10.71.231.41.8472: [bad udp cksum 0xe32e -> 0x1def!] OTV, flags [I] (0x08), overlay 0, instance 2806
ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.28.17.112 is-at 02:00:57:59:00:04, length 28
16:44:06.347588 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.252.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
16:44:06.514486 IP (tos 0x0, ttl 10, id 20241, offset 0, flags [none], proto UDP (17), length 138)
    10.71.231.41.46358 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 43524, offset 0, flags [DF], proto UDP (17), length 88)
    10.28.17.43.48787 > 225.0.0.50.3780: [udp sum ok] UDP, length 60
16:44:06.514842 IP (tos 0x0, ttl 10, id 20242, offset 0, flags [none], proto UDP (17), length 138)
    10.71.231.41.46358 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 43525, offset 0, flags [DF], proto UDP (17), length 88)
    10.28.17.43.48787 > 225.0.0.50.3780: [udp sum ok] UDP, length 60
16:44:06.536383 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 28b6.00:de:fb:bb:15:c1.80a1, length 42
    message-age 1.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
    root-id 18b6.00:de:fb:bb:23:41, root-pathcost 1, port-role Designated
16:44:06.706761 IP (tos 0x0, ttl 10, id 20369, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.41.39171 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 21785, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.43 > 224.0.0.18: AH(spi=0x0a1c112b,sumlen=16,seq=0x15519): vrrp 10.28.17.43 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
16:44:07.515059 IP (tos 0x0, ttl 10, id 20833, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.41.46358 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 43762, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.43.48787 > 225.0.0.50.3780: [udp sum ok] UDP, length 8
16:44:07.544605 IP (tos 0xc0, ttl 255, id 0, offset 0, flags [none], proto UDP (17), length 80)
    10.71.231.253.1985 > 224.0.0.102.1985: [udp sum ok] HSRPv1
16:44:07.706989 IP (tos 0x0, ttl 10, id 20946, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.41.39171 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 21786, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.43 > 224.0.0.18: AH(spi=0x0a1c112b,sumlen=16,seq=0x1551a): vrrp 10.28.17.43 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
16:44:08.515227 IP (tos 0x0, ttl 10, id 21504, offset 0, flags [none], proto UDP (17), length 86)
    10.71.231.41.46358 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0x0, ttl 1, id 43921, offset 0, flags [DF], proto UDP (17), length 36)
    10.28.17.43.48787 > 225.0.0.50.3780: [udp sum ok] UDP, length 8
16:44:08.561422 STP 802.1w, Rapid STP, Flags [Learn, Forward], bridge-id 28b6.00:de:fb:bb:15:c1.80a1, length 42
    message-age 1.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
    root-id 18b6.00:de:fb:bb:23:41, root-pathcost 1, port-role Designated
16:44:08.707216 IP (tos 0x0, ttl 10, id 21664, offset 0, flags [none], proto UDP (17), length 114)
    10.71.231.41.39171 > 239.0.10.246.8472: [udp sum ok] OTV, flags [I] (0x08), overlay 0, instance 2806
IP (tos 0xc0, ttl 255, id 21787, offset 0, flags [none], proto AH (51), length 64)
    10.28.17.43 > 224.0.0.18: AH(spi=0x0a1c112b,sumlen=16,seq=0x1551b): vrrp 10.28.17.43 > 224.0.0.18: VRRPv2, Advertisement, vrid 51, prio 100, authtype ah, intvl 1s, length 20, addrs: 10.28.17.254
^C
28 packets captured
28 packets received by filter
0 packets dropped by kernel

Any ideas?

kiwiflyer commented 1 year ago

Do you have PIM enabled on your switches? If so, disable it and check for any other multicast configuration.

xuanyuanaosheng commented 1 year ago

@kiwiflyer Thanks for your reply.

This was indeed a multicast configuration issue. We disabled IGMP snooping on the switches, and it works now.

I will close this issue.

xuanyuanaosheng commented 1 year ago

Our network team said: By default, Cisco switches only forward multicast traffic in the 224.0.0.0/24 subnet.
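
To relate that statement to the capture, here is a minimal Python sketch that checks which multicast groups seen in the tcpdump output fall inside 224.0.0.0/24. The group addresses are copied from the capture. The `classify` helper and the descriptive labels are illustrative assumptions, not part of CloudStack or of the switch configuration.

```python
import ipaddress

# The range mentioned in the comment above (link-local multicast block).
LINK_LOCAL_MULTICAST = ipaddress.ip_network("224.0.0.0/24")

# Group addresses copied from the tcpdump capture in this thread.
# The labels are only descriptive notes based on what tcpdump printed.
GROUPS = {
    "224.0.0.18": "VRRP advertisements (inner packet)",
    "224.0.0.102": "HSRP hellos from the physical network",
    "225.0.0.50": "inner UDP traffic to port 3780",
    "239.0.10.246": "outer group of the UDP 8472 encapsulation",
}


def classify(group: str) -> str:
    """Return where an address sits relative to 224.0.0.0/24 (hypothetical helper)."""
    addr = ipaddress.ip_address(group)
    if not addr.is_multicast:
        return "not multicast"
    if addr in LINK_LOCAL_MULTICAST:
        return "inside 224.0.0.0/24"
    return "outside 224.0.0.0/24"


for group, note in GROUPS.items():
    print(f"{group:<14} {classify(group):<22} {note}")
```

If the switch behaviour is as described, the outer group 239.0.10.246 lies outside 224.0.0.0/24. That would be consistent with the encapsulated VRRP advertisements not reaching the other host until IGMP snooping was disabled.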