kubeovn / kube-ovn

A Bridge between SDN and Cloud Native (Project under CNCF)
https://kubeovn.github.io/docs/stable/en/
Apache License 2.0

1.11.3 underlay mode installation fails, pods cannot communicate normally #2803

Closed Git4Mark closed 1 year ago

Git4Mark commented 1 year ago

Expected Behavior

kube-ovn underlay mode works normally.

Actual Behavior

In kube-ovn underlay mode, pods cannot reach services, and pods cannot reach any host other than the one they run on.

Steps to Reproduce the Problem

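1. Install k8s with kubeadm: 3 masters, 2 workers, high availability via keepalived + haproxy.
2. Install kube-ovn 1.11.3 in underlay mode. The installation hangs at the coredns re-creation step and exits after a long timeout.
3. The coredns logs show that pods cannot reach services. Network tests show that pods can reach neither service IPs nor host IPs.
4. Uninstall kube-ovn and reinstall it in overlay mode: the network works normally.
5. Uninstall kube-ovn (official script + etcd cleanup) and reinstall it in underlay mode: pods cannot reach services or host IPs, and hosts cannot reach pods on other nodes.
6. A packet capture shows that pod traffic is Geneve-encapsulated (same result after several uninstall/reinstall cycles).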

Additional Info

The hosts are 5 virtual machines created in vCenter.

OVN installation parameters:

set -euo pipefail

IPV6=${IPV6:-false}
DUAL_STACK=${DUAL_STACK:-false}
ENABLE_SSL=${ENABLE_SSL:-false}
ENABLE_VLAN=${ENABLE_VLAN:-true}
CHECK_GATEWAY=${CHECK_GATEWAY:-true}
LOGICAL_GATEWAY=${LOGICAL_GATEWAY:-true}
U2O_INTERCONNECTION=${U2O_INTERCONNECTION:-false}
ENABLE_MIRROR=${ENABLE_MIRROR:-true}
VLAN_NIC=${VLAN_NIC:-myeth0}
HW_OFFLOAD=${HW_OFFLOAD:-false}
ENABLE_LB=${ENABLE_LB:-true}
ENABLE_NP=${ENABLE_NP:-true}
ENABLE_EIP_SNAT=${ENABLE_EIP_SNAT:-false}
LS_DNAT_MOD_DL_DST=${LS_DNAT_MOD_DL_DST:-true}
ENABLE_EXTERNAL_VPC=${ENABLE_EXTERNAL_VPC:-true}
CNI_CONFIG_PRIORITY=${CNI_CONFIG_PRIORITY:-01}
ENABLE_LB_SVC=${ENABLE_LB_SVC:-false}
ENABLE_KEEP_VM_IP=${ENABLE_KEEP_VM_IP:-true}
# exchange link names of OVS bridge and the provider nic
# in the default provider-network
EXCHANGE_LINK_NAME=${EXCHANGE_LINK_NAME:-false}
# The nic to support container network can be a nic name or a group of regex
# separated by comma, if empty will use the nic that the default route use
IFACE=${IFACE:-}
# Specifies the name of the dpdk tunnel iface.
# Note that the dpdk tunnel iface and tunnel ip cidr should be different from the Kubernetes api cidr, otherwise routing will be a problem.
DPDK_TUNNEL_IFACE=${DPDK_TUNNEL_IFACE:-br-phy}
ENABLE_BIND_LOCAL_IP=${ENABLE_BIND_LOCAL_IP:-true}

CNI_CONF_DIR="/etc/cni/net.d"
CNI_BIN_DIR="/opt/cni/bin"

REGISTRY="harbor.comstar:8443/ccc/kubeovn"
VERSION="v1.11.1"
IMAGE_PULL_POLICY="IfNotPresent"
POD_CIDR="10.50.0.0/18"                # Do NOT overlap with NODE/SVC/JOIN CIDR
POD_GATEWAY="10.50.0.1"
SVC_CIDR="10.50.128.0/18"                # Do NOT overlap with NODE/POD/JOIN CIDR
JOIN_CIDR="10.50.64.0/18"              # Do NOT overlap with NODE/POD/SVC CIDR
PINGER_EXTERNAL_ADDRESS="114.114.114.114"  # Pinger check external ip probe
PINGER_EXTERNAL_DOMAIN="alauda.cn"         # Pinger check external domain probe
SVC_YAML_IPFAMILYPOLICY=""

EXCLUDE_IPS=""                                    # EXCLUDE_IPS for default subnet
LABEL="node-role.kubernetes.io/control-plane"     # The node label to deploy OVN DB
DEPRECATED_LABEL="node-role.kubernetes.io/master" # The node label to deploy OVN DB in earlier versions
NETWORK_TYPE="geneve"
TUNNEL_TYPE="geneve"                              # geneve, vxlan or stt. ATTENTION: some network policies do not take effect with vxlan, and stt requires a custom-compiled OVS kernel module
POD_NIC_TYPE="veth-pair"                          # veth-pair or internal-port
POD_DEFAULT_FIP_TYPE=""                           # iptables, pod can set iptables fip automatically by enable fip annotation

# VLAN Config only take effect when NETWORK_TYPE is vlan
PROVIDER_NAME="provider"
VLAN_INTERFACE_NAME=""
VLAN_NAME="ovn-vlan"
VLAN_ID="0"
zhangzujian commented 1 year ago

In the script, use VLAN_INTERFACE_NAME to specify which node NIC to use.
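
As a rough illustration only, the check below sketches how the two settings relevant to this advice could be sanity-checked before running the installer. It relies on the script's own comment that the VLAN settings only take effect when NETWORK_TYPE is vlan. `check_underlay_config` is a hypothetical helper and not part of install.sh, and `eth1` is a placeholder NIC name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (NOT part of kube-ovn's install.sh): checks the effective
# values of NETWORK_TYPE and VLAN_INTERFACE_NAME before an underlay install.
check_underlay_config() {
  local network_type="$1"  # effective value of NETWORK_TYPE
  local vlan_iface="$2"    # effective value of VLAN_INTERFACE_NAME

  # Per the script comment, VLAN settings only apply when NETWORK_TYPE is vlan.
  if [ "$network_type" != "vlan" ]; then
    echo "NETWORK_TYPE is '$network_type': VLAN settings only take effect when it is 'vlan'"
    return 1
  fi

  # The node NIC that carries underlay traffic must be named explicitly.
  if [ -z "$vlan_iface" ]; then
    echo "VLAN_INTERFACE_NAME is empty: set it to the node NIC to use"
    return 1
  fi

  echo "ok"
}

# Example call with placeholder values; 'eth1' is an assumed NIC name.
check_underlay_config "vlan" "eth1"
```

The helper only inspects the values passed to it. It does not verify that the NIC exists on every node or that the physical network meets the underlay requirements.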

zhangzujian commented 1 year ago

Please read the documentation carefully before use: Underlay network installation

Git4Mark commented 1 year ago

Using VLAN_NIC=${VLAN_NIC:-myeth0} should have the same effect, right? From what I can see, the install script logic also reads this variable.

zhangzujian commented 1 year ago

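The documentation lists environment requirements. First check whether they are met. The cause is most likely that pods cannot communicate with the nodes, so the physical network configuration needs to be checked.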

Git4Mark commented 1 year ago

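I checked and found the cause: promiscuous mode was not enabled on the virtualization platform's vSwitch network. It is resolved now. However, when the underlay setup failed, why did the traffic appear to go through the Geneve tunnel? Is there a deliberate check that falls back to the tunnel when the underlay is unreachable?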