Closed: tarun28jain closed this issue 4 years ago
Can you run kubectl ko sbctl show
to check whether all nodes have been registered to ovn-sb?
root@k8s-master:~# kubectl ko sbctl show
Chassis ""
    hostname: k8s-minion3
    Encap geneve
        ip: "minion3-ip"
        options: {csum="true"}
root@k8s2-master:~# kubectl ko sbctl show
Chassis "d007e683-f557-4a19-a81b-02ef1822c1b2"
    hostname: k8s2-master
    Encap geneve
        ip: "master-ip"
        options: {csum="true"}
    Port_Binding kube-ovn-pinger-k2ng9.kube-system
    Port_Binding node-k8s2-master
Chassis "7d41af13-d0c3-42e1-ab56-74d85e7b5c46"
    hostname: k8s2-minion
    Encap geneve
        ip: "minion-ip"
        options: {csum="true"}
    Port_Binding coredns-86c58d9df4-zxvbk.kube-system
    Port_Binding kube-ovn-pinger-dhg5j.kube-system
    Port_Binding node-k8s2-minion
Chassis "9f840e82-578c-489b-9bd8-ac64cde2f1ee"
    hostname: k8s2-minion3
    Encap geneve
        ip: "minion3-ip"
        options: {csum="true"}
    Port_Binding node-k8s2-minion3
    Port_Binding coredns-86c58d9df4-wrqmr.kube-system
    Port_Binding kube-ovn-pinger-d8hwc.kube-system
Chassis "b6bfb4bc-622f-4152-be7b-c3b26184d436"
    hostname: k8s2-minion2
    Encap geneve
        ip: "minion2-ip"
        options: {csum="true"}
    Port_Binding kube-ovn-pinger-7mgc9.kube-system
    Port_Binding node-k8s2-minion2
It seems that the 3 nodes in DPDK mode didn't connect to ovn-sb. Can you check /var/run/ovn/ovn-controller.log
to verify that they connected to the right ovn-sb endpoint?
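For reference, a way to check which SB endpoint ovn-controller was given, from inside the ovs-ovn container on an affected node, might be (the log path follows the location mentioned above; adjust if your image differs):

```shell
# which ovn-sb endpoint this node's ovn-controller was pointed at
ovs-vsctl get Open_vSwitch . external_ids:ovn-remote
# recent connection attempts and their outcomes in the controller log
grep -E 'reconnect|connect' /var/run/ovn/ovn-controller.log | tail -n 5
```

If the endpoint is wrong or the log shows repeated "connection refused"/"waiting backoff" lines, the chassis will never register in `sbctl show`.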
I am also facing the same issue with Kube-OVN with DPDK enabled on a single-node K8s setup.
OS: Ubuntu 18 virtual machines on OpenStack with OVS-DPDK
RAM: 16 GB
Cores: 8
Kubernetes: 1.13.5 (installed via kubeadm)
Kube-OVN with DPDK: v1.2.1
**Installation Logs:**
# ./kubeovn_install_13.sh --with-dpdk=19.11
[Step 1] Label kube-ovn-master node
node/k8s-cmk not labeled
node/k8s-cmk labeled
[Step 2] Install OVN components
Install OVN DB in 172.19.104.78,
customresourcedefinition.apiextensions.k8s.io/ips.kubeovn.io created
customresourcedefinition.apiextensions.k8s.io/subnets.kubeovn.io created
customresourcedefinition.apiextensions.k8s.io/vlans.kubeovn.io created
configmap/ovn-config created
serviceaccount/ovn created
clusterrole.rbac.authorization.k8s.io/system:ovn created
clusterrolebinding.rbac.authorization.k8s.io/ovn created
service/ovn-nb created
service/ovn-sb created
deployment.apps/ovn-central created
daemonset.apps/ovs-ovn created
Waiting for deployment "ovn-central" rollout to finish: 0 of 1 updated replicas are available...
deployment "ovn-central" successfully rolled out
[Step 3] Install Kube-OVN
deployment.apps/kube-ovn-controller created
daemonset.apps/kube-ovn-cni created
daemonset.apps/kube-ovn-pinger created
service/kube-ovn-pinger created
service/kube-ovn-controller created
service/kube-ovn-cni created
Waiting for deployment "kube-ovn-controller" rollout to finish: 0 of 1 updated replicas are available...
deployment "kube-ovn-controller" successfully rolled out
[Step 4] Delete pod that not in host network mode
pod "dnsutils" deleted
pod "coredns-86c58d9df4-7q2tw" deleted
pod "coredns-86c58d9df4-9fh8w" deleted
pod "kube-ovn-pinger-d9slf" deleted
Waiting for daemon set "kube-ovn-pinger" rollout to finish: 0 of 1 updated pods are available...
daemon set "kube-ovn-pinger" successfully rolled out
deployment "coredns" successfully rolled out
[Step 5] Install kubectl plugin
[Step 6] Run network diagnose
NAME CREATED AT
subnets.kubeovn.io 2020-07-27T16:00:35Z
NAME CREATED AT
ips.kubeovn.io 2020-07-27T16:00:35Z
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 4d3h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 4d3h
ds kube-proxy ready
deployment ovn-central ready
deployment kube-ovn-controller ready
ds kube-ovn-cni ready
ds ovs-ovn ready
deployment coredns ready
### kube-ovn-controller recent log
### start to diagnose node k8s-cmk
#### ovn-controller log:
2020-07-27T16:01:42.452Z|00039|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connecting...
2020-07-27T16:01:42.452Z|00009|rconn(ovn_pinctrl0)|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
2020-07-27T16:01:42.453Z|00040|rconn|INFO|unix:/var/run/openvswitch/br-int.mgmt: connected
2020-07-27T16:01:51.221Z|00041|binding|INFO|Claiming lport coredns-86c58d9df4-5b69s.kube-system for this chassis.
2020-07-27T16:01:51.221Z|00042|binding|INFO|coredns-86c58d9df4-5b69s.kube-system: Claiming 00:00:00:AB:BC:D7 10.16.0.4
2020-07-27T16:02:00.381Z|00043|binding|INFO|Releasing lport kube-ovn-pinger-d9slf.kube-system from this chassis.
2020-07-27T16:02:00.382Z|00044|lflow|WARN|Dropped 109 log messages in last 62 seconds (most recently, 8 seconds ago) due to excessive rate
**2020-07-27T16:02:00.382Z|00045|lflow|WARN|error parsing actions "ct_lb(backends=172.19.104.78:6641);": Syntax error at `backends' expecting IP address.**
2020-07-27T16:02:03.459Z|00046|binding|INFO|Claiming lport kube-ovn-pinger-j2mhf.kube-system for this chassis.
2020-07-27T16:02:03.459Z|00047|binding|INFO|kube-ovn-pinger-j2mhf.kube-system: Claiming 00:00:00:E6:2D:E1 10.16.0.5
I0727 16:02:07.707722 26334 ovn.go:19] ovs-vswitchd and ovsdb are up
I0727 16:02:07.807987 26334 ovn.go:31] ovn_controller is up
I0727 16:02:07.808051 26334 ovn.go:36] start to check port binding
I0727 16:02:07.905665 26334 ovn.go:109] chassis id is 8a251851-87c3-495a-bf82-6a7de9ffdaed
I0727 16:02:07.910573 26334 ovn.go:46] port in sb is [coredns-86c58d9df4-mzs7w.kube-system coredns-86c58d9df4-5b69s.kube-system node-k8s-cmk kube-ovn-pinger-j2mhf.kube-system ]
I0727 16:02:07.910602 26334 ovn.go:57] ovs and ovn-sb binding check passed
I0727 16:02:07.910619 26334 ping.go:191] start to check apiserver connectivity
I0727 16:02:07.915341 26334 ping.go:200] connect to apiserver success in 4.71ms
I0727 16:02:07.915376 26334 ping.go:47] start to check node connectivity
I0727 16:02:08.218419 26334 ping.go:69] ping node: k8s-cmk 172.19.104.78, count: 3, loss count 0, average rtt 0.19ms
I0727 16:02:08.218553 26334 ping.go:85] start to check pod connectivity
I0727 16:02:08.340613 26334 ping.go:112] ping pod: kube-ovn-pinger-j2mhf 10.16.0.5, count: 3, loss count 0, average rtt 0.15ms
I0727 16:02:08.340680 26334 ping.go:157] start to check dns connectivity
E0727 16:02:18.341419 26334 ping.go:165] failed to resolve dns kubernetes.default, lookup kubernetes.default on 10.96.0.10:53: dial udp 10.96.0.10:53: i/o timeout
I0727 16:02:18.341572 26334 ping.go:174] start to check dns connectivity
E0727 16:02:28.341851 26334 ping.go:182] failed to resolve dns alauda.cn, lookup alauda.cn on 10.96.0.10:53: dial udp 10.96.0.10:53: i/o timeout
I0727 16:02:28.341885 26334 ping.go:132] start to check ping external to 114.114.114.114
I0727 16:02:33.348838 26334 ping.go:145] ping external address: 114.114.114.114, total count: 3, loss count 3, average rtt 0.00ms
### finish diagnose node k8s-cmk
# ovs-vsctl show
2ca918fa-4b9b-4142-abb9-f1f471a7db3f
    Bridge br-int
        fail_mode: secure
        Port e7c0b5d258e9_h
            Interface e7c0b5d258e9_h
        Port "616337a17ecd_h"
            Interface "616337a17ecd_h"
        Port ovn0
            Interface ovn0
                type: internal
        Port "4edd7ac31b96_h"
            Interface "4edd7ac31b96_h"
        Port mirror0
            Interface mirror0
                type: internal
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.13.0"
Next, I created a dnsutils pod to check DNS resolution:
# cat dnsutil.yaml
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  nodeName: k8s-cmk
  containers:
  - name: dnsutils
    image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
$ kubectl create -f dnsutil.yaml
pod/dnsutils created
$ kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; reply from unexpected source: 10.16.0.3#53, expected 10.96.0.10#53
;; reply from unexpected source: 10.16.0.3#53, expected 10.96.0.10#53
;; reply from unexpected source: 10.16.0.3#53, expected 10.96.0.10#53
;; connection timed out; no servers could be reached
command terminated with exit code 1
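The "reply from unexpected source" errors suggest a backend pod (10.16.0.3) answers directly, without the reverse DNAT of the load balancer being applied on the way back. One way to narrow this down, using the CoreDNS backend IPs visible in the load-balancer vips, is to query a backend directly and bypass the 10.96.0.10 VIP:

```shell
# query a CoreDNS backend directly, bypassing the service VIP
kubectl exec -it dnsutils -- nslookup kubernetes.default 10.16.0.3
# if this succeeds while 10.96.0.10 times out, DNS itself is fine
# and only the VIP's DNAT/un-DNAT (ct_lb) path is broken
```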
Here is the trace of the dnsutils container:
# kubectl ko trace default/dnsutils 10.96.0.10 udp 53
+ kubectl exec ovn-central-5b86b448c8-hpswd -n kube-system -- ovn-trace --ct=new ovn-default 'inport == "dnsutils.default" && ip.ttl == 64 && eth.src == 00:00:00:5B:25:05 && ip4.src == 10.16.0.6 && eth.dst == 00:00:00:60:98:ED && ip4.dst == 10.96.0.10 && udp.src == 10000 && udp.dst == 53'
udp,reg14=0x6,vlan_tci=0x0000,dl_src=00:00:00:5b:25:05,dl_dst=00:00:00:60:98:ed,nw_src=10.16.0.6,nw_dst=10.96.0.10,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=10000,tp_dst=53
ingress(dp="ovn-default", inport="dnsutils.default")
0. ls_in_port_sec_l2 (ovn-northd.c:4629): inport == "dnsutils.default" && eth.src == {00:00:00:5b:25:05}, priority 50, uuid 2f5d5f37
next;
1. ls_in_port_sec_ip (ovn-northd.c:4281): inport == "dnsutils.default" && eth.src == 00:00:00:5b:25:05 && ip4.src == {10.16.0.6}, priority 90, uuid d03101ee
next;
3. ls_in_pre_acl (ovn-northd.c:4805): ip, priority 100, uuid 2dae4354
reg0[0] = 1;
next;
4. ls_in_pre_lb (ovn-northd.c:4961): ip && ip4.dst == 10.96.0.10, priority 100, uuid 3b36d220
reg0[0] = 1;
next;
5. ls_in_pre_stateful (ovn-northd.c:4992): reg0[0] == 1, priority 100, uuid 0baa7dcb
ct_next;
ct_next(ct_state=new|trk)
6. ls_in_acl (ovn-northd.c:5368): ip && (!ct.est || (ct.est && ct_label.blocked == 1)), priority 1, uuid 606385b4
reg0[1] = 1;
next;
10. ls_in_stateful (ovn-northd.c:5726): ct.new && ip4.dst == 10.96.0.10 && udp.dst == 53, priority 120, uuid 78fc03a9
ct_lb(backends=10.16.0.3:53,10.16.0.4:53);
ct_lb
19. ls_in_l2_lkup (ovn-northd.c:6912): eth.dst == 00:00:00:60:98:ed, priority 50, uuid 201eb38a
outport = "ovn-default-ovn-cluster";
output;
egress(dp="ovn-default", inport="dnsutils.default", outport="ovn-default-ovn-cluster")
0. ls_out_pre_lb (ovn-northd.c:4977): ip, priority 100, uuid bdcf2073
reg0[0] = 1;
next;
1. ls_out_pre_acl (ovn-northd.c:4748): ip && outport == "ovn-default-ovn-cluster", priority 110, uuid bb8efdec
next;
2. ls_out_pre_stateful (ovn-northd.c:4994): reg0[0] == 1, priority 100, uuid 8b5986dc
ct_next;
ct_next(ct_state=est|trk /* default (use --ct to customize) */)
3. ls_out_lb (ovn-northd.c:5609): ct.est && !ct.rel && !ct.new && !ct.inv, priority 65535, uuid 5afe0141
reg0[2] = 1;
next;
7. ls_out_stateful (ovn-northd.c:5771): reg0[2] == 1, priority 100, uuid 72f3d32f
ct_lb;
ct_lb
9. ls_out_port_sec_l2 (ovn-northd.c:4695): outport == "ovn-default-ovn-cluster", priority 50, uuid 9014757c
output;
/* output to "ovn-default-ovn-cluster", type "patch" */
ingress(dp="ovn-cluster", inport="ovn-cluster-ovn-default")
0. lr_in_admission (ovn-northd.c:7974): eth.dst == 00:00:00:60:98:ed && inport == "ovn-cluster-ovn-default", priority 50, uuid 74e286f4
next;
1. lr_in_lookup_neighbor (ovn-northd.c:8023): 1, priority 0, uuid 9dc0e8d5
reg9[3] = 1;
next;
2. lr_in_learn_neighbor (ovn-northd.c:8029): reg9[3] == 1 || reg9[2] == 1, priority 100, uuid df2a15d3
next;
9. lr_in_ip_routing (ovn-northd.c:7598): ip4.dst == 10.16.0.0/16, priority 33, uuid 79beccf5
ip.ttl--;
reg8[0..15] = 0;
reg0 = ip4.dst;
reg1 = 10.16.0.1;
eth.src = 00:00:00:60:98:ed;
outport = "ovn-cluster-ovn-default";
flags.loopback = 1;
next;
10. lr_in_ip_routing_ecmp (ovn-northd.c:9593): reg8[0..15] == 0, priority 150, uuid f90f9afb
next;
12. lr_in_arp_resolve (ovn-northd.c:9861): outport == "ovn-cluster-ovn-default" && reg0 == 10.16.0.3, priority 100, uuid b1f6165b
eth.dst = 00:00:00:2f:6c:a1;
next;
16. lr_in_arp_request (ovn-northd.c:10265): 1, priority 0, uuid 232eb46a
output;
egress(dp="ovn-cluster", inport="ovn-cluster-ovn-default", outport="ovn-cluster-ovn-default")
3. lr_out_delivery (ovn-northd.c:10311): outport == "ovn-cluster-ovn-default", priority 100, uuid 4fa3621b
output;
/* output to "ovn-cluster-ovn-default", type "patch" */
ingress(dp="ovn-default", inport="ovn-default-ovn-cluster")
0. ls_in_port_sec_l2 (ovn-northd.c:4629): inport == "ovn-default-ovn-cluster", priority 50, uuid 1b60f7f1
next;
3. ls_in_pre_acl (ovn-northd.c:4745): ip && inport == "ovn-default-ovn-cluster", priority 110, uuid f174f178
next;
9. ls_in_lb (ovn-northd.c:5606): ct.est && !ct.rel && !ct.new && !ct.inv, priority 65535, uuid 89287636
reg0[2] = 1;
next;
10. ls_in_stateful (ovn-northd.c:5769): reg0[2] == 1, priority 100, uuid 5b17783f
ct_lb;
ct_lb
19. ls_in_l2_lkup (ovn-northd.c:6912): eth.dst == 00:00:00:2f:6c:a1, priority 50, uuid 5037006d
outport = "coredns-86c58d9df4-mzs7w.kube-system";
output;
egress(dp="ovn-default", inport="ovn-default-ovn-cluster", outport="coredns-86c58d9df4-mzs7w.kube-system")
0. ls_out_pre_lb (ovn-northd.c:4977): ip, priority 100, uuid bdcf2073
reg0[0] = 1;
next;
1. ls_out_pre_acl (ovn-northd.c:4807): ip, priority 100, uuid c4600229
reg0[0] = 1;
next;
2. ls_out_pre_stateful (ovn-northd.c:4994): reg0[0] == 1, priority 100, uuid 8b5986dc
ct_next;
ct_next(ct_state=est|trk /* default (use --ct to customize) */)
3. ls_out_lb (ovn-northd.c:5609): ct.est && !ct.rel && !ct.new && !ct.inv, priority 65535, uuid 5afe0141
reg0[2] = 1;
next;
7. ls_out_stateful (ovn-northd.c:5771): reg0[2] == 1, priority 100, uuid 72f3d32f
ct_lb;
ct_lb
8. ls_out_port_sec_ip (ovn-northd.c:4281): outport == "coredns-86c58d9df4-mzs7w.kube-system" && eth.dst == 00:00:00:2f:6c:a1 && ip4.dst == {255.255.255.255, 224.0.0.0/4, 10.16.0.3, 10.16.255.255}, priority 90, uuid 6d690924
next;
9. ls_out_port_sec_l2 (ovn-northd.c:4695): outport == "coredns-86c58d9df4-mzs7w.kube-system" && eth.dst == {00:00:00:2f:6c:a1}, priority 50, uuid b8040f37
output;
/* output to "coredns-86c58d9df4-mzs7w.kube-system", type "" */
+ set +x
Start OVS Tracing
+ kubectl exec ovs-ovn-pwnn2 -n kube-system -- ovs-appctl ofproto/trace br-int in_port=7,udp,nw_src=10.16.0.6,nw_dst=10.96.0.10,dl_src=00:00:00:5B:25:05,dl_dst=00:00:00:60:98:ED,tp_src=1000,tp_dst=53
Bad openflow flow syntax: in_port=7,udp,nw_src=10.16.0.6,nw_dst=10.96.0.10,dl_src=00:00:00:5B:25:05,dl_dst=00:00:00:60:98:ED,tp_src=1000,tp_dst=53: prerequisites not met for setting tp_src
ovs-appctl: ovs-vswitchd: server returned an error
command terminated with exit code 2
# kubectl ko nbctl list load_balancer
_uuid : 49dec778-1311-4b0a-97c6-9520701b394c
external_ids : {}
health_check : []
ip_port_mappings : {}
name : cluster-udp-loadbalancer
protocol : udp
selection_fields : []
vips : {"10.96.0.10:53"="10.16.0.3:53,10.16.0.4:53"}
_uuid : 6b6117b9-06cf-46dd-9b6d-fe2c93f4d48b
external_ids : {}
health_check : []
ip_port_mappings : {}
name : cluster-tcp-loadbalancer
protocol : tcp
selection_fields : []
vips : {"10.102.217.222:8080"="10.16.0.5:8080", "10.105.5.57:6642"="172.19.104.78:6642", "10.107.210.62:10665"="172.19.104.78:10665", "10.109.99.141:10660"="172.19.104.78:10660", "10.96.0.10:53"="10.16.0.3:53,10.16.0.4:53", "10.96.0.1:443"="172.19.104.78:6443", "10.98.129.48:6641"="172.19.104.78:6641"}
# kubectl ko nbctl list logical_switch
_uuid : f95faea8-29ed-46fb-afc6-1b64aadb2585
acls : [f7142d45-3a4b-45d9-b689-97261fa20f12]
dns_records : []
external_ids : {}
forwarding_groups : []
load_balancer : []
name : join
other_config : {exclude_ips="100.64.0.1", gateway="100.64.0.1", subnet="100.64.0.0/16"}
ports : [294b8e04-65d1-43a5-906d-c0b7626609b1, c7b01424-549a-4da6-b511-df9ec3121fe8]
qos_rules : []
_uuid : 5612edc2-5b86-4b2c-92a8-a34145594050
acls : [612890f1-02be-4f92-9336-65d3a3bd170a]
dns_records : []
external_ids : {}
forwarding_groups : []
load_balancer : [49dec778-1311-4b0a-97c6-9520701b394c, 6b6117b9-06cf-46dd-9b6d-fe2c93f4d48b]
name : ovn-default
other_config : {exclude_ips="10.16.0.1", gateway="10.16.0.1", subnet="10.16.0.0/16"}
ports : [66b1b966-617a-49ca-993d-fcbe2631114a, 7d0afb0e-f539-4087-acd8-bbc7e64319f9, 8661b1f5-ff50-4ddc-8f8c-a9fa62922e55, d1528506-3ca1-46bc-938f-dc8bc6e7ba60, e358d647-b3ad-419c-9f76-300fd10a2616]
qos_rules : []
# ovs-vsctl show
2ca918fa-4b9b-4142-abb9-f1f471a7db3f
    Bridge br-int
        fail_mode: secure
        Port e7c0b5d258e9_h
            Interface e7c0b5d258e9_h
        Port "616337a17ecd_h"
            Interface "616337a17ecd_h"
        Port ovn0
            Interface ovn0
                type: internal
        Port "4edd7ac31b96_h"
            Interface "4edd7ac31b96_h"
        Port mirror0
            Interface mirror0
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "6784278d5de9_h"
            Interface "6784278d5de9_h"
    ovs_version: "2.13.0"
Here is the output of iptables-save:
# /sbin/iptables-save
# Generated by iptables-save v1.6.1 on Mon Jul 27 16:25:36 2020
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:DOCKER - [0:0]
:KUBE-MARK-DROP - [0:0]
:KUBE-MARK-MASQ - [0:0]
:KUBE-NODEPORTS - [0:0]
:KUBE-POSTROUTING - [0:0]
:KUBE-SEP-4WBAKYDVK6ETRTWV - [0:0]
:KUBE-SEP-GNPEEDHUT3X5BTQ2 - [0:0]
:KUBE-SEP-HOXEG3LV24IKNRD4 - [0:0]
:KUBE-SEP-IO5JTSCH6AE6LRGT - [0:0]
:KUBE-SEP-JQ4ETHVACARMFAQD - [0:0]
:KUBE-SEP-MLXD6UJQEFGX5RD5 - [0:0]
:KUBE-SEP-WSBAZPH2546LICO3 - [0:0]
:KUBE-SEP-X6SGP46MDKTGB5MV - [0:0]
:KUBE-SEP-XNTS2RWA5HZBHOH2 - [0:0]
:KUBE-SEP-YAC5AR73U2VWP4H6 - [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-SVC-6AIQD4SL773RRPTB - [0:0]
:KUBE-SVC-BNPKMPMIBVVFQWV3 - [0:0]
:KUBE-SVC-E5GKJS2NXRRFQQE4 - [0:0]
:KUBE-SVC-ERIFXISQEP7F7OF4 - [0:0]
:KUBE-SVC-I2QZMQCSEPOODC3G - [0:0]
:KUBE-SVC-NPX46M4PTMTKRN6Y - [0:0]
:KUBE-SVC-NRUOEUPDVVQLVMEC - [0:0]
:KUBE-SVC-TCOU7JCQXEZGVUNU - [0:0]
-A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -m set --match-set ovn40subnets-nat src -m set ! --match-set ovn40subnets dst -j MASQUERADE
-A POSTROUTING -m set --match-set ovn40local-pod-ip-nat src -m set ! --match-set ovn40subnets dst -j MASQUERADE
-A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i docker0 -j RETURN
-A KUBE-MARK-DROP -j MARK --set-xmark 0x8000/0x8000
-A KUBE-MARK-MASQ -j MARK --set-xmark 0x4000/0x4000
-A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -m mark --mark 0x4000/0x4000 -j MASQUERADE
-A KUBE-SEP-4WBAKYDVK6ETRTWV -s 172.19.104.78/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-4WBAKYDVK6ETRTWV -p tcp -m tcp -j DNAT --to-destination 172.19.104.78:6443
-A KUBE-SEP-GNPEEDHUT3X5BTQ2 -s 10.16.0.4/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-GNPEEDHUT3X5BTQ2 -p udp -m udp -j DNAT --to-destination 10.16.0.4:53
-A KUBE-SEP-HOXEG3LV24IKNRD4 -s 172.19.104.78/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-HOXEG3LV24IKNRD4 -p tcp -m tcp -j DNAT --to-destination 172.19.104.78:6642
-A KUBE-SEP-IO5JTSCH6AE6LRGT -s 10.16.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-IO5JTSCH6AE6LRGT -p udp -m udp -j DNAT --to-destination 10.16.0.3:53
-A KUBE-SEP-JQ4ETHVACARMFAQD -s 172.19.104.78/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-JQ4ETHVACARMFAQD -p tcp -m tcp -j DNAT --to-destination 172.19.104.78:10665
-A KUBE-SEP-MLXD6UJQEFGX5RD5 -s 10.16.0.3/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-MLXD6UJQEFGX5RD5 -p tcp -m tcp -j DNAT --to-destination 10.16.0.3:53
-A KUBE-SEP-WSBAZPH2546LICO3 -s 10.16.0.4/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-WSBAZPH2546LICO3 -p tcp -m tcp -j DNAT --to-destination 10.16.0.4:53
-A KUBE-SEP-X6SGP46MDKTGB5MV -s 172.19.104.78/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-X6SGP46MDKTGB5MV -p tcp -m tcp -j DNAT --to-destination 172.19.104.78:6641
-A KUBE-SEP-XNTS2RWA5HZBHOH2 -s 172.19.104.78/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-XNTS2RWA5HZBHOH2 -p tcp -m tcp -j DNAT --to-destination 172.19.104.78:10660
-A KUBE-SEP-YAC5AR73U2VWP4H6 -s 10.16.0.5/32 -j KUBE-MARK-MASQ
-A KUBE-SEP-YAC5AR73U2VWP4H6 -p tcp -m tcp -j DNAT --to-destination 10.16.0.5:8080
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.102.217.222/32 -p tcp -m comment --comment "kube-system/kube-ovn-pinger:metrics cluster IP" -m tcp --dport 8080 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.102.217.222/32 -p tcp -m comment --comment "kube-system/kube-ovn-pinger:metrics cluster IP" -m tcp --dport 8080 -j KUBE-SVC-BNPKMPMIBVVFQWV3
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.109.99.141/32 -p tcp -m comment --comment "kube-system/kube-ovn-controller:metrics cluster IP" -m tcp --dport 10660 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.109.99.141/32 -p tcp -m comment --comment "kube-system/kube-ovn-controller:metrics cluster IP" -m tcp --dport 10660 -j KUBE-SVC-I2QZMQCSEPOODC3G
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.107.210.62/32 -p tcp -m comment --comment "kube-system/kube-ovn-cni:metrics cluster IP" -m tcp --dport 10665 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.107.210.62/32 -p tcp -m comment --comment "kube-system/kube-ovn-cni:metrics cluster IP" -m tcp --dport 10665 -j KUBE-SVC-NRUOEUPDVVQLVMEC
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p udp -m comment --comment "kube-system/kube-dns:dns cluster IP" -m udp --dport 53 -j KUBE-SVC-TCOU7JCQXEZGVUNU
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.10/32 -p tcp -m comment --comment "kube-system/kube-dns:dns-tcp cluster IP" -m tcp --dport 53 -j KUBE-SVC-ERIFXISQEP7F7OF4
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.98.129.48/32 -p tcp -m comment --comment "kube-system/ovn-nb:ovn-nb cluster IP" -m tcp --dport 6641 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.98.129.48/32 -p tcp -m comment --comment "kube-system/ovn-nb:ovn-nb cluster IP" -m tcp --dport 6641 -j KUBE-SVC-6AIQD4SL773RRPTB
-A KUBE-SERVICES ! -s 10.16.0.0/16 -d 10.105.5.57/32 -p tcp -m comment --comment "kube-system/ovn-sb:ovn-sb cluster IP" -m tcp --dport 6642 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 10.105.5.57/32 -p tcp -m comment --comment "kube-system/ovn-sb:ovn-sb cluster IP" -m tcp --dport 6642 -j KUBE-SVC-E5GKJS2NXRRFQQE4
-A KUBE-SERVICES -m comment --comment "kubernetes service nodeports; NOTE: this must be the last rule in this chain" -m addrtype --dst-type LOCAL -j KUBE-NODEPORTS
-A KUBE-SVC-6AIQD4SL773RRPTB -j KUBE-SEP-X6SGP46MDKTGB5MV
-A KUBE-SVC-BNPKMPMIBVVFQWV3 -j KUBE-SEP-YAC5AR73U2VWP4H6
-A KUBE-SVC-E5GKJS2NXRRFQQE4 -j KUBE-SEP-HOXEG3LV24IKNRD4
-A KUBE-SVC-ERIFXISQEP7F7OF4 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-MLXD6UJQEFGX5RD5
-A KUBE-SVC-ERIFXISQEP7F7OF4 -j KUBE-SEP-WSBAZPH2546LICO3
-A KUBE-SVC-I2QZMQCSEPOODC3G -j KUBE-SEP-XNTS2RWA5HZBHOH2
-A KUBE-SVC-NPX46M4PTMTKRN6Y -j KUBE-SEP-4WBAKYDVK6ETRTWV
-A KUBE-SVC-NRUOEUPDVVQLVMEC -j KUBE-SEP-JQ4ETHVACARMFAQD
-A KUBE-SVC-TCOU7JCQXEZGVUNU -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-IO5JTSCH6AE6LRGT
-A KUBE-SVC-TCOU7JCQXEZGVUNU -j KUBE-SEP-GNPEEDHUT3X5BTQ2
COMMIT
# Completed on Mon Jul 27 16:25:36 2020
# Generated by iptables-save v1.6.1 on Mon Jul 27 16:25:36 2020
*filter
:INPUT ACCEPT [49:8022]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [58:10196]
:DOCKER - [0:0]
:DOCKER-ISOLATION-STAGE-1 - [0:0]
:DOCKER-ISOLATION-STAGE-2 - [0:0]
:DOCKER-USER - [0:0]
:KUBE-EXTERNAL-SERVICES - [0:0]
:KUBE-FIREWALL - [0:0]
:KUBE-FORWARD - [0:0]
:KUBE-SERVICES - [0:0]
-A INPUT -m set --match-set ovn40subnets dst -j ACCEPT
-A INPUT -m set --match-set ovn40subnets src -j ACCEPT
-A INPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes externally-visible service portals" -j KUBE-EXTERNAL-SERVICES
-A INPUT -j KUBE-FIREWALL
-A FORWARD -o ovn0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i ovn0 -j ACCEPT
-A FORWARD -m comment --comment "kubernetes forwarding rules" -j KUBE-FORWARD
-A FORWARD -j DOCKER-USER
-A FORWARD -j DOCKER-ISOLATION-STAGE-1
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A OUTPUT -m conntrack --ctstate NEW -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
-A OUTPUT -j KUBE-FIREWALL
-A DOCKER-ISOLATION-STAGE-1 -i docker0 ! -o docker0 -j DOCKER-ISOLATION-STAGE-2
-A DOCKER-ISOLATION-STAGE-1 -j RETURN
-A DOCKER-ISOLATION-STAGE-2 -o docker0 -j DROP
-A DOCKER-ISOLATION-STAGE-2 -j RETURN
-A DOCKER-USER -j RETURN
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
-A KUBE-FORWARD -m comment --comment "kubernetes forwarding rules" -m mark --mark 0x4000/0x4000 -j ACCEPT
-A KUBE-FORWARD -s 10.16.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod source rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A KUBE-FORWARD -d 10.16.0.0/16 -m comment --comment "kubernetes forwarding conntrack pod destination rule" -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
COMMIT
# Completed on Mon Jul 27 16:25:36 2020
Maybe the version of ovn-controller in the ovs-dpdk image does not match the ovn-db.
Can you run kubectl ko nbctl --version
and ovn-controller -V
in the ovs-dpdk container to check the versions?
Here are the results of the two commands:
# kubectl ko nbctl --version
ovn-nbctl 20.06.0
Open vSwitch Library 2.13.0
DB Schema 5.23.0
# ovn-controller -V
ovn-controller 20.03.1
Open vSwitch Library 2.13.0
OpenFlow versions 0x4:0x4
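These outputs show the mismatch suspected above: ovn-nbctl 20.06.0 in ovn-central against ovn-controller 20.03.1 in the DPDK image. A sketch for comparing the versions across every ovs-ovn pod in one pass (the `app=ovs` label selector is an assumption; check the labels your ovs-ovn DaemonSet actually sets):

```shell
# print the ovn-controller version in every ovs-ovn pod
for pod in $(kubectl -n kube-system get pod -l app=ovs -o name); do
  echo "== $pod"
  kubectl -n kube-system exec "$pod" -- ovn-controller -V | head -n 1
done
# and the version ovn-central serves
kubectl ko nbctl --version | head -n 1
```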
Can you try v1.2.1? The ovs-dpdk image needs to be updated to match the master OVN version.
We will try to update the ovs-dpdk image ASAP.
Okay... I will try and post the result. By the way, I am using the installation script (install.sh) on Kubernetes 1.13.5. Does this installation script support Kubernetes 1.13.5? I see different YAML files (in the kube-ovn-1.2.1\yamls directory) for Kubernetes >= 1.17 (ovn.yml, kube-ovn.yml) and Kubernetes < 1.17 (ovn-pre17.yml, kube-ovn-pre17).
The main differences in the YAMLs are some node labels. The install.sh script will resolve those issues.
Yes, I did observe the following:
The installation script deploys kube-ovn.yml and ovn.yml in any Kubernetes environment, whether 1.13.5 or greater.
For kube-ovn.yml, the installation script takes care of overwriting the node labels:
kubectl label no -lbeta.kubernetes.io/os=linux kubernetes.io/os=linux --overwrite
kubectl label no -lnode-role.kubernetes.io/master kube-ovn/role=master --overwrite
For ovn.yml:
i. the mount point /etc/origin/ovn is not used in ovn-pre17.yml
ii. kubeovn/kube-ovn:v1.2.1 is used in ovn-pre17.yml, while ovn.yml uses kubeovn/kube-ovn:v1.2.0
iii. the vlans resource is missing from the ClusterRole's kubeovn.io apiGroups in ovn-pre17.yml
iv. deployments is missing from the ClusterRole's extensions apiGroups in ovn-pre17.yml
Will the above have an impact when Kube-OVN (with or without DPDK) is installed on Kubernetes 1.13.5? What is the recommended Kubernetes version?
I tried Kube-OVN v1.2.1. Service DNS resolves from the master node, but not from the minion node.
netcat to the DNS service IP and pod succeeds from both the master and minion nodes:
$ nc -vz -u 10.96.0.10 53
Connection to 10.96.0.10 53 port [udp/domain] succeeded!
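Note that `nc -vz -u` "succeeds" for UDP as long as no ICMP port-unreachable comes back, so it does not prove a DNS server actually answered. A real query with a short timeout, run from the dnsutils pod (whose image ships dig), is a stronger check:

```shell
# send an actual DNS query instead of a bare UDP probe
dig +time=2 +tries=1 @10.96.0.10 kubernetes.default.svc.cluster.local
# busybox-only fallback:
nslookup kubernetes.default 10.96.0.10
```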
I also observed issues during the installation:
# ./kubeovn_install_121.sh --with-dpdk=19.11
[Step 1] Label kube-ovn-master node
node/k8s-cmk not labeled
node/k8s-minion labeled
node/k8s-cmk labeled
-------------------------------
[Step 2] Install OVN components
Install OVN DB in 172.19.104.78,
customresourcedefinition.apiextensions.k8s.io/ips.kubeovn.io created
customresourcedefinition.apiextensions.k8s.io/subnets.kubeovn.io created
customresourcedefinition.apiextensions.k8s.io/vlans.kubeovn.io created
configmap/ovn-config created
serviceaccount/ovn created
clusterrole.rbac.authorization.k8s.io/system:ovn created
clusterrolebinding.rbac.authorization.k8s.io/ovn created
service/ovn-nb created
service/ovn-sb created
deployment.apps/ovn-central created
daemonset.apps/ovs-ovn created
Waiting for deployment "ovn-central" rollout to finish: 0 of 1 updated replicas are available...
deployment "ovn-central" successfully rolled out
-------------------------------
[Step 3] Install Kube-OVN
deployment.apps/kube-ovn-controller created
daemonset.apps/kube-ovn-cni created
daemonset.apps/kube-ovn-pinger created
service/kube-ovn-pinger created
service/kube-ovn-controller created
service/kube-ovn-cni created
Waiting for deployment "kube-ovn-controller" rollout to finish: 0 of 1 updated replicas are available...
deployment "kube-ovn-controller" successfully rolled out
-------------------------------
[Step 4] Delete pod that not in host network mode
pod "dnsutils" deleted
pod "coredns-86c58d9df4-5fq8n" deleted
pod "coredns-86c58d9df4-hpqzn" deleted
pod "kube-ovn-pinger-cjb2q" deleted
pod "kube-ovn-pinger-ps59m" deleted
Waiting for daemon set "kube-ovn-pinger" rollout to finish: 0 of 2 updated pods are available...
Waiting for daemon set "kube-ovn-pinger" rollout to finish: 1 of 2 updated pods are available...
daemon set "kube-ovn-pinger" successfully rolled out
deployment "coredns" successfully rolled out
-------------------------------
[Step 5] Install kubectl plugin
-------------------------------
[Step 6] Run network diagnose
NAME CREATED AT
subnets.kubeovn.io 2020-07-28T07:14:34Z
NAME CREATED AT
ips.kubeovn.io 2020-07-28T07:14:34Z
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 4d18h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 4d18h
ds kube-proxy ready
deployment ovn-central ready
deployment kube-ovn-controller ready
ds kube-ovn-cni ready
ds ovs-ovn ready
deployment coredns ready
### kube-ovn-controller recent log
### start to diagnose node k8s-cmk
#### ovn-controller log:
2020-07-28T07:16:56.763Z|03955|binding|INFO|Releasing lport kube-ovn-pinger-zh96t.kube-system from this chassis.
2020-07-28T07:16:56.763Z|03956|binding|INFO|Claiming lport coredns-86c58d9df4-dhcmz.kube-system for this chassis.
2020-07-28T07:16:56.763Z|03957|binding|INFO|coredns-86c58d9df4-dhcmz.kube-system: Claiming 00:00:00:E4:F2:4E 10.16.0.6
2020-07-28T07:16:56.763Z|03958|binding|INFO|Claiming lport node-k8s-cmk for this chassis.
2020-07-28T07:16:56.763Z|03959|binding|INFO|node-k8s-cmk: Claiming 00:00:00:C9:62:32 100.64.0.2
2020-07-28T07:16:56.763Z|03960|binding|INFO|Claiming lport coredns-86c58d9df4-plpnq.kube-system for this chassis.
2020-07-28T07:16:56.763Z|03961|binding|INFO|coredns-86c58d9df4-plpnq.kube-system: Claiming 00:00:00:45:55:D3 10.16.0.4
2020-07-28T07:16:56.763Z|03962|binding|INFO|Releasing lport node-k8s-minion from this chassis.
2020-07-28T07:16:56.763Z|03963|binding|INFO|Claiming lport kube-ovn-pinger-6gzqj.kube-system for this chassis.
2020-07-28T07:16:56.763Z|03964|binding|INFO|kube-ovn-pinger-6gzqj.kube-system: Claiming 00:00:00:DA:84:4A 10.16.0.2
I0728 07:16:57.116355 372 ovn.go:19] ovs-vswitchd and ovsdb are up
I0728 07:16:57.208836 372 ovn.go:31] ovn_controller is up
I0728 07:16:57.208870 372 ovn.go:36] start to check port binding
E0728 07:17:02.228016 372 ovn.go:104] chassis for node k8s-cmk not exist
I0728 07:17:02.228036 372 ping.go:167] start to check apiserver connectivity
I0728 07:17:10.979301 372 ping.go:176] connect to apiserver success in 8751.25ms
I0728 07:17:10.979336 372 ping.go:40] start to check node connectivity
I0728 07:17:11.293857 372 ping.go:62] ping node: k8s-cmk 172.19.104.78, count: 3, loss count 0, average rtt 1.14ms
I0728 07:17:41.368701 372 ping.go:62] ping node: k8s-minion 172.19.104.81, count: 3, loss count 3, average rtt 0.00ms
I0728 07:17:41.368921 372 ping.go:78] start to check pod connectivity
E0728 07:17:56.369890 372 ping.go:81] failed to get peer ds: Get "https://10.96.0.1:443/apis/apps/v1/namespaces/kube-system/daemonsets/kube-ovn-pinger?timeout=15s": net/http: request canceled (Client.Timeout exceeded while awaiting headers)
I0728 07:17:56.369991 372 ping.go:150] start to check dns connectivity
I0728 07:18:01.376392 372 ping.go:163] resolve dns kubernetes.default to [10.96.0.1] in 5006.36ms
### finish diagnose node k8s-cmk
### start to diagnose node k8s-minion
#### ovn-controller log:
2020-07-28T07:18:01.333Z|21363|binding|INFO|node-k8s-minion: Claiming 00:00:00:4C:4A:42 100.64.0.3
2020-07-28T07:18:01.333Z|21364|binding|INFO|Releasing lport kube-ovn-pinger-6gzqj.kube-system from this chassis.
2020-07-28T07:18:01.360Z|21365|binding|INFO|Releasing lport node-k8s-cmk from this chassis.
2020-07-28T07:18:01.360Z|21366|binding|INFO|Claiming lport kube-ovn-pinger-zh96t.kube-system for this chassis.
2020-07-28T07:18:01.360Z|21367|binding|INFO|kube-ovn-pinger-zh96t.kube-system: Claiming 00:00:00:9C:D3:14 10.16.0.3
2020-07-28T07:18:01.360Z|21368|binding|INFO|Releasing lport coredns-86c58d9df4-dhcmz.kube-system from this chassis.
2020-07-28T07:18:01.360Z|21369|binding|INFO|Releasing lport coredns-86c58d9df4-plpnq.kube-system from this chassis.
2020-07-28T07:18:01.360Z|21370|binding|INFO|Claiming lport node-k8s-minion for this chassis.
2020-07-28T07:18:01.360Z|21371|binding|INFO|node-k8s-minion: Claiming 00:00:00:4C:4A:42 100.64.0.3
2020-07-28T07:18:01.360Z|21372|binding|INFO|Releasing lport kube-ovn-pinger-6gzqj.kube-system from this chassis.
I0728 07:18:01.690993 16124 ovn.go:19] ovs-vswitchd and ovsdb are up
I0728 07:18:01.792199 16124 ovn.go:31] ovn_controller is up
I0728 07:18:01.792234 16124 ovn.go:36] start to check port binding
E0728 07:18:11.804409 16124 ovn.go:100] failed to find chassis signal: alarm clock
I0728 07:18:11.804471 16124 ping.go:167] start to check apiserver connectivity
E0728 07:18:26.804883 16124 ping.go:172] failed to connect to apiserver: Get "https://10.96.0.1:443/version?timeout=15s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0728 07:18:26.805015 16124 ping.go:40] start to check node connectivity
E0728 07:18:41.805389 16124 ping.go:43] failed to list nodes, Get "https://10.96.0.1:443/api/v1/nodes": context deadline exceeded
I0728 07:18:41.805438 16124 ping.go:78] start to check pod connectivity
E0728 07:18:56.805977 16124 ping.go:81] failed to get peer ds: Get "https://10.96.0.1:443/apis/apps/v1/namespaces/kube-system/daemonsets/kube-ovn-pinger?timeout=15s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0728 07:18:56.806315 16124 ping.go:150] start to check dns connectivity
E0728 07:19:06.807997 16124 ping.go:158] failed to resolve dns kubernetes.default, lookup kubernetes.default on 10.96.0.10:53: dial udp 10.96.0.10:53: i/o timeout
### finish diagnose node k8s-minion
-------------------------------
Please help.
I think the reason is still that k8s-minion cannot register to ovn-sb, so the geneve tunnel is not set up.
Can you paste the full log of ovn-controller in /var/run/ovn/ovn-controller.log?
Yes, geneve tunnels are not getting setup.
On K8s Master node:
ovs-vsctl show
24d1bc59-e26e-4994-81c0-0e728ed12725
    Bridge br-int
        fail_mode: secure
        Port c0761d0c3217_h
            Interface c0761d0c3217_h
        Port br-int
            Interface br-int
                type: internal
        Port "0edc7189dea0_h"
            Interface "0edc7189dea0_h"
        Port mirror0
            Interface mirror0
                type: internal
        Port "71be2ecb55a3_h"
            Interface "71be2ecb55a3_h"
        Port ovn0
            Interface ovn0
                type: internal
        Port "30218c0139f3_h"
            Interface "30218c0139f3_h"
    ovs_version: "2.13.0"
On K8s Minion node:
ovs-vsctl show
84fc5f05-7271-4667-8816-a7756d9e6c98
    Bridge br-int
        fail_mode: secure
        Port ovn0
            Interface ovn0
                type: internal
        Port "7128f7203e68_h"
            Interface "7128f7203e68_h"
        Port ea46baa2d71f_h
            Interface ea46baa2d71f_h
        Port br-int
            Interface br-int
                type: internal
        Port mirror0
            Interface mirror0
                type: internal
    ovs_version: "2.13.0"
Running a trace on a Pod scheduled on the minion does show the packet getting dropped:
PFA the OVN Controller Logs from master and minion nodes: ovn-controller-master.zip ovn-controller-minion.zip
I wonder if the ovn-controller cannot set the right encap-ip.
You can run ovs-vsctl list open in the ovs-ovn pod and check the external_ids field.
Did you set the iface variable in install.sh? By default, if this variable is not set, kube-ovn-cni will use the IP of the NIC that carries the default route as the encap-ip.
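To illustrate that fallback, here is a rough shell sketch of deriving the would-be encap IP from the default-route NIC. This is an assumption for illustration, not kube-ovn-cni's actual code; the parsing of `ip` output is also an assumption:

```shell
# Sketch (assumption for illustration): mimic the described fallback of
# picking the encap IP from the NIC that carries the default route.
default_iface() {
  # Parse a route line like "default via 10.0.0.1 dev eth0 ..." and
  # print the interface name that follows "dev".
  echo "$1" | awk '{for (i = 1; i < NF; i++) if ($i == "dev") print $(i + 1)}'
}

# Live lookup, guarded so the sketch is harmless where "ip" is unavailable:
if command -v ip >/dev/null 2>&1; then
  iface=$(default_iface "$(ip route show default 2>/dev/null | head -n1)")
  if [ -n "$iface" ]; then
    # First IPv4 address on that interface = the would-be encap-ip
    ip -4 -o addr show dev "$iface" | awk '{print $4}' | cut -d/ -f1
  fi
fi
```

If the printed address does not match ovn-encap-ip in the ovs-ovn pod's external_ids, that would point to the encap-ip being set from the wrong interface.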
On K8s Master:
kubectl exec -it ovs-ovn-tqpfc -n kube-system -- ovs-vsctl list open
_uuid               : 24d1bc59-e26e-4994-81c0-0e728ed12725
bridges             : [f33d791d-cb3a-42ef-b149-990e54ff94e0]
cur_cfg             : 15
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : true
dpdk_version        : "DPDK 19.11.1"
external_ids        : {hostname=k8s-cmk, ovn-encap-ip="172.19.104.78", ovn-encap-type=geneve, ovn-openflow-probe-interval="180", ovn-remote="tcp:10.97.63.130:6642", ovn-remote-probe-interval="10000", rundir="/var/run/openvswitch", system-id=""}
iface_types         : [dpdk, dpdkr, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 15
other_config        : {dpdk-hugepage-dir="/dev/hugepages", dpdk-init="true", dpdk-socket-mem="1024"}
ovs_version         : "2.13.0"
ssl                 : []
statistics          : {}
system_type         : fedora
system_version      : "32"
On K8s Minion:
kubectl exec -it ovs-ovn-6bqjt -n kube-system -- ovs-vsctl list open
_uuid               : 84fc5f05-7271-4667-8816-a7756d9e6c98
bridges             : [7c4731a2-b9c5-426b-b80e-185639e869ac]
cur_cfg             : 9
datapath_types      : [netdev, system]
datapaths           : {}
db_version          : "8.2.0"
dpdk_initialized    : true
dpdk_version        : "DPDK 19.11.1"
external_ids        : {hostname=k8s-minion, ovn-encap-ip="172.19.104.81", ovn-encap-type=geneve, ovn-openflow-probe-interval="180", ovn-remote="tcp:10.97.63.130:6642", ovn-remote-probe-interval="10000", rundir="/var/run/openvswitch", system-id=""}
iface_types         : [dpdk, dpdkr, dpdkvhostuser, dpdkvhostuserclient, erspan, geneve, gre, internal, ip6erspan, ip6gre, lisp, patch, stt, system, tap, vxlan]
manager_options     : []
next_cfg            : 9
other_config        : {dpdk-hugepage-dir="/dev/hugepages", dpdk-init="true", dpdk-socket-mem="1024"}
ovs_version         : "2.13.0"
ssl                 : []
statistics          : {}
system_type         : fedora
system_version      : "32"
No, I did not set the iface variable in install.sh.
Btw, I observe a few differences in the way the container images for OVS-DPDK and OVS are built. Can that be the reason for this issue?
In kube-ovn v1.2.1 (non DPDK), observe the following:
For OVS:
OVS branch used: branch-2.13
Patch applied: curl https://github.com/alauda/ovs/commit/238003290766808ba310e1875157b3d414245603.patch | git apply
For OVN:
OVN branch used: branch-20.03
Patches applied:
curl https://github.com/alauda/ovn/commit/19e802b80c866089af8f7a21512f68decc75a874.patch | git apply
curl https://github.com/oilbeater/ovn/commit/7e49a662d9a9d23d673958564048eee71dc941f0.patch | git apply
curl https://github.com/oilbeater/ovn/commit/9a4460caa1bffe8686c91309248a05f854fc1345.patch | git apply
In kube-ovn v1.2.1 (with DPDK), I observe the following: OVN version: 20.03, OVS version: 2.13.0, DPDK version: 19.11.1.
Also, there are no such patches applied in Dockerfile.dpdk1911.
These patches only work for ovn-northd, not ovndb.
I notice that the system-id is empty. That might be the reason the chassis didn't register to ovn-sb.
The id should be generated automatically at /etc/openvswitch/system-id.conf, but I'm not sure why it wasn't generated in your cluster.
Okay. Please suggest how to fix this.
If the file exists at the host path /etc/origin/openvswitch/system-id.conf but is empty, you can delete the file and then delete the ovs-ovn pod, to see if the id gets regenerated.
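As a purely hypothetical manual workaround (not something from the kube-ovn scripts), one could also generate a system-id by hand, the way a random system-id would be produced; the file path and the ovs-vsctl step are taken from this thread, and the persist/apply commands are left as comments since they only make sense inside the ovs-ovn pod:

```shell
# Hypothetical manual workaround: generate a system-id by hand.
gen_system_id() {
  # uuidgen where available, otherwise fall back to the kernel's uuid source
  uuidgen 2>/dev/null || cat /proc/sys/kernel/random/uuid
}

SYSTEM_ID=$(gen_system_id)
echo "generated system-id: $SYSTEM_ID"

# Inside the ovs-ovn pod one would then persist and apply it (not run here):
#   echo "$SYSTEM_ID" > /etc/openvswitch/system-id.conf
#   ovs-vsctl set Open_vSwitch . external-ids:system-id="$SYSTEM_ID"
```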
The file /etc/openvswitch/system-id.conf exists neither on the host nor in the ovs-ovn pod of the Kube-OVN with DPDK setup.
However, this file does exist in the ovs-ovn pod of the non-DPDK Kube-OVN installation; on the host it exists as /etc/origin/openvswitch/system-id.conf.
Also noticed that a chassis id is present in the earlier posted installation log of the Kube-OVN with DPDK setup, in the ovn-controller log: ... I0727 16:02:07.905665 26334 ovn.go:109] chassis id is 8a251851-87c3-495a-bf82-6a7de9ffdaed
Hmm, that's weird. Can you replace the ovs-ovn image with kubeovn/kube-ovn-dpdk:19.11.2?
I deployed a cluster with this image but couldn't reproduce your issue.
Okay, I will change the image and post the result.
Meanwhile, I observe that the ovs-ctl command in https://github.com/alauda/kube-ovn/blob/v1.2.1/dist/images/start-ovs.sh contains the --system-id=random option. However, there is no such option in https://github.com/alauda/kube-ovn/blob/v1.2.1/dist/images/start-ovs-dpdk.sh.
Can this be the issue?
Oh, yes. I think you found the root cause, and this has been fixed in a later commit.
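For reference, a minimal sketch of what the fix in start-ovs-dpdk.sh would look like; the surrounding ovs-ctl arguments are an assumption based on this thread, and only the added --system-id=random flag is the point:

```shell
# Hypothetical fix sketch for start-ovs-dpdk.sh: pass --system-id=random
# when (re)starting OVS, as the non-DPDK start-ovs.sh does, so that a
# chassis id is generated and the node can register to ovn-sb.
ovs-ctl restart --no-ovs-vswitchd --system-id=random
```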
:) ... Thanks, with 19.11.2 service names are getting resolved. Also, geneve tunnels are getting created.
However, I did observe the following issue in the installation log:
E0728 15:35:10.969898 1 subnet.go:298] failed to update subnet join, Operation cannot be fulfilled on subnets.kubeovn.io "join": the object has been modified; please apply your changes to the latest version and try again
E0728 15:35:10.969933 1 subnet.go:133] error syncing 'join': Operation cannot be fulfilled on subnets.kubeovn.io "join": the object has been modified; please apply your changes to the latest version and try again, requeuing
Can you please help with these as well?
kube-ovn-controller will retry failed events. If the error log doesn't repeat again and again, it should be fine.
fixed in #427
Hi,
I have deployed a multi-node Kubernetes setup with Kube-OVN as the default CNI. I installed Kube-OVN with DPDK support following https://github.com/alauda/kube-ovn/blob/master/docs/dpdk.md, on OpenStack VMs with OVS-DPDK and virtio network interfaces attached to the cluster VMs.
But I am facing an issue when my pods are scheduled on different nodes: they are not able to communicate with each other, even over the Kube-OVN interface. I understand that for the DPDK-based interfaces this communication needs to be configured manually, since the userspace CNI does not support it, but communication over the Kube-OVN interface should work. The same setup works fine when I deploy Kube-OVN without DPDK support.
My environment details:
OS: Ubuntu18 virtual machines over OpenStack with OVS-DPDK
RAM: 16GB
Cores: 8
NIC: Virtio network device
K8s: version 1.13.4
Kube-OVN with DPDK: "v1.3.0-pre"
One thing I observed: there are no geneve ports added to the br-int bridge provided by Kube-OVN when DPDK is enabled.
Kubeovn with DPDK
root@k8s-master:~# ovs-vsctl show
1d99a1eb-1d46-4016-8639-bc00ab08ca83
    Bridge br-int
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port mirror0
            Interface mirror0
                type: internal
        Port "7cd52e4a918f_h"
            Interface "7cd52e4a918f_h"
        Port ovn0
            Interface ovn0
                type: internal
    ovs_version: "2.13.0"
Kubeovn Without DPDK
root@k8s2-master:~# ovs-vsctl show
42591383-8fd4-4d44-b9c9-90be02958d71
    Bridge br-int
        fail_mode: secure
        Port ovn-9f840e-0
            Interface ovn-9f840e-0
                type: geneve
                options: {csum="true", key=flow, remote_ip=}
        Port br-int
            Interface br-int
                type: internal
        Port f7aaa44c4a5c_h
            Interface f7aaa44c4a5c_h
        Port ovn0
            Interface ovn0
                type: internal
        Port ovn-b6bfb4-0
            Interface ovn-b6bfb4-0
                type: geneve
                options: {csum="true", key=flow, remote_ip=}
        Port ovn-7d41af-0
            Interface ovn-7d41af-0
                type: geneve
                options: {csum="true", key=flow, remote_ip=}
        Port mirror0
            Interface mirror0
                type: internal
        Port "92756136d181_h"
            Interface "92756136d181_h"
    ovs_version: "2.13.0"
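A quick way to compare the two setups is to count geneve tunnel interfaces on each node; this sketch just greps "ovs-vsctl show"-style output, which is an assumption for illustration rather than an official kube-ovn check:

```shell
# Sketch: count geneve tunnel interfaces in "ovs-vsctl show" output.
# A healthy multi-node cluster should show one tunnel port per peer node;
# the DPDK nodes above show zero.
geneve_count() {
  # Reads ovs-vsctl-show-style text on stdin
  grep -c 'type: geneve'
}

# Live check, guarded so the sketch is harmless without OVS installed:
if command -v ovs-vsctl >/dev/null 2>&1; then
  ovs-vsctl show | geneve_count || true
fi
```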
Even the kubernetes.default DNS is not reachable when DPDK is enabled in a multi-node K8s environment.
Here is the trace from the dnsutils container:
~# kubectl ko trace default/dnsutils 10.96.0.10 udp 53
udp,reg14=0x6,vlan_tci=0x0000,dl_src=00:00:00:f7:e9:ee,dl_dst=00:00:00:dd:d2:bc,nw_src=10.16.0.6,nw_dst=10.96.0.10,nw_tos=0,nw_ecn=0,nw_ttl=64,tp_src=10000,tp_dst=53
ingress(dp="ovn-default", inport="dnsutils.default")
ct_next(ct_state=new|trk)
ct_lb
egress(dp="ovn-default", inport="dnsutils.default", outport="ovn-default-ovn-cluster")
ct_next(ct_state=est|trk /* default (use --ct to customize) */)
ct_lb
ingress(dp="ovn-cluster", inport="ovn-cluster-ovn-default")
egress(dp="ovn-cluster", inport="ovn-cluster-ovn-default", outport="ovn-cluster-ovn-default")
ingress(dp="ovn-default", inport="ovn-default-ovn-cluster")
ct_lb
egress(dp="ovn-default", inport="ovn-default-ovn-cluster", outport="coredns-86c58d9df4-82qgb.kube-system")
ct_next(ct_state=est|trk /* default (use --ct to customize) */)
ct_lb
Start OVS Tracing
kubectl ko nbctl list load_balancer
_uuid            : f2db64c0-7c90-468f-bf51-9807b93229b2
external_ids     : {}
health_check     : []
ip_port_mappings : {}
name             : cluster-tcp-loadbalancer
protocol         : tcp
selection_fields : []
vips             : {"10.100.21.21:10665"="172.19.104.78:10665", "10.101.2.32:10660"="172.19.104.78:10660", "10.105.152.58:6642"="172.19.104.78:6642", "10.107.165.205:8080"="10.16.0.5:8080", "10.96.0.10:53"="10.16.0.2:53,10.16.0.4:53", "10.96.0.1:443"="172.19.104.78:6443", "10.97.248.180:6641"="172.19.104.78:6641"}

_uuid            : 2a4f9f2f-21cf-4bd6-a35e-d4575e7c9117
external_ids     : {}
health_check     : []
ip_port_mappings : {}
name             : cluster-udp-loadbalancer
protocol         : udp
selection_fields : []
vips             : {"10.96.0.10:53"="10.16.0.2:53,10.16.0.4:53"}
kubectl ko nbctl list logical_switch
_uuid             : 1e169528-a148-436f-811f-b3a83c089e04
acls              : [821479e8-2151-4760-b27a-54d901ddfc70]
dns_records       : []
external_ids      : {}
forwarding_groups : []
load_balancer     : []
name              : join
other_config      : {exclude_ips="100.64.0.1", gateway="100.64.0.1", subnet="100.64.0.0/16"}
ports             : [1a6dcb70-084c-460e-a84f-7505f820f276, 1e4994f0-e749-4c8e-83e8-25502e11b769]
qos_rules         : []

_uuid             : bdc0fac5-6caf-4800-bd23-8aaf80c273ce
acls              : [c9ca44b2-d3a6-4b74-b668-baef0dc32d67]
dns_records       : []
external_ids      : {}
forwarding_groups : []
load_balancer     : [2a4f9f2f-21cf-4bd6-a35e-d4575e7c9117, f2db64c0-7c90-468f-bf51-9807b93229b2]
name              : ovn-default
other_config      : {exclude_ips="10.16.0.1", gateway="10.16.0.1", subnet="10.16.0.0/16"}
ports             : [64a29d31-c9d7-4719-a09b-911d816982af, b150f67a-7e55-4fa2-ad92-63c45e74cd7a, b155c0b3-40fb-41b0-9b9a-8a163bb3497c, ca1a9e8b-78d3-42ce-804d-0e2244316e77, f17ea181-9bc8-4f5d-91e9-e79fc6ddd95c]
qos_rules         : []