contiv / netplugin

Container networking for various use cases
Apache License 2.0

Access to K8S apiserver's clusterIP does not work with apiserver setup with HA #1152

Open mirwan opened 5 years ago

mirwan commented 5 years ago

Symptoms

Even though the contiv components are running, no other pod can access the apiserver via its clusterIP because of a "connection refused" error (the affected pods end up in CrashLoopBackOff).

Explanation

In https://github.com/contiv/netplugin/blob/release-1.2/mgmtfn/k8splugin/kubeClient.go#L93, the serverURL variable (filled with the URL, i.e. IP+port, of the VIP that the apiservers sit behind, as defined in the configMap) is used to set the APIClient's apiServerPort field.

In the WatchServices function, the kubernetes svc takes the branch at https://github.com/contiv/netplugin/blob/release-1.2/mgmtfn/k8splugin/kubeClient.go#L323 and its ProvPort field is set to this value.
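As a minimal sketch of the failure mode described above (hypothetical function name, not netplugin's actual code): the port is parsed out of the configured K8S_API_SERVER URL, which is the VIP's port, and then reused verbatim as the provider port for the real master endpoints.

```go
package main

import (
	"fmt"
	"net/url"
)

// providerPortFromConfig is a hypothetical stand-in for how the
// apiserver port ends up being derived from the configured VIP URL.
// It returns the VIP's port (e.g. "443"), even though the individual
// apiservers may listen on a different port (e.g. 6443).
func providerPortFromConfig(serverURL string) (string, error) {
	u, err := url.Parse(serverURL)
	if err != nil {
		return "", err
	}
	return u.Port(), nil
}

func main() {
	port, err := providerPortFromConfig("https://external:443")
	if err != nil {
		panic(err)
	}
	// The VIP port is what gets programmed as ProvPort,
	// regardless of the port the masters actually listen on.
	fmt.Printf("ProvPort=%s\n", port)
}
```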

When the apiservers' external VIP listens on a different port (here 443) than the apiservers themselves (e.g. 6443),

  contiv_k8s_config: |-
    {
       "K8S_API_SERVER": "https://external:443",
       ...
    }

the ovs flow wrongly redirects traffic to the master's IP (1.2.3.4 here) on this port instead of 6443:

ovs-appctl bridge/dump-flows contivVlanBridge | grep 172.31.192.1
table_id=3, duration=4118s, n_packets=20, n_bytes=1480, priority=100,tcp,nw_src=172.31.128.24,nw_dst=172.31.192.1,tp_dst=443,actions=set_field:443->tcp_dst,set_field:1.2.3.4->ip_dst,goto_table:4
table_id=3, duration=4141s, n_packets=3, n_bytes=222, priority=10,tcp,nw_dst=172.31.192.1,actions=CONTROLLER:65535
table_id=6, duration=4118s, n_packets=19, n_bytes=1140, priority=100,tcp,nw_src=1.2.3.4,nw_dst=172.31.128.24,tp_src=443,actions=set_field:443->tcp_src,set_field:172.31.192.1->ip_src,goto_table:7

NB: 172.31.192.1 is the clusterIP of svc kubernetes
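The table_id=3 flow above rewrites the destination to the master's IP while keeping the VIP's port (443). A small sketch of the rewrite that would be needed instead, under the assumption that the backend port is taken from the resolved endpoint rather than from the configured server URL (hypothetical names, not netplugin code):

```go
package main

import "fmt"

// endpoint is a hypothetical stand-in for a resolved apiserver backend.
type endpoint struct {
	ip   string
	port int
}

// dnatAction builds the OVS set_field action string for the service
// rewrite. The port must come from the backend endpoint (6443 here),
// not from the VIP the service was configured with (443).
func dnatAction(ep endpoint) string {
	return fmt.Sprintf("set_field:%d->tcp_dst,set_field:%s->ip_dst,goto_table:4",
		ep.port, ep.ip)
}

func main() {
	fmt.Println(dnatAction(endpoint{ip: "1.2.3.4", port: 6443}))
	// set_field:6443->tcp_dst,set_field:1.2.3.4->ip_dst,goto_table:4
}
```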

Workaround