Configured the k8s keepalived (kube-keepalived-vip) daemonset and my deployment/service.
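For reference, the VIP is mapped to the service roughly like this (a sketch from memory of the kube-keepalived-vip README; the ConfigMap name and flag below are assumptions, not a paste from my cluster):
# ConfigMap consumed by the kube-keepalived-vip daemonset:
# each key is a VIP, each value is namespace/serviceName
kubectl create configmap vip-configmap --from-literal=10.75.117.72=default/dde-ext-io
# the daemonset container is then started with --services-configmap=default/vip-configmap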
[root@bcmt-01-control-01 ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
ddeio-6b4df765b4-6wgj7 1/1 Running 0 5s 192.168.1.84 bcmt-01-worker-01
ddeio-6b4df765b4-bwk92 1/1 Running 0 5s 192.168.1.26 bcmt-01-worker-02
kube-keepalived-vip-bsmsv 1/1 Running 15 1d 172.16.1.11 bcmt-01-worker-01
kube-keepalived-vip-dkfd2 1/1 Running 16 1d 172.16.1.12 bcmt-01-worker-02
[root@bcmt-01-control-01 ~]# kubectl get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
dde-ext-io NodePort 10.254.9.232 3868:3868/TCP 2m app=ddeio
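This is roughly how the service was exposed (a sketch, not the exact manifest; note that nodePort 3868 is below the default 30000-32767 NodePort range, so the apiserver's --service-node-port-range has to allow it):
# expose the deployment, then pin the nodePort to 3868
kubectl expose deployment ddeio --name=dde-ext-io --type=NodePort --port=3868 --target-port=3868
kubectl patch svc dde-ext-io -p '{"spec":{"ports":[{"port":3868,"protocol":"TCP","nodePort":3868}]}}'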
The service is available via the 10.75.117.72 VIP.
[root@bcmt-01-edge-01 ~]# netstat -anc | grep -w 3868
tcp 0 524 10.75.117.81:50842 10.75.117.72:3868 ESTABLISHED
tcp6 0 0 :::3868 :::* LISTEN
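Both workers answer on the NodePort directly as well, not only via the VIP; a quick check from the edge node (assuming nc is available there):
nc -z -w1 172.16.1.11 3868 && echo worker-01 ok
nc -z -w1 172.16.1.12 3868 && echo worker-02 ok
nc -z -w1 10.75.117.72 3868 && echo VIP ok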
Out of the two worker nodes, the VIP is plumbed on worker-01:
[root@bcmt-01-worker-01 ~]# ip a
.....
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 8950 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:02:02:93 brd ff:ff:ff:ff:ff:ff
inet 172.16.1.11/24 brd 172.16.1.255 scope global dynamic eth1
valid_lft 83687sec preferred_lft 83687sec
inet 10.75.117.72/32 scope global eth1
valid_lft forever preferred_lft forever
After this, with an existing load session in progress (traffic being pumped), I reboot worker-01. The VIP gets plumbed on the other node within 2-3 seconds, but when the client tries to connect again it cannot; the service only becomes available again after about 23 seconds, and then traffic resumes.
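The 2-3 second figure for the VIP movement comes from watching the surviving worker; a loop like this prints a timestamp the moment the address appears there (a sketch, run on worker-02):
# wait for the VIP to show up on this node's eth1, then print the time
while ! ip -4 addr show dev eth1 | grep -q '10.75.117.72'; do sleep 0.2; done; date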
Wed 2 Jan 07:15:14 UTC 2019
tcp 0 584 10.75.117.81:52702 10.75.117.72:3868 ESTABLISHED
Wed 2 Jan 07:15:15 UTC 2019
tcp 0 1109 10.75.117.81:52702 10.75.117.72:3868 FIN_WAIT1   <- this is at the time of the worker-01 reboot
Then....
Wed 2 Jan 07:15:17 UTC 2019
tcp 0 1 10.75.117.81:52700 10.75.117.72:3868 SYN_SENT
tcp 0 1109 10.75.117.81:52702 10.75.117.72:3868 FIN_WAIT1
.....
and so on
.....
Finally,
Wed 2 Jan 07:15:38 UTC 2019
tcp 0 1 10.75.117.81:52692 10.75.117.72:3868 SYN_SENT
tcp 0 1 10.75.117.81:52700 10.75.117.72:3868 SYN_SENT
tcp 0 0 10.75.117.81:52690 10.75.117.72:3868 ESTABLISHED
tcp 0 1 10.75.117.81:52696 10.75.117.72:3868 SYN_SENT
tcp 0 1 10.75.117.81:52694 10.75.117.72:3868 SYN_SENT
tcp 0 1 10.75.117.81:52698 10.75.117.72:3868 SYN_SENT
tcp 0 1109 10.75.117.81:52702 10.75.117.72:3868 FIN_WAIT1
So it took 23 seconds (07:15:15 to 07:15:38) for a connection to finally be re-established. In some cases I have seen it take even 30-40 seconds.
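The 23 seconds above is just the gap read off the netstat trace; a small loop on the client/edge node gives the same measurement without eyeballing netstat (a sketch, assumes nc):
# timestamp each connect attempt to the VIP until one succeeds
while true; do
  date +%T
  nc -z -w1 10.75.117.72 3868 && { echo connected; break; }
  sleep 1
done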
Since my service is exposed via NodePort and the IP movement also happened within 2-3 seconds, shouldn't the service be available immediately after the IP movement, given that the app pod on the surviving worker node can serve the traffic? I tried this test numerous times to see if I am missing something. Finally, I tried https://github.com/munnerz/keepalived-cloud-provider, which exposes the service as a LoadBalancer rather than a NodePort, and with it the service became reachable within 4 seconds after a node reboot. But, alas, we can use the LB approach only on bare metal and not on cloud.
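For the keepalived-cloud-provider test, the only change on the service side is the type; roughly (a sketch, the service name here is made up, and the EXTERNAL-IP is then allocated and announced by the cloud provider rather than plumbed by the daemonset):
kubectl expose deployment ddeio --name=dde-ext-io-lb --type=LoadBalancer --port=3868 --target-port=3868
kubectl get svc dde-ext-io-lb -o wide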
Can you please check this? Such a high failover time on node reboot certainly won't fit the bill.
PS: Even when exposing the service via ExternalIP, I see the same behaviour.
PS: When I kill one of the TWO app pods, rather than rebooting a node, the service is available almost instantly (since the VIP did not have to be re-plumbed).
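That pod-kill comparison boils down to something like this (a sketch, using one of the pod names from the listing above):
kubectl delete pod ddeio-6b4df765b4-6wgj7
# meanwhile, on the edge node, watch the reconnect
netstat -anc | grep -w 3868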