r7vme closed this issue 6 years ago.
tl;dr: the problem is not in the Kubernetes client, but in the load balancer settings.
I did some research on geckon, and it looks like the root cause of this issue is not the k8s client timeout, but the fact that we use load balancers: the TCP connection just stays in the ESTABLISHED state for a while (until the load balancer drops it). This reduces to zero the benefit of the TCP keepalive that k8s uses.
The change that was made for this issue is beneficial only in the case of a slow/frozen API server; it works at the HTTP level (not TCP) and does not fix the initial issue. But I'll leave the change in, as it is a reasonable timeout for our use case.
Back to the initial issue. The initial timeout was ~11 minutes on geckon. geckon uses haproxy as a load balancer, which had 10-minute client and server timeouts. This was the reason for the ~11 minutes.
Set quite small timeouts (30 sec) for regular TCP connections, and use 1 hour for tunneled connections (e.g. kubectl exec, kubectl logs):
```
timeout client 30s
timeout client-fin 30s
timeout tunnel 1h
timeout server 30s
```
Reopening until released.
There was an issue where the guest cluster API was not available and the controller (the k8s client) waited a few minutes for a response. To avoid this we need to set an aggressive HTTP timeout, e.g. 30 sec.
https://github.com/kubernetes/client-go/blob/master/rest/config.go#L114