hterik opened this issue 2 years ago
You need to have TCP keep-alives or some small amount of TCP traffic to keep the connection alive. However, it's also worth noting that `exec` was never really meant for long-running connections; there are much better ways to achieve the same thing via other types of APIs.
Thank you.
Where does one enable TCP keep-alive? As far as I know this is normally configured on the socket, but such options are not exposed all the way up through the kubernetes-client APIs. I can see that the WebSocket constructor takes a `sockopt` argument as input, but `kubernetes.stream.ws_client.create_websocket` does not set this argument.
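For reference, this is roughly what passing socket options through websocket-client's own `create_connection` looks like (illustration only: the URL is a placeholder, the `TCP_KEEP*` constants are Linux-specific, and as noted above the kubernetes client currently gives no way to supply these):

```python
import socket
import websocket  # websocket-client, the library kubernetes.stream builds on

# Socket options in the (level, option, value) form that
# websocket-client forwards to socket.setsockopt().
sockopt = [
    (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),     # turn keep-alive on
    (socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60),   # Linux: idle secs before first probe
    (socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10),  # Linux: secs between probes
    (socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3),     # Linux: failed probes before drop
]

# Placeholder URL; kubernetes.stream.ws_client.create_websocket does not
# currently expose a way to pass sockopt through like this.
ws = websocket.create_connection("wss://example.invalid/exec", sockopt=sockopt)
```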
Also curious which alternative APIs you suggest that are better than `exec`? (I understand it's better if pods isolate their own workloads, but we have some scenarios where we have to initiate longer operations via external triggers on already running containers, which otherwise take very long to load their dataset and re-initialize. Building our own application-level API in the container is something we want to avoid, and it is not really possible since the application is from a third party.)
If `exec` is not supported, it would be good if the limitations of using it were explained in the docs.
I can add that we managed to get a stable solution by adding pings. We also had to replace `readline_stdout(timeout=1)` with `update()` + `peek_stdout()` + `read_stdout()`, exactly as in the example at https://github.com/kubernetes-client/python/blob/master/examples/pod_exec.py, plus pings. While this works, it's very intricate and easy to get wrong; an easier solution would be good. A sketch of what we ended up with follows below.
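A minimal sketch of that loop (assumes a WSClient from `kubernetes.stream.stream` with `_preload_content=False`; pod name, namespace, command, and the 30 s ping interval are placeholders of our choosing):

```python
import time
from kubernetes import client, config
from kubernetes.stream import stream

config.load_kube_config()
api = client.CoreV1Api()

# _preload_content=False returns a WSClient that we can drive manually.
resp = stream(
    api.connect_get_namespaced_pod_exec,
    "my-pod", "default",  # placeholders
    command=["/bin/sh", "-c", "some-long-running-command"],  # placeholder
    stderr=True, stdin=False, stdout=True, tty=False,
    _preload_content=False,
)

last_ping = time.monotonic()
while resp.is_open():
    resp.update(timeout=1)            # pump the websocket for up to 1s
    if resp.peek_stdout():
        print(resp.read_stdout(), end="")
    if resp.peek_stderr():
        print(resp.read_stderr(), end="")
    if time.monotonic() - last_ping > 30:
        resp.sock.ping()              # keep the idle connection alive
        last_ping = time.monotonic()
resp.close()
```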
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/remove-lifecycle rotten
@hterik Are you able to share your solution? I am facing the same issue.
/lifecycle rotten
/remove-lifecycle rotten
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen
@hterik: Reopened this issue.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
/reopen /remove-lifecycle rotten
@hterik: Reopened this issue.
What happened (please include outputs or screenshots): After exactly 300 sec of idle, exec commands streamed with `stream` lose their connection, even though the command is still running in the pod (verifiable with `kubectl exec POD_NAME ps aux`). The following stack trace is given:
What you expected to happen:
What you expected to happen:
How to reproduce it (as minimally and precisely as possible): (mostly inspired by https://github.com/kubernetes-client/python/blob/master/examples/pod_exec.py; a sketch follows below)
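For context, a sketch of the kind of reproduction I mean (pod name, namespace, and command are placeholders; the command just stays silent longer than the 300 sec timeout):

```python
from kubernetes import client, config
from kubernetes.stream import stream

config.load_kube_config()
api = client.CoreV1Api()

resp = stream(
    api.connect_get_namespaced_pod_exec,
    "my-pod", "default",  # placeholders
    command=["/bin/sh", "-c", "sleep 400 && echo done"],  # silent past 300s
    stderr=True, stdin=False, stdout=True, tty=False,
    _preload_content=False,
)
while resp.is_open():
    # readline_stdout returns None when the timeout elapses with no line;
    # without any keep-alive traffic the connection drops after ~300s idle.
    line = resp.readline_stdout(timeout=1)
    if line:
        print(line)
resp.close()
```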
Anything else we need to know?: I've experimented with adding `s.sock.ping()` in each iteration of the `tail_logs` function, and it appears to improve the situation, but sometimes it still gets stuck inside `readline_stdout` despite the `timeout=1` parameter being set. :raised_eyebrow: It gets stuck for so long that the ping never has a chance to run within the next 300 sec timeout. Does one need to run `ping` in a separate thread for this to work? That's how it is done inside `WebSocketApp`, as used by the long-lived connection example from the websocket library. Can one use `WebSocketApp` together with `stream`, or do I need to implement a similar ping feature myself when using `stream`?
I'm also curious where the 300 sec comes from. Is there a way to discover this time through the connection handshake, or is it a setting in my cluster/Azure?
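To make the separate-thread question concrete, something like this untested sketch is what I have in mind (the 30 s interval is arbitrary; I'm assuming it is safe to call `ping()` from a second thread while the main thread reads, which is how `WebSocketApp`'s own ping thread behaves):

```python
import threading
import time
import websocket

def start_ping_thread(ws_client, interval=30):
    """Ping a kubernetes.stream WSClient's underlying socket from a daemon thread."""
    def _pinger():
        while ws_client.is_open():
            try:
                ws_client.sock.ping()
            except websocket.WebSocketException:
                break  # connection already gone; let the main loop notice
            time.sleep(interval)

    t = threading.Thread(target=_pinger, daemon=True)
    t.start()
    return t
```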
Environment: