telepresenceio / telepresence

Local development against a remote Kubernetes or OpenShift cluster
https://www.telepresence.io

Can not intercept after upgrade #3698

Closed jckos closed 1 month ago

jckos commented 1 month ago

After upgrading from 2.17 to 2.20 we can no longer intercept the Service. In the old version we had Argo Rollouts disabled and it worked fine. Now the desired service is not shown in the telepresence list output (only consumers are shown there; with the old version they were not listed at all). With Argo Rollouts enabled I can see the service name (tpt-front, see screenshots).

Telepresence 2.20 (telepresence list -o with rollouts enabled):

Screenshot 2024-10-07 at 20 54 33

Telepresence 2.17 (telepresence list -o with rollouts disabled):

Screenshot 2024-10-07 at 21 08 08

telepresence intercept tpt-front --workload "tpt-front-5fd89c96f9" --service "tpt-front" --port 80:80 --mount=false :

Screenshot 2024-10-07 at 21 09 39

This also works in 2.18. The intercept stopped working in 2.19.1, although the list output is the same. 2.19.1 also returns: telepresence intercept: error: connector.CreateIntercept: request timed out while waiting for agent tpt-front-5fd89c96f9.tpt-eoc-k3w to arrive

The Rollout :

Screenshot 2024-10-07 at 21 08 53

The logs from 2.20:

telepresence_logs_2024-10-07T20:50:07+02:00.zip

thallgren commented 1 month ago

@jckos we introduced support for Argo Rollouts in 2.20.0 (as an opt-in), and what you're experiencing here might be a regression caused by that, but I'd like to know more about what you mean by "in the old version we had Argo Rollouts disabled". Can you please elaborate on that? Are you intercepting the ReplicaSet directly?

jckos commented 1 month ago

I see, so the switch does not exist in the old version at all. Yes, as shown in the screenshot, telepresence list -o returns the ReplicaSets and I can intercept them. The new version does not return them. I can see the Rollouts (when Rollouts are enabled) but I cannot intercept by running: telepresence intercept tpt-front --workload "tpt-front-5fd89c96f9" --service "tpt-front" --port 80:80 --mount=false

where 'tpt-front-5fd89c96f9' is that ReplicaSet.

thallgren commented 1 month ago

Aha, ok. I'll investigate this further.

thallgren commented 1 month ago

With rollouts enabled, I think you're supposed to intercept the actual ArgoRollout workload, not the ReplicaSet.

jckos commented 1 month ago

That is not working:

telepresence intercept tps-front --workload "tps-front" --service "tps-front" --port 80:80 --mount=false
telepresence intercept: error: connector.CreateIntercept: found no service with a port that matches a container in pod .tps-eoc-k3w

thallgren commented 1 month ago

That's odd, and perhaps another problem? What does the service port declaration look like? And what does the port declaration in the targeted container look like?
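To illustrate why the two port declarations matter, here is a minimal Go sketch of how a service port could be matched against a pod's container ports. It is not Telepresence's actual code: the types `ServicePort` and `ContainerPort` and the function `matchPort` are hypothetical, and the values in `main` are made up. It only assumes the standard Kubernetes convention that a service's targetPort is either a number or the name of a container port, and that it defaults to the service port when omitted.

```go
package main

import (
	"fmt"
	"strconv"
)

// ServicePort is a hypothetical, simplified view of one entry in a Service's ports list.
type ServicePort struct {
	Port       int
	TargetPort string // a number ("8080"), a container port name ("http"), or empty
	Protocol   string
}

// ContainerPort is a hypothetical, simplified view of one entry in a container's ports list.
type ContainerPort struct {
	Name          string
	ContainerPort int
	Protocol      string
}

// matchPort returns the container port that the service port points at, if any.
// A numeric target is compared with the container port number, a named target
// with the container port name. An empty target falls back to the service port.
func matchPort(sp ServicePort, cps []ContainerPort) (ContainerPort, bool) {
	target := sp.TargetPort
	if target == "" {
		target = strconv.Itoa(sp.Port)
	}
	num, err := strconv.Atoi(target)
	numeric := err == nil
	for _, cp := range cps {
		if cp.Protocol != sp.Protocol {
			continue
		}
		if numeric && cp.ContainerPort == num {
			return cp, true
		}
		if !numeric && cp.Name == target {
			return cp, true
		}
	}
	return ContainerPort{}, false
}

func main() {
	// Made-up declarations, not the ones from this issue.
	containerPorts := []ContainerPort{{Name: "http", ContainerPort: 8080, Protocol: "TCP"}}

	_, ok := matchPort(ServicePort{Port: 80, TargetPort: "http", Protocol: "TCP"}, containerPorts)
	fmt.Println("named target matches:", ok)

	_, ok = matchPort(ServicePort{Port: 80, TargetPort: "9090", Protocol: "TCP"}, containerPorts)
	fmt.Println("wrong number matches:", ok)
}
```

If no service port resolves to a declared container port under a rule like this, an error along the lines of "found no service with a port that matches a container" would be the expected outcome.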

jckos commented 1 month ago

The Service and the ReplicaSet Pod are attached: service.zip. We are trying to redirect the nginx traffic to local port 80; this stopped working in 2.19.1. It seems that the traffic agent cannot be injected. It always stops the pods in the ReplicaSet and creates new ones, but without the agent. I did not see anything more in the Traffic Manager Pod logs.

I have also tried adding the annotation telepresence.getambassador.io/inject-traffic-agent: enabled, but telepresence intercept tps-front --workload "tps-front-7475d57cc9" --service "tps-front" --mount=false then fails too.

In the logs I only saw some issues with probes (I later disabled them in the Deployment too, with the same results), e.g.: httpd/conn=127.0.0.1:8081 : Warning Unhealthy Liveness probe failed: : session_id="982b8f80-0a69-4105-bfc2-3f20b233b69b"

thallgren commented 1 month ago

@jckos I'm currently working on a 2.20.1 release and I think I've solved the first problem where the replicaset doesn't show up in the list. It should show up now, unless you enable ArgoRollouts, in which case it shouldn't (because it's then owned by a supported workload).

Can you test with this release and, just before you reproduce the error, please also run telepresence loglevel debug? I'm mostly interested in the debug logs from the traffic-manager because they should tell us more about why no matching pod is found for the intercepted service.

The 2.20.1-rc.0 release is available for download.

jckos commented 1 month ago

Yes, the RS is shown in the list but the intercept fails:

telepresence intercept tpt-front-579bb4d484 --service "tpt-front" --mount=false

telepresence_logs_2024-10-09T22:23:58+02:00.zip

thallgren commented 1 month ago

Thanks. I see that this is similar to the regression that I just fixed in the client, but this time it's in the traffic-manager code. It should really be satisfied with the ReplicaSet owner here, but it continues to the ArgoRollout and then fails, because argo-rollouts aren't enabled. I'll look into this now.

2024-10-09 20:19:09.0593 debug   agent-injector : Handling admission request CREATE tpt-front-579bb4d484-.tpt-eoc-k3w
2024-10-09 20:19:09.0593 debug   agent-injector : FindOwnerWorkload(tpt-front-579bb4d484-,tpt-eoc-k3w,Pod)
2024-10-09 20:19:09.0593 debug   agent-injector : GetWorkload(tpt-front,tpt-eoc-k3w,Deployment)
2024-10-09 20:19:09.0593 debug   agent-injector : GetWorkload(tpt-front-579bb4d484,tpt-eoc-k3w,ReplicaSet)
2024-10-09 20:19:09.0593 debug   agent-injector : FindOwnerWorkload(tpt-front-579bb4d484,tpt-eoc-k3w,ReplicaSet)
2024-10-09 20:19:09.0593 debug   agent-injector : GetWorkload(tpt-front,tpt-eoc-k3w,Rollout)
2024-10-09 20:19:09.0594 debug   agent-injector : No workload owner found for pod tpt-front-579bb4d484-.tpt-eoc-k3w

Would it be possible for you to provide similar logs when having argo-rollouts enabled and attempting to intercept the actual rollout?
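The intended behavior described above (stop at the ReplicaSet when its owner is of a kind that isn't enabled) can be sketched in Go as follows. This is an illustrative sketch only, not the traffic-manager's implementation: the `Workload` type and the functions `findOwnerWorkload` and `describe` are hypothetical, and the set of enabled kinds is an assumption made for the example.

```go
package main

import "fmt"

// Workload is a hypothetical, simplified stand-in for a Kubernetes object
// that has at most one owner.
type Workload struct {
	Kind  string
	Name  string
	Owner *Workload
}

// findOwnerWorkload walks up the owner chain starting at w and returns the
// topmost workload whose kind is enabled. It stops at the first owner whose
// kind is not enabled instead of failing, so a ReplicaSet owned by a Rollout
// is still returned when Rollout support is switched off.
func findOwnerWorkload(w *Workload, enabled map[string]bool) *Workload {
	var best *Workload
	for cur := w; cur != nil; cur = cur.Owner {
		if !enabled[cur.Kind] {
			break
		}
		best = cur
	}
	return best
}

// describe renders a workload as "Kind/Name", or "none" when nothing was found.
func describe(w *Workload) string {
	if w == nil {
		return "none"
	}
	return w.Kind + "/" + w.Name
}

func main() {
	// Names taken from the log excerpt; the ownership relation is as discussed in the thread.
	rollout := &Workload{Kind: "Rollout", Name: "tpt-front"}
	rs := &Workload{Kind: "ReplicaSet", Name: "tpt-front-579bb4d484", Owner: rollout}

	withoutRollouts := map[string]bool{"Deployment": true, "ReplicaSet": true, "StatefulSet": true}
	withRollouts := map[string]bool{"Deployment": true, "ReplicaSet": true, "StatefulSet": true, "Rollout": true}

	fmt.Println(describe(findOwnerWorkload(rs, withoutRollouts)))
	fmt.Println(describe(findOwnerWorkload(rs, withRollouts)))
}
```

With Rollout support disabled the lookup settles on the ReplicaSet; with it enabled the Rollout becomes the owner, which also matches the earlier remark that the ReplicaSet should then no longer appear in the list.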

thallgren commented 1 month ago

I just published 2.20.1-rc.1 with some additional fixes.

jckos commented 1 month ago

Voila, it is working now (Version : v2.20.1-rc.1):

Using ReplicaSet tps-front-5dc84dbd97
   Intercept name         : tps-front-5dc84dbd97
   State                  : ACTIVE
   Workload kind          : ReplicaSet
   Destination            : 127.0.0.1:8080
   Service Port Identifier: 80/TCP
   Intercepting           : all TCP connections

Thank You!