telepresenceio / telepresence

Local development against a remote Kubernetes or OpenShift cluster
https://www.telepresence.io

"panic: no ImageRetriever has been configured error" from traffic-manager #3706

Closed bbensky closed 1 month ago

bbensky commented 1 month ago

Describe the Bug

Running telepresence status or telepresence version from the local client causes the traffic-manager in the cluster to crash.

To Reproduce

We deployed the telepresence v2.20.1 OSS chart into a Kubernetes cluster, and are running v2.20.1 of the client locally.

We are only using telepresence for the proxy/forwarding feature, and restricting traffic-manager to specific namespaces. So the only changes to the default Helm values are:

agentInjector:
  enabled: false
managerRbac:
  namespaced: true
  namespaces:
    - "traffic-manager"
    - "staging"

We connect our local client to the cluster with: telepresence connect --namespace staging --manager-namespace traffic-manager, and then run telepresence status.

The traffic-manager pod crashes and restarts after these logs:

traffic-manager-76955fc599-d2fxt traffic-manager 2024-10-11 20:29:39.7014 debug   httpd/conn=127.0.0.1:8081 : LookupDNS on traffic-manager: lb._dns-sd._udp.traffic-manager. PTR -> SERVFAIL rpc error: code = Internal desc = "lb._dns-sd._udp.traffic-manager." is neither a valid IP-address or a valid reverse notation : session_id="35bb9aa1-2f5d-4c44-a20a-ba0a725766e9"
traffic-manager-76955fc599-d2fxt traffic-manager 2024-10-11 20:33:01.9509 info    Logging at this level "info"
traffic-manager-76955fc599-d2fxt traffic-manager 2024-10-11 21:03:40.0635 error   httpd/conn=127.0.0.1:8081 : goroutine 8440 [running]:
traffic-manager-76955fc599-d2fxt traffic-manager runtime/debug.Stack()
traffic-manager-76955fc599-d2fxt traffic-manager        runtime/debug/stack.go:26 +0x5e
traffic-manager-76955fc599-d2fxt traffic-manager github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/managerutil.GetAgentImage({0x3830388, 0xc0012de060})
traffic-manager-76955fc599-d2fxt traffic-manager        github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/managerutil/agentimage.go:76 +0x65
traffic-manager-76955fc599-d2fxt traffic-manager github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager.(*service).GetAgentImageFQN(0x4cdb460?, {0x3830388?, 0xc0012de060?}, 0x12aad05?)
traffic-manager-76955fc599-d2fxt traffic-manager        github.com/telepresenceio/telepresence/v2/cmd/traffic/cmd/manager/service.go:142 +0x25
traffic-manager-76955fc599-d2fxt traffic-manager github.com/telepresenceio/telepresence/rpc/v2/manager._Manager_GetAgentImageFQN_Handler({0x34600a0, 0xc0011fd9e0}, {0x3830388, 0xc0012de060}, 0xc0010f3400, 0x0)
traffic-manager-76955fc599-d2fxt traffic-manager        github.com/telepresenceio/telepresence/rpc/v2@v2.20.1/manager/manager_grpc.pb.go:1014 +0x1a6
traffic-manager-76955fc599-d2fxt traffic-manager google.golang.org/grpc.(*Server).processUnaryRPC(0xc001372000, {0x3830388, 0xc0007b0570}, {0x3841080, 0xc000ab29a0}, 0xc0006517a0, 0xc001351170, 0x4cfbf18, 0x0)
traffic-manager-76955fc599-d2fxt traffic-manager        google.golang.org/grpc@v1.67.0/server.go:1394 +0xe2b
traffic-manager-76955fc599-d2fxt traffic-manager google.golang.org/grpc.(*Server).handleStream(0xc001372000, {0x3841080, 0xc000ab29a0}, 0xc0006517a0)
traffic-manager-76955fc599-d2fxt traffic-manager        google.golang.org/grpc@v1.67.0/server.go:1805 +0xe8b
traffic-manager-76955fc599-d2fxt traffic-manager google.golang.org/grpc.(*Server).serveStreams.func2.1()
traffic-manager-76955fc599-d2fxt traffic-manager        google.golang.org/grpc@v1.67.0/server.go:1029 +0x7f
traffic-manager-76955fc599-d2fxt traffic-manager created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 8455
traffic-manager-76955fc599-d2fxt traffic-manager        google.golang.org/grpc@v1.67.0/server.go:1040 +0x125
traffic-manager-76955fc599-d2fxt traffic-manager
traffic-manager-76955fc599-d2fxt traffic-manager panic: no ImageRetriever has been configured

Aside from this issue, we are able to use the port-forward functionality (i.e. we can curl endpoints of services in the staging namespace).

Expected behavior

Based on the instructions at https://www.getambassador.io/docs/telepresence/latest/howtos/outbound#proxying-outbound-traffic, the output of telepresence status should include the Traffic Agent image at the bottom. When we run the command, the output stops at the Version line.


thallgren commented 1 month ago

I'm able to reproduce this. It's a regression introduced when the status and version commands started requesting the traffic-agent version from the traffic-manager. With agentInjector.enabled=false, there's no such thing as a traffic-agent.

I'll create a patch release with a fix for this a.s.a.p.

thallgren commented 1 month ago

The 2.20.2-rc.0 release candidate is available for download now, in case you want to give it a try.

Please also note that your expectation is incorrect. The "Traffic Agent : not currently available" is to be expected when you're using agentInjector.enabled=false.

bbensky commented 1 month ago

Apologies for the terminology mistake. I tried the release candidate, and it looks like the traffic-manager pod no longer errors or crashes when I run the telepresence status and telepresence version commands. Thanks for getting the fix up so quickly.