k8sgpt-ai / k8sgpt-operator

Automatic SRE Superpowers within your Kubernetes cluster
https://k8sgpt.ai
Apache License 2.0

[BUG]: old pod IP is used when k8sgpt-deployment pod is restarted #136

Closed by jkleinlercher 1 year ago

jkleinlercher commented 1 year ago

K8sGPT Version

v0.0.3

Kubernetes Version

No response

Host OS and its Version

No response

Steps to reproduce

  1. Restart the k8sgpt-deployment pod.
  2. Look at the operator logs:

kubectl logs deployment/k8sgpt-k8sgpt-operator-controller-manager

2023-05-31T20:03:39Z    ERROR   Reconciler error        {"controller": "k8sgpt", "controllerGroup": "core.k8sgpt.ai", "controllerKind": "K8sGPT", "K8sGPT": {"name":"k8sgpt-sample","namespace":"sx-k8sgpt"}, "namespace": "sx-k8sgpt", "name": "k8sgpt-sample", "reconcileID": "c5a51cb1-4a4b-49f8-af9f-db04e0e3f057", "error": "failed to call Analyze RPC: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 10.130.3.224:8080: connect: no route to host\""}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler

Expected behaviour

The controller should be resilient to k8sgpt-deployment pod restarts. Why does it use the pod IP in https://github.com/k8sgpt-ai/k8sgpt-operator/blob/075caf5e0999d4ef92027656f3d26ab8bdbfcdef/controllers/k8sgpt_controller.go#L182 instead of the k8sgpt service? A pod gets a new IP when it is recreated, so a cached pod IP stops working after a restart, while the service address stays stable.
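
To illustrate the suggestion, here is a minimal sketch of deriving the dial target from a Service's cluster DNS name rather than from a pod IP. It is not the operator's actual code: the helper `serviceAddress` and the Service name `k8sgpt` are hypothetical, the namespace and port are taken from the log above, and the `<service>.<namespace>.svc` form assumes the default in-cluster DNS search path.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// serviceAddress is a hypothetical helper that builds a "host:port" target
// from a Service's in-cluster DNS name. The name resolves to the Service,
// which keeps routing to whichever pod currently backs it, so the target
// stays valid across pod restarts.
func serviceAddress(service, namespace string, port int) string {
	host := fmt.Sprintf("%s.%s.svc", service, namespace)
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	// "k8sgpt" is an assumed Service name; "sx-k8sgpt" and 8080 come from
	// the error log in this report.
	fmt.Println(serviceAddress("k8sgpt", "sx-k8sgpt", 8080))
}
```

The resulting string could be handed to the gRPC dialer in place of the pod IP and port, provided DNS resolution works from the operator pod.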

Actual behaviour

No response

Additional Information

No response

AlexsJones commented 1 year ago

There was a DNS resolution issue when I was building this, so we can probably change it back. I'll take a look.