microsoft / retina

eBPF distributed networking observability tool for Kubernetes
https://retina.sh
MIT License
2.69k stars 201 forks

Kubernetes cannot scrape the operator pod (tries port 80, but the operator uses port 8080)? #738

Open rbtr opened 2 weeks ago

rbtr commented 2 weeks ago

Discussed in https://github.com/microsoft/retina/discussions/734

Originally posted by **kastl-ars** September 12, 2024

Hi all,

I just installed Retina and kube-prometheus-stack according to the documentation. But one of the retina-pods targets is unhealthy according to Prometheus. It tries to get the metrics for the operator pod on port 80:

```
Get "http://10.42.0.16:80/metrics": dial tcp 10.42.0.16:80: connect: connection refused
```

However, the operator pod seems to listen on port 8080?

```
[...]
ts=2024-09-12T11:23:41.024Z level=info caller=legacy/deployment.go:251 msg="Starting manager"
2024-09-12T11:23:41.024Z info controller-runtime.metrics Starting metrics server
2024-09-12T11:23:41.024Z info controller-runtime.metrics Serving metrics server {"bindAddress": ":8080", "secure": false}
2024-09-12T11:23:41.024Z info starting server {"name": "health probe", "addr": "[::]:8081"}
[...]
```

This is the values.yaml I used for the installation:

```
image:
  tag: 'v0.0.16'
operator:
  enabled: true
  tag: 'v0.0.16'
  enableRetinaEndpoint: true
loglevel: 'info'
enabledPlugin_linux: "[dropreason,packetforward,linuxutil,dns,packetparser]"
enablePodLevel: true
remoteContext: true
```

Any ideas?

Kind Regards, Johannes

whatnick commented 1 week ago

The legacy Helm chart added a ServiceMonitor via #695, which should be the recommended way to add Prometheus scrape jobs for all the retina-agent pods in the DaemonSet. That should work if the service is created properly and all agent pods run on port 8080. To clarify: is this the Hubble Helm chart?
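
For anyone triaging the same symptom, the port mismatch from the original post can be confirmed mechanically from the two messages quoted there. The following is a minimal sketch, not part of Retina or Prometheus: the helper names `scraped_port` and `bound_port` are hypothetical, and it assumes the scrape error contains an `http://host:port/` URL and that the controller-runtime log line ends in a JSON object with a `bindAddress` field, as in the quoted output.

```python
import json
import re

# Messages copied verbatim from the original post.
SCRAPE_ERROR = (
    'Get "http://10.42.0.16:80/metrics": '
    "dial tcp 10.42.0.16:80: connect: connection refused"
)
OPERATOR_LOG = (
    "2024-09-12T11:23:41.024Z info controller-runtime.metrics "
    'Serving metrics server {"bindAddress": ":8080", "secure": false}'
)


def scraped_port(error: str) -> int:
    """Return the port Prometheus tried, read from the URL in the scrape error."""
    match = re.search(r'http://[^/:"]+:(\d+)/', error)
    if match is None:
        raise ValueError("no http://host:port/ URL found in scrape error")
    return int(match.group(1))


def bound_port(log_line: str) -> int:
    """Return the port the metrics server bound to, read from the JSON suffix."""
    start = log_line.find("{")
    if start == -1:
        raise ValueError("no JSON object found in log line")
    fields = json.loads(log_line[start:])
    # bindAddress looks like ":8080"; take everything after the last colon.
    return int(fields["bindAddress"].rsplit(":", 1)[1])


if __name__ == "__main__":
    tried = scraped_port(SCRAPE_ERROR)
    bound = bound_port(OPERATOR_LOG)
    print(f"Prometheus tried port {tried}, metrics server bound to port {bound}")
    print("mismatch" if tried != bound else "ports agree")
```

With the messages from this issue, the script reports port 80 against port 8080. That points at the scrape target definition (the scrape job or ServiceMonitor) rather than at the operator itself.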