voyagermesh / voyager

🚀 Secure L7/L4 (HAProxy) Ingress Controller for Kubernetes
https://voyagermesh.com
Apache License 2.0

Prometheus Monitoring causes panic "invalid memory address or nil pointer dereference" #1575

Open dgradl-fl opened 3 years ago

dgradl-fl commented 3 years ago

When I enable monitoring as described here: https://voyagermesh.com/docs/v12.0.0/guides/ingress/monitoring/using-coreos-prometheus-operator/ for v12.0.0, the operator crashes (see the attached crash.txt). From what I can tell, the parser stores the value of the ingress.appscode.com/service-monitor-labels annotation in Labels, but the monitoring API agent reads ServiceMonitor.Labels instead: https://github.com/voyagermesh/voyager/blob/v12.0.0/vendor/kmodules.xyz/monitoring-agent-api/agents/coreosprometheusoperator/lib.go#L87

Looks like there might be an API mismatch here.
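
To illustrate the suspected mismatch, here is a minimal Go sketch. The types and function names below (prometheusSpec, serviceMonitorSpec, unsafeLabels, safeLabels) are hypothetical stand-ins, not the real kmodules.xyz/monitoring-agent-api types, and the guard shown is only one possible way to avoid the panic, not the upstream fix.

```go
package main

import (
	"errors"
	"fmt"
)

// serviceMonitorSpec is a hypothetical stand-in for the nested struct the agent reads.
type serviceMonitorSpec struct {
	Labels map[string]string
}

// prometheusSpec is a hypothetical stand-in for the parsed monitoring config.
type prometheusSpec struct {
	// Labels is where the (assumed) annotation parser stores its result.
	Labels map[string]string
	// ServiceMonitor is what the (assumed) agent reads; the parser never sets it.
	ServiceMonitor *serviceMonitorSpec
}

// unsafeLabels mirrors the suspected failure: it dereferences
// ServiceMonitor without checking for nil.
func unsafeLabels(p *prometheusSpec) map[string]string {
	return p.ServiceMonitor.Labels
}

// safeLabels checks for nil first and falls back to the top-level Labels.
func safeLabels(p *prometheusSpec) (map[string]string, error) {
	if p == nil {
		return nil, errors.New("prometheus spec is nil")
	}
	if p.ServiceMonitor != nil {
		return p.ServiceMonitor.Labels, nil
	}
	if p.Labels != nil {
		return p.Labels, nil
	}
	return nil, errors.New("no service monitor labels configured")
}

// panics reports whether calling f results in a panic.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	// Only the top-level Labels field is populated, as the parser is suspected to do.
	spec := &prometheusSpec{Labels: map[string]string{"app": "voyager"}}

	fmt.Println("unsafe access panics:", panics(func() { unsafeLabels(spec) }))

	labels, err := safeLabels(spec)
	fmt.Println("safe access:", labels, err)
}
```

If the real code follows this shape, a nil ServiceMonitor pointer would produce exactly the "invalid memory address or nil pointer dereference" panic from the title.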

crash.txt

samispurs commented 3 years ago

I get the same panic with version 12.0.0 when using ingress.appscode.com/monitoring-agent: 'prometheus.io/builtin'

crash.txt

Is this fixed in the newer version that requires an enterprise license?

sFrenkie commented 2 years ago

@dgradl-fl, @samispurs I've hit a similar issue. I noticed that I had forgotten to fill in all the required keys in the annotations. After filling in all the required keys, everything works. See the docs: https://voyagermesh.com/docs/v12.0.0/guides/ingress/monitoring/using-coreos-prometheus-operator/

dgradl-fl commented 2 years ago

Are you sure about that? I just set it up with:

    ingress.appscode.com/monitoring-agent: prometheus.io/coreos-operator
    ingress.appscode.com/service-monitor-labels: '{"app": "voyager"}'
    ingress.appscode.com/service-monitor-namespace: mynamespace
    ingress.appscode.com/stats: "true"

And the ingress continues to work. It even adds the exporter sidecar to the pod. But if you look at the operator logs, you will see the original error I posted. And it never creates the "ServiceMonitor" custom resource that Prometheus uses to set up metrics scraping.
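
For anyone who wants to rule out missing annotations before digging into the operator logs, a small presence check can be sketched in Go. This is only an illustration: the list of keys is an assumption taken from the four annotations quoted in this thread, not the authoritative required set from the Voyager docs, and missingAnnotations is a hypothetical helper, not part of Voyager.

```go
package main

import (
	"fmt"
	"sort"
)

// requiredMonitoringAnnotations is an assumed list: just the four keys
// quoted in this thread, not an authoritative set from the Voyager docs.
var requiredMonitoringAnnotations = []string{
	"ingress.appscode.com/monitoring-agent",
	"ingress.appscode.com/service-monitor-labels",
	"ingress.appscode.com/service-monitor-namespace",
	"ingress.appscode.com/stats",
}

// missingAnnotations returns the assumed-required keys that are absent
// or empty in the given annotation map, sorted alphabetically.
func missingAnnotations(annotations map[string]string) []string {
	var missing []string
	for _, key := range requiredMonitoringAnnotations {
		if v, ok := annotations[key]; !ok || v == "" {
			missing = append(missing, key)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// The annotation set from the comment above.
	annotations := map[string]string{
		"ingress.appscode.com/monitoring-agent":          "prometheus.io/coreos-operator",
		"ingress.appscode.com/service-monitor-labels":    `{"app": "voyager"}`,
		"ingress.appscode.com/service-monitor-namespace": "mynamespace",
		"ingress.appscode.com/stats":                     "true",
	}
	fmt.Println("missing:", missingAnnotations(annotations))
}
```

A check like this only confirms that the keys are present. As described above, the panic can still occur with all four annotations set.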

sFrenkie commented 2 years ago

Not 100%

My Voyager operator being down caused the issue. (It was spinning in a restart loop with delays and was unable to process any new requests.) When I fixed the annotations, I was not able to apply the change because the operator was down at that time. So I created more replicas of the operator and restarted the whole operator deployment just before applying the fix.

tamalsaha commented 2 years ago

@sFrenkie, what version of the operator are you using?

> I was not able to apply the change because the operator was down at that time.

The latest version fixes issues like this by running the validator and operator as separate containers.

sFrenkie commented 2 years ago

@tamalsaha We use version 12. The HAProxy pods have a sidecar container with the exporter.

tamalsaha commented 2 years ago

Can you try with v14.0.0? This version uses HAProxy's built-in exporter.