nginxinc / nginx-service-mesh

A service mesh powered by NGINX Plus to manage container traffic in Kubernetes environments.
https://docs.nginx.com/nginx-service-mesh
Apache License 2.0

nginx-meshctl top does not work #78

Open darkn3rd opened 1 year ago

darkn3rd commented 1 year ago

STEPS

  1. Run through the observability tutorial at https://docs.nginx.com/nginx-service-mesh/tutorials/observability/
  2. Deploy some applications that are integrated into the mesh
  3. Run nginx-meshctl top

I used helmfile to deploy what was in the docs, except that I used manual injection and enabled strict mTLS mode.

cat << EOF > helmfile.yaml
repositories:
  # https://artifacthub.io/packages/helm/nginx/nginx-service-mesh
  - name: nginx-stable
    url: https://helm.nginx.com/stable

releases:
  - name: nsm
    namespace: nginx-mesh
    chart: nginx-stable/nginx-service-mesh
    values:
      - prometheusAddress: prometheus.nsm-monitoring.svc:9090
        telemetry:
          exporters:
            otlp:
              host: otel-collector.nsm-monitoring.svc
              port: 4317
          samplerRatio: 1
        tracing: null
        # 'allow' or 'deny'
        accessControlMode: {{ env "NSM_ACCESS_CONTROL_MODE" | default "allow" }}
        mtls:
          # 'strict' or 'permissive'
          mode: {{ env "NSM_MTLS_MODE" | default "strict" }}
        autoInjection:
          {{- if eq (env "NSM_AUTO_INJECTION") "true" }}
          disable: false
          disabledNamespaces:
            - nsm-monitoring
          {{- else }}
          disable: true
          {{- end }}
EOF

helmfile apply

Expected Results

nginx-meshctl top displays traffic statistics for the meshed applications.

Actual Results

Cannot build traffic statistics.
Error: no metrics populated - make sure traffic is flowing

sjberman commented 1 year ago

Was traffic consistently being sent through your applications? It can take a bit of time for the statistics to be accumulated, and consistent traffic flow is required to get those statistics.
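
As a rough illustration of "consistent traffic", the sketch below sends a steady series of GET requests to one service. The function name send_requests, the target URL, and the one-second interval are all assumptions made for illustration; nothing here comes from the NGINX Service Mesh docs. With strict mTLS, the requests would most likely need to originate from a pod that is itself part of the mesh.

```python
import time
import urllib.error
import urllib.request


def send_requests(url, count, interval_s=1.0, opener=urllib.request.urlopen):
    """Send `count` GET requests to `url`, pausing `interval_s` seconds between them.

    Returns the list of HTTP status codes; None marks a request that failed
    before any HTTP response was received (connection refused, timeout, ...).
    """
    statuses = []
    for i in range(count):
        try:
            with opener(url, timeout=5) as resp:
                statuses.append(resp.status)
        except urllib.error.HTTPError as err:
            # 4xx/5xx responses still count as traffic through the sidecar.
            statuses.append(err.code)
        except (urllib.error.URLError, OSError):
            statuses.append(None)
        if i < count - 1:
            time.sleep(interval_s)
    return statuses
```

For example, send_requests("http://my-app.default.svc", 300) would keep traffic flowing for about five minutes; the hostname is a placeholder for one of your own meshed services.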

darkn3rd commented 1 year ago

I tried in different circumstances, and I get nothing. I am not sure how this is supposed to work (as it isn't open source), so I am not sure what to look for in this case. I went through the installing observability tutorial (doc https://docs.nginx.com/nginx-service-mesh/tutorials/observability/) as well.

I also tried walking through the access control tutorial that instructs using this tool (doc https://docs.nginx.com/nginx-service-mesh/tutorials/accesscontrol-walkthrough/), and couldn't get nginx-meshctl top to work.

sjberman commented 1 year ago

Do you see nginxplus_upstream_server_responses and nginxplus_upstream_server_response_latency_ms_bucket metrics in Prometheus?

The top command basically queries the nginx-mesh-metrics server, which queries and formats these metrics from Prometheus. We have found the results to be finicky in the past.
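
One way to check for those two metrics is to ask the Prometheus HTTP API directly. The sketch below is not how nginx-mesh-metrics does it; it only counts the series for each metric using the standard /api/v1/query endpoint. It assumes Prometheus is reachable on http://localhost:9090, for example through kubectl -n nsm-monitoring port-forward svc/prometheus 9090:9090 (service name and namespace inferred from the prometheusAddress value in the helmfile above). The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# The two metrics named in the comment above.
METRICS = (
    "nginxplus_upstream_server_responses",
    "nginxplus_upstream_server_response_latency_ms_bucket",
)


def query_url(base_url, metric):
    """Build an instant-query URL that counts the series for one metric."""
    params = urllib.parse.urlencode({"query": f"count({metric})"})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def series_count(response):
    """Return the series count from a parsed Prometheus instant-query response."""
    if response.get("status") != "success":
        return 0
    result = response["data"]["result"]
    # count() over a metric with no series returns an empty result vector.
    return int(float(result[0]["value"][1])) if result else 0


def check(base_url="http://localhost:9090"):
    """Print how many series Prometheus holds for each metric (assumed address)."""
    for metric in METRICS:
        with urllib.request.urlopen(query_url(base_url, metric), timeout=10) as resp:
            count = series_count(json.load(resp))
        print(f"{metric}: {count} series")
```

Calling check() with the port-forward running prints one line per metric. A count of 0 for either metric would suggest that Prometheus is not scraping the sidecars, in which case top has nothing to build statistics from.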

darkn3rd commented 1 year ago

I would have to run the test again to see whether those metrics show up in Prometheus.

f5-todd commented 1 year ago

@darkn3rd FYI: since you created this issue, we have open-sourced our CLI tool, nginx-meshctl. That doesn't solve your issue, but I thought you might like to look at the code since you asked about it.