prometheus-community / helm-charts

Prometheus community Helm charts
Apache License 2.0

[kube-prometheus-stack] Grafana with TLS/ingress doesn't pick up custom Prometheus datasource #3100

Closed ceelias closed 1 year ago

ceelias commented 1 year ago

Describe the bug

While attempting to deploy Grafana with TLS/Ingress and updated values for the Prometheus datasource, I am seeing inconsistent results, and most of the time no datasource is configured at all.

What's your helm version?

version.BuildInfo{Version:"v3.11.0", GitCommit:"472c5736ab01133de504a826bd9ee12cbe4e7904", GitTreeState:"clean", GoVersion:"go1.19.5"}

What's your kubectl version?

Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean" , BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"darwin/arm64"} Kustomize Version: v4.5.7 Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean" , BuildDate:"2022-11-09T13:29:58Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"linux/amd64"}

Which chart?

kube-prometheus-stack

What's the chart version?

43.3.1

What happened?

I'm attempting to deploy Grafana with TLS enabled (grafana.ini updated with protocol: https) and ingress, along with loading a Prometheus datasource. The deployment itself succeeds and Grafana works as expected. The issue is that when I navigate to the data sources in the Grafana UI, I see one of three things:

  1. No data source appears at all
  2. The default data source appears without the configuration I need
  3. The datasource I actually need is configured properly as I would expect

I am creating a separate configmap that I then attach the label grafana_datasource=1 to. I know this works fine when I am not attempting to update the grafana.ini with ingress or TLS enabled so I'm unsure why it is failing to work now. As I mentioned, it will occasionally/randomly work when I deploy which makes the issue a bit more odd for me.

In another attempt to solve my use case, I have updated the default datasource to use the appropriate url I need using https://, but I would also like the ability to toggle tlsSkipVerify which does not appear to be a supported configuration for the default data source.

I see no errors in the logs related to these changes, or anything else that would indicate an issue to me. I am able to manually add the datasource through the UI, and it works without issue.

I am able to confirm that my datasource configmap gets created with the correct label, but I also see another configmap that appears to be created by default, v4m-grafana-datasource, which carries the correct label as well but is completely empty. I wonder if this configmap is somehow getting picked up first and is the one causing the issue.

What you expected to happen?

I expect Grafana to get deployed with TLS and Ingress enabled with my Prometheus datasource configured.

How to reproduce it?

Deploy Grafana with TLS/ingress configured and a custom Prometheus datasource

Enter the changed values of values.yaml?

grafana:
  ingress:
    annotations:
      kubernetes.io/ingress.class: nginx
      nginx.ingress.kubernetes.io/backend-protocol: HTTPS
    enabled: true
    tls:
    - hosts:
      - ingress-nginx.webb-m1.opsmonitor.sashq-d.openstack.sas.com
      secretName: grafana-ingress-tls-secret
    hosts:
    - ingress-nginx.webb-m1.opsmonitor.sashq-d.openstack.sas.com
    path: /grafana
  testFramework:
    enabled: false
  readinessProbe:
    httpGet:
      scheme: HTTPS
      port: 3000
  livenessProbe:
    httpGet:
      scheme: HTTPS
      port: 3000
  extraSecretMounts:
  - name: grafana-tls
    mountPath: /cert
    secretName: grafana-tls-secret
    readOnly: true
    subPath: ""
  service:
    port: 3000
    targetPort: 3000
    type: ClusterIP
    nodePort: null
  sidecar:
    datasources:
      defaultDatasourceEnabled: false
  "grafana.ini":
    server:
      protocol: https
      cert_file: /cert/tls.crt
      cert_key: /cert/tls.key
      domain: ingress-nginx.webb-m1.opsmonitor.sashq-d.openstack.sas.com
      root_url: https://ingress-nginx.webb-m1.opsmonitor.sashq-d.openstack.sas.com/grafana
      serve_from_sub_path: true

Datasource configmap (grafana-datasource-prom-https.yaml):

    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      isDefault: true
      jsonData:
        tlsSkipVerify: true
      editable: true
      url: https://v4m-prometheus:9090/prometheus

Enter the command that you execute and failing/misfunctioning.

Configmap is created:

  kubectl create cm -n monitoring grafana-datasource-prom-https --from-file monitoring/tls/grafana-datasource-prom-https.yaml
  kubectl label cm -n monitoring grafana-datasource-prom-https grafana_datasource=1
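For reference, the two commands above should be roughly equivalent to applying a single manifest like the sketch below. The names, namespace, label, and datasource content are taken from this issue. The data key is assumed to be the file's base name, which is what kubectl uses for --from-file by default.

```yaml
# Declarative sketch of the configmap created and labelled by the commands above.
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasource-prom-https
  namespace: monitoring
  labels:
    # Label the datasource sidecar watches for; the value must be a string
    grafana_datasource: "1"
data:
  # Key assumed to match the base name of the file passed to --from-file
  grafana-datasource-prom-https.yaml: |
    apiVersion: 1
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy
      isDefault: true
      jsonData:
        tlsSkipVerify: true
      editable: true
      url: https://v4m-prometheus:9090/prometheus
```

Keeping the label in the manifest means the configmap never exists without it, which removes one possible ordering difference between deployments.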

No failure in logs or returned to me on deploy

Anything else we need to know?

Willing to provide any other information that would make the issue more clear. Thanks for looking!

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

Stealthmate commented 1 year ago

I think this might be related to https://github.com/grafana/grafana/issues/12878 . I am doing a similar setup and grafana isn't loading the default datasource, even though /etc/grafana/provisioning/datasources/datasource.yaml and grafana.ini point to the correct paths. Maybe the reason it sometimes works for OP is a race condition (as pointed out in the grafana issue)?

EDIT: I just tried forcing grafana to restart via kubectl exec -n kube-prometheus-stack -it deployment/kube-prometheus-stack-grafana -c grafana -- /bin/sh -c 'kill 1' and now the default datasource shows up. More evidence in support of https://github.com/grafana/grafana/issues/12878 I guess.

zeritti commented 1 year ago

With respect to datasources that are defined in configmaps, these get collected by the datasource sidecar (grafana.sidecar.datasources.enabled), which is implemented with k8s-sidecar. If its requests to Grafana's reload endpoint fail, Grafana does not know of the new datasources until it gets redeployed and finds them via the provisioning directory. This might be the reason why a datasource from a configmap only occasionally shows up.
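One way this could apply to the values in this issue: Grafana is switched to HTTPS there, so a sidecar that still calls the reload endpoint over plain HTTP would fail. The fragment below is a sketch only. It assumes the bundled Grafana chart version exposes grafana.sidecar.datasources.reloadURL, and that the reload endpoint is reachable without the /grafana sub-path prefix. Neither assumption is confirmed in this thread.

```yaml
grafana:
  sidecar:
    datasources:
      defaultDatasourceEnabled: false
      # Assumption: this key exists in the bundled Grafana chart version.
      # The scheme is changed to https to match grafana.ini (protocol: https).
      # Assumption: no /grafana prefix is needed despite serve_from_sub_path.
      reloadURL: "https://localhost:3000/api/admin/provisioning/datasources/reload"
```

The certificate mounted at /cert is issued for the ingress host name, so a request to localhost may still fail certificate verification. k8s-sidecar documents a REQ_SKIP_TLS_VERIFY environment variable for this. Whether and how the chart lets you set it depends on the chart version.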

The default Prometheus datasource enabled with grafana.sidecar.datasources.defaultDatasourceEnabled is also collected by the sidecar, as are the datasources defined in grafana.additionalDataSources.
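As a sketch of that option, the datasource from this issue could be declared in the chart values instead of a hand-made configmap. This would also cover the tlsSkipVerify toggle that the default datasource does not expose. The key is spelled additionalDataSources in the kube-prometheus-stack values.yaml as far as I can tell. The datasource fields are copied from the configmap above. Because these entries are still delivered through the sidecar, they depend on the same reload mechanism.

```yaml
grafana:
  sidecar:
    datasources:
      # Disable the chart's built-in Prometheus datasource to avoid two defaults
      defaultDatasourceEnabled: false
  additionalDataSources:
  - name: Prometheus
    type: prometheus
    access: proxy
    isDefault: true
    editable: true
    url: https://v4m-prometheus:9090/prometheus
    jsonData:
      tlsSkipVerify: true
```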

Another means to define custom datasources is through grafana.datasources (Ref.). In this case, a change to the datasources initiates a redeployment on release upgrade.
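A minimal sketch of this approach, reusing the datasource definition from this issue. The datasources.yaml key follows the convention shown in the Grafana chart's values. The values are not verified against chart version 43.3.1.

```yaml
grafana:
  sidecar:
    datasources:
      # Disable the chart's built-in Prometheus datasource to avoid two defaults
      defaultDatasourceEnabled: false
  datasources:
    # Rendered by the Grafana chart into its provisioning directory
    datasources.yaml:
      apiVersion: 1
      datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        isDefault: true
        editable: true
        url: https://v4m-prometheus:9090/prometheus
        jsonData:
          tlsSkipVerify: true
```

Since this route does not go through the sidecar, Grafana reads the datasource from the provisioning directory at startup and does not depend on a successful call to the reload endpoint.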

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale[bot] commented 1 year ago

This issue is being automatically closed due to inactivity.