Closed: ofiryy closed this issue 3 years ago.
There is a storage setting for Prometheus and Alertmanager:

```yaml
storage:
  volumeClaimTemplate:
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: sc-mirror
      resources:
        requests:
          storage: 300Mi
```

I think we should have the same for Grafana?
This is not a bug, as data persistence is not enabled by default. You can either claim a PersistentVolume in your custom values.yaml file as @survivant suggested, or export your dashboards as JSON definition files and create a ConfigMap with the JSON-formatted data for each custom dashboard. That way, modifications made inside Grafana still do not persist across releases of the stack via helm, but your exported dashboards get redeployed with everything else.
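For the ConfigMap route, the chart's Grafana dashboard sidecar (enabled by default in recent chart versions) picks up any ConfigMap carrying the `grafana_dashboard` label and provisions its JSON content as a dashboard. A minimal sketch; the ConfigMap name, namespace, and dashboard JSON are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-custom-dashboard     # illustrative name
  namespace: monitoring         # the namespace the stack is installed in
  labels:
    grafana_dashboard: "1"      # label the sidecar watches by default
data:
  my-dashboard.json: |
    {
      "title": "My Custom Dashboard",
      "panels": []
    }
```

Because the dashboard lives in a ConfigMap, it survives pod restarts and is redeployed by helm along with everything else.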
@ofiryy I updated values.yaml, adding:

```yaml
grafana:
  persistence:
    enabled: true
```

to fix the Grafana persistence problem.
@blademainer but we still can't choose our storage class
@survivant The prometheus-community/kube-prometheus-stack chart uses the grafana/grafana chart as a dependency, so any values you can pass to grafana/grafana you can pass under the `grafana` key in this chart. Or am I misunderstanding the issue being raised?
This works for me:

```yaml
grafana:
  enabled: true
  persistence:
    enabled: true
    type: pvc
    storageClassName: default
    accessModes:
      - ReadWriteOnce
    size: 4Gi
    finalizers:
      - kubernetes.io/pvc-protection
```
@BertelBB thank you. I don't know what I did wrong the first time, but it works fine now. Next I need to find a workaround for https://github.com/prometheus-community/helm-charts/issues/437
I'm not sure the workflow of expecting all the Grafana settings to get zapped the next time a pod stops has the best interests of the enterprise in mind. I get the argument for exporting the dashboards as JSON and storing them in ConfigMaps to make them deployment agnostic, but there are other settings not related to dashboards that we don't want to disappear when a pod crashes either, such as user login information and alerting settings. So, unless there is a best practice for storing all of that in ConfigMaps as well (with a good UI for doing so that doesn't require kubectl and a Kubernetes admin), it seems shortsighted to think that Grafana can live in an enterprise environment as an application that doesn't require persistence. The opposite seems true.
I too am wringing out the kinks of my Prometheus install and ran into this exact same problem of grafana not supporting persistence out of the box. It was rather alarming to learn that after I began building out dashboards, I lost that work when I tested out the failover scenarios of the pod going down. I did not see a persistence piece in the grafana part of the values.yaml and didn't know that this would turn grafana into an app with a temporary persistence layer.
In hindsight, I should have run my pod failover test before beginning to "persist" data in Grafana, which would have revealed this annoying default. I do wish the helm chart could be updated to have a section under grafana that defines the persistence layer, even if it's commented out.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
This issue is being automatically closed due to inactivity.
For anyone who's looking: kube-prometheus-stack takes its Grafana settings from the values of the grafana/grafana dependency chart, so persistence is configured under the `grafana` key. This should probably be included in the docs.
@BertelBB I have used your code snippet, but I'm facing an issue:
```
Warning  FailedScheduling  6s (x5 over 83s)  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
```

```
$ kubectl get pods -n prometheus
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          19h
prometheus-grafana-5d9946dff9-4ffgc                      0/2     Pending   0          2m10s
prometheus-grafana-669fbc79f9-dmmhk                      2/2     Running   0          3m58s
prometheus-kube-prometheus-operator-85ccf48856-q8n68     1/1     Running   0          12m
prometheus-kube-state-metrics-6dc7f98565-twkxk           1/1     Running   0          12m
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   1          19h
prometheus-prometheus-node-exporter-bgqtb                1/1     Running   0          19h
```
I wonder how I can fix it.
@darox The issue is that your PVC is already bound to the pod prometheus-grafana-669fbc79f9-dmmhk, so the new Grafana pod cannot claim the PV and therefore fails to start.

A quick fix would be to delete the ReplicaSet for the older Grafana pod, i.e. `kubectl delete rs prometheus-grafana-669fbc79f9 -n prometheus`.
A permanent fix would be to make sure that two Grafana pods cannot be running at the same time, so your rolling update strategy should ensure that when a Grafana upgrade is in progress, the scheduler first kills the old pod before starting the new one. I'm no expert in update strategies, but I think this should work.

EDIT: The previous strategy was wrong; this one works.

```yaml
grafana:
  deploymentStrategy:
    type: Recreate
```

This strategy ensures the old Grafana pod is terminated before a new one starts, which results in a short downtime for Grafana during upgrades.
I have applied your recommendations:

```
$ kubectl get pods -n prometheus
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-prometheus-kube-prometheus-alertmanager-0   2/2     Running   0          18s
prometheus-grafana-6fb7f46b9c-5ph99                      0/2     Pending   0          22s
prometheus-kube-prometheus-operator-548f79bb9-hskjx      1/1     Running   0          22s
prometheus-kube-state-metrics-5b8f9bdbbd-tr8vq           1/1     Running   0          22s
prometheus-prometheus-kube-prometheus-prometheus-0       2/2     Running   1          18s
prometheus-prometheus-node-exporter-k9nzm                1/1     Running   0          22s
```

```
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  24s (x6 over 111s)  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
```
@darox Is the prometheus-grafana (default name) PVC marked as Bound, and if so, which pod is it being used by?

```
kubectl get pvc -n prometheus
kubectl describe pvc -n prometheus prometheus-grafana   # replace name if needed
```

Do you in fact have a default StorageClass?

```
kubectl get sc
```
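For reference, a StorageClass becomes the cluster default via an annotation; if `kubectl get sc` shows no class marked `(default)`, a PVC that requests `storageClassName: default` (when no class by that name exists) or omits the class entirely will stay Pending. A sketch of the annotation; the class name here matches the values above and the provisioner is only an example:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # marks this class as the default
provisioner: kubernetes.io/no-provisioner   # example only; use your cluster's provisioner
```

With this annotation in place, `kubectl get sc` lists the class with `(default)` next to its name.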
It worked with:

```yaml
grafana:
  deploymentStrategy:
    type: Recreate
  persistence:
    enabled: true
    type: pvc
    storageClassName: hostpath
    accessModes:
      - ReadWriteOnce
    size: 4Gi
    finalizers:
      - kubernetes.io/pvc-protection
```

```
$ kubectl get pvc -n prometheus
NAME                                                                                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
prometheus-grafana                                                                                       Bound    pvc-d7ec8849-db92-4ec7-a465-f7ff67e414cb   4Gi        RWO            hostpath       41s
prometheus-prometheus-kube-prometheus-prometheus-db-prometheus-prometheus-kube-prometheus-prometheus-0   Bound    pvc-4009c793-9d44-4a7b-ab4b-13af00c513ad   5Gi        RWO            hostpath       4d2h
```
Thanks a lot for your support :)
For some reason it doesn't work for me; I've got these values:

```yaml
## helm upgrade --install prometheus prometheus-community/kube-prometheus-stack --values values.yml
kube-state-metrics:
  image:
    repository: k8s.gcr.io/kube-state-metrics-arm64
    tag: v1.9.5
prometheus:
  prometheusSpec:
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
grafana:
  adminPassword: xxx
  deploymentStrategy:
    type: Recreate
  enabled: true
  persistance:
    enabled: true
    type: pvc
    storageClassName: default
    accessModes:
      - ReadWriteOnce
    size: 4Gi
    finalizers:
      - kubernetes.io/pvc-protection
  grafana.ini:
    server:
      domain: xxx
      root_url: xxx
    auth.google:
      enabled: true
      client_id: xxx
      client_secret: xxx
      scopes: https://www.googleapis.com/auth/userinfo.profile https://www.googleapis.com/auth/userinfo.email
      auth_url: https://accounts.google.com/o/oauth2/auth
      token_url: https://accounts.google.com/o/oauth2/token
      allowed_domains: gmail.com
      allow_sign_up: false
    paths:
      data: /var/lib/grafana/data
      logs: /var/log/grafana
      plugins: /var/lib/grafana/plugins
      provisioning: /etc/grafana/provisioning
    analytics:
      check_for_updates: true
    log:
      mode: console
    grafana_net:
      url: https://grafana.net
```
and after doing an upgrade no PVCs are created. I also tried just this for Grafana and still no luck:

```yaml
grafana:
  adminPassword: xxx
  enabled: true
  persistance:
    enabled: true
```

```
➜ prometheus git:(master) ✗ kubectl get pvc
NAME                         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-postgres-postgresql-0   Bound    pvc-e099d418-73d4-49ed-8232-e829e418c6b4   8Gi        RWO            nfs-client     453d
docker-registry              Bound    pvc-0e726761-fdfd-454d-86a6-36002c37ac3b   30Gi       RWO            nfs-client     149d
streaming-pvc-streaming-0    Bound    pvc-cb97df2f-7e99-47c8-80e0-c215381ee672   20Gi       RWO            nfs-client     135d
streaming-pvc-streaming-1    Bound    pvc-2240cdb5-7ab4-4aee-99c5-45696e4100bb   20Gi       RWO            nfs-client     135d
streaming-pvc-streaming-2    Bound    pvc-7f65bb1f-97ec-4890-9b29-6ba36f470cfe   20Gi       RWO            nfs-client     135d
```
Can anyone help me with the dashboard location? I added the values.yaml above for persistence and the volume is bound, but when I restart the pod, the dashboards don't come back.
Hi @AwateAkshay, did you solve your problem? I am having the same issue: I can see my dashboards when I get into the Grafana container, but they are not present in Grafana itself.
@UrosCvijan exec into the Grafana pod and you will see a grafana.db file, which is a SQLite DB. Inside it you can see your dashboards.
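To illustrate what's inside that file: grafana.db is plain SQLite, and its `dashboard` table holds one row per saved dashboard. The script below builds a toy stand-in database locally so you can see the shape of the query (the real schema has many more columns). Inside the pod you could run something like `sqlite3 /var/lib/grafana/grafana.db 'SELECT title FROM dashboard;'`, assuming the image ships sqlite3, which it may not; copy the file out with `kubectl cp` otherwise.

```shell
# Build a toy stand-in for grafana.db and list dashboard titles from it.
# Schema is trimmed to the columns relevant here; real grafana.db has more.
rm -f /tmp/grafana-demo.db
python3 - <<'EOF'
import sqlite3

db = sqlite3.connect("/tmp/grafana-demo.db")
db.execute("CREATE TABLE dashboard (id INTEGER PRIMARY KEY, title TEXT, data TEXT)")
db.execute("INSERT INTO dashboard (title, data) VALUES ('Node Exporter', '{}')")
db.commit()
for (title,) in db.execute("SELECT title FROM dashboard"):
    print(title)   # prints: Node Exporter
EOF
```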
@kamilgregorczyk not "persistance" but "persistence":

```yaml
grafana:
  adminPassword: xxx
  enabled: true
  persistence:
    enabled: true
```
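Worth noting why this mistake is so easy to miss: helm does not warn about unknown value keys, so a misspelled `persistance:` block is silently ignored and persistence simply stays off. A quick self-contained check (the file path and contents are illustrative):

```shell
# Helm silently ignores unknown keys, so a misspelled block just does nothing.
# Grep your values file for the common misspelling before upgrading:
cat > /tmp/values-demo.yml <<'EOF'
grafana:
  persistance:
    enabled: true
EOF
grep -n 'persistance' /tmp/values-demo.yml && echo "typo found: should be 'persistence'"
# prints: 2:  persistance:
```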
Thanks for posting your code; it helped me figure out how to add environment variables in kube-prometheus-stack. Now I know the syntax.
I tried the above methods and the PV was created, but the pod failed to start because the chownData initContainer kept failing even after multiple retries. Following issue 752, I set initChownData to false.

Now the Grafana pod runs and I am able to access the dashboard, but the Grafana pod's logs show error="database is locked".
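The "database is locked" error is consistent with two Grafana pods (or a lingering writer) touching the same SQLite file on the shared volume. A hedged values sketch combining the fixes mentioned in this thread; `initChownData.enabled` is a grafana/grafana chart value, and `Recreate` prevents old and new pods from overlapping:

```yaml
grafana:
  deploymentStrategy:
    type: Recreate    # never run two pods against the same grafana.db
  persistence:
    enabled: true
    type: pvc
    size: 4Gi
  initChownData:
    enabled: false    # workaround from issue 752 when the chown init container fails
```

Whether this resolves the lock depends on the storage backend; SQLite locking is known to be unreliable on some network filesystems, so a local or block-backed StorageClass is the safer choice for grafana.db.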
Describe the bug: I installed the prometheus-community/kube-prometheus-stack chart and then defined panels and alerts in Grafana. When I delete the Grafana pod, all the data is deleted from Grafana; there is no persistence. I wanted to use this solution: https://github.com/prometheus-operator/prometheus-operator/issues/2558#issuecomment-565119967, but to my surprise no PV or PVC was created by the kube-prometheus-stack chart.

How can I make my Grafana persistent?
Version of Helm and Kubernetes:
Helm Version:

```
$ helm version
version.BuildInfo{Version:"v3.0.3", GitCommit:"ac925eb7279f4a6955df663a0128044a8a6b7593", GitTreeState:"clean", GoVersion:"go1.13.6"}
```

Kubernetes Version:

```
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.0", GitCommit:"641856db18352033a0d96dbc99153fa3b27298e5", GitTreeState:"clean", BuildDate:"2019-03-25T15:53:57Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.13-eks-2ba888", GitCommit:"2ba888155c7f8093a1bc06e3336333fbdb27b3da", GitTreeState:"clean", BuildDate:"2020-07-17T18:48:53Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
```
Which chart: kube-prometheus-stack

Which version of the chart: 12.3.0

How to reproduce it (as minimally and precisely as possible): install kube-prometheus-stack, define a panel in Grafana, then delete the Grafana pod.