carlosedp / cluster-monitoring

Cluster monitoring stack for clusters based on Prometheus Operator

Prometheus PVC configuration not created with 'make' #159

Open · RovoMe opened this issue 2 years ago

RovoMe commented 2 years ago

I've configured PVs for both Prometheus and Grafana:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-prometheus
  labels:
    type: local
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/media/config/prometheus/storage/"
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-grafana
  labels:
    type: local
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/media/config/grafana/storage"

where /media/config is an NFS-mounted directory. On running make vendor followed by make, I noticed that during that process a manifests/grafana-storage.yaml is created with the following content:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-storage
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: ""
  volumeName: pv-grafana

I see that after a make deploy the PV for Grafana is in state Bound, since the generated claim pins it through volumeName together with the empty storageClassName:

Name:            pv-grafana
Labels:          type=local
Annotations:     pv.kubernetes.io/bound-by-controller: yes
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    
Status:          Bound
Claim:           monitoring/grafana-storage
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        20Gi
Node Affinity:   <none>
Message:         
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /media/config/grafana/storage
    HostPathType:  
Events:            <none>

However, no claim is created for the Prometheus PV, and since the configuration in vars.jsonnet is set to

{
  ...
  // Persistent volume configuration
  enablePersistence: {
    // Setting these to false, defaults to emptyDirs.
    prometheus: true,
    grafana: true,
    // If using a pre-created PV, fill in the names below. If blank, they will use the default StorageClass
    prometheusPV: 'pv-prometheus',
    grafanaPV: 'pv-grafana',
    // If required to use a specific storageClass, keep the PV names above blank and fill the storageClass name below.
    storageClass: '',
    // Define the PV sizes below
    prometheusSizePV: '2Gi',
    grafanaSizePV: '20Gi',
  },
  ...
}

the prometheus-k8s-0 pod hangs waiting for its persistent volume claim.
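
For context, the operator runs Prometheus as the StatefulSet prometheus-k8s, and a StatefulSet derives each pod's claim name as <volumeClaimTemplate name>-<pod name>. The data volume template follows the operator's prometheus-<name>-db naming convention, so the relevant fragment presumably looks roughly like this (reconstructed from those conventions, not copied from my cluster):

volumeClaimTemplates:
  - metadata:
      # claim for pod prometheus-k8s-0 becomes prometheus-k8s-db-prometheus-k8s-0
      name: prometheus-k8s-db
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 2Gi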

Looking at the description of the prometheus-k8s-0 pod it was more or less obvious how the claim should be named, so I created a prometheus-storage.yaml file in manifests with the content below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-k8s-db-prometheus-k8s-0
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: ""
  volumeName: pv-prometheus

After provisioning the claim, the pod now works as intended. I'm not sure why this storage configuration isn't created automatically the way the Grafana one is. Once the file is added to the manifests directory, it is also provisioned automatically by make deploy.
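
My guess is that the prometheusPV name from vars.jsonnet is meant to end up as volumeName inside the storage section of the Prometheus resource rather than in a standalone PVC manifest. This is a sketch of what I would have expected the generated Prometheus resource to contain (the file name, e.g. manifests/prometheus-prometheus.yaml, and the exact field values are assumptions on my part):

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 2Gi
        storageClassName: ""
        volumeName: pv-prometheus

With volumeName set there, the operator-managed claim should bind to the pre-created pv-prometheus without a hand-written PVC.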

Troubleshooting

  1. Which kind of Kubernetes cluster are you using? (Kubernetes, K3s, etc)
    $ k3s --version
    k3s version v1.23.6+k3s1 (418c3fa8)
    go version go1.17.5
  2. Are all pods in "Running" state? If any is in CrashLoopBackOff or Error, check its logs. prometheus-k8s-0 hangs waiting for a PVC to claim the storage it needs.
  3. Does your cluster already work with other applications that serve HTTP/HTTPS? If not, first deploy an example NGINX and test access to it through the created URL. Grafana was accessible, though it showed no data because Prometheus was not able to initialize properly.
  4. If you enabled persistence, does your cluster already provide persistent storage (PVs) to other applications? Yes, the configuration is pasted above.
  5. Does it provide dynamic storage through a StorageClass? No.
  6. If you deployed the monitoring stack and some targets are not available or show no metrics in Grafana, make sure you don't have iptables rules or a firewall on your nodes before deploying Kubernetes. No active UFW for now, and NFS is also working fine.

Customizations

  1. Did you customize vars.jsonnet? Put the contents below: The relevant portion was already posted above.
  2. Did you change any other file? Put the contents below: I had to add the prometheus-storage.yaml file with the above-mentioned content to get it working.