VictoriaMetrics / operator

Kubernetes operator for Victoria Metrics
Apache License 2.0

Can I support the persistent storage volumeClaimTemplate of vmstorage as an array type #507

Closed amao-code closed 1 year ago

amao-code commented 2 years ago

I use vmBackup to back up my data. The destination is local file system storage, fs:///vm-backup/, so I added volumeMounts. But I don't know how to add the matching volume, because vmstorage.storage.volumeClaimTemplate does not support an array type.

spec:
    retentionPeriod: "14"
    replicationFactor: 2
    vmstorage:
      image:
        tag: v1.79.0-cluster
      replicaCount: 2
      storageDataPath: "/vm-data"
      extraArgs:
        dedup.minScrapeInterval: 15s
        #rpc.disableCompression: true
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 10Gi
      resources:
        {}
        # limits:
        #   cpu: "1"
      vmBackup:
        acceptEULA: true
        destination: fs:///vm-backup/
        disableMonthly: true
        disableWeekly: true
        extraArgs:
          keepLastDaily: "30"
          keepLastHourly: "72"
          runOnStart: "true"
        volumeMounts:
        - mountPath: /vm-backup
          name: vmbackup-dir

Because I use the local-path StorageClass, I can only create the PVC manually and then reference it as a volume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    app.kubernetes.io/component: monitoring
    app.kubernetes.io/instance: monitor-victoria-metrics-k8s-stack
    app.kubernetes.io/name: vmbackup
  name: vmbackup
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: local-path

kubectl get storageclasses.storage.k8s.io
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  20d

The new values:

vmcluster:
  enabled: true
  annotations: {}
  # spec for VMCluster crd
  # https://github.com/VictoriaMetrics/operator/blob/master/docs/api.MD#vmclusterspec
  spec:
    retentionPeriod: "14"
    replicationFactor: 2
    vmstorage:
      image:
        tag: v1.79.0-cluster
      replicaCount: 2
      storageDataPath: "/vm-data"
      extraArgs:
        dedup.minScrapeInterval: 15s
        #rpc.disableCompression: true
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 10Gi
      resources:
        {}
        # limits:
        #   cpu: "1"
      volumes:
      - name: vmbackup-dir
        persistentVolumeClaim:
          claimName: vmbackup
      vmBackup:
        acceptEULA: true
        destination: fs:///vm-backup/
        disableMonthly: true
        disableWeekly: true
        extraArgs:
          keepLastDaily: "30"
          keepLastHourly: "72"
          runOnStart: "true"
        volumeMounts:
        - mountPath: /vm-backup
          name: vmbackup-dir

The second vmstorage pod is always Pending because it cannot use a volume that lives on another node:

kubectl  -n monitoring  get pods
NAME                                                          READY   STATUS    RESTARTS   AGE
...
vmstorage-monitor-victoria-metrics-k8s-stack-0                2/2     Running   1          98m
vmstorage-monitor-victoria-metrics-k8s-stack-1                0/2     Pending   0          80m
kubectl  -n monitoring  describe pods vmstorage-monitor-victoria-metrics-k8s-stack-1
....
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  80m   default-scheduler  0/3 nodes are available: 3 node(s) had volume node affinity conflict.
  Warning  FailedScheduling  80m   default-scheduler  0/3 nodes are available: 3 node(s) had volume node affinity conflict.
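
One way to read the "volume node affinity conflict" message: a node-local volume can only be mounted on the node that holds it, so a pod is schedulable only on a node that every one of its bound volumes accepts. The Go sketch below illustrates that rule. The node names and the placement of each volume are assumptions made for illustration (the thread does not say which node holds which volume), and feasibleNodes is a hypothetical helper, not scheduler code.

```go
package main

import "fmt"

// feasibleNodes returns the nodes on which a pod could run, given the node
// that each of its already-bound local volumes lives on. Every volume must
// agree on the node, otherwise no node is feasible.
func feasibleNodes(nodes []string, volumeNodes []string) []string {
	var out []string
	for _, n := range nodes {
		ok := true
		for _, vn := range volumeNodes {
			if vn != n {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}

	// Assumed placement: the shared vmbackup PVC is bound on node-a.
	// Replica 0 has its data volume on node-a too, so node-a works.
	fmt.Println(feasibleNodes(nodes, []string{"node-a", "node-a"}))

	// Replica 1 is assumed to have its data volume on node-b while still
	// needing the vmbackup PVC on node-a: no node satisfies both.
	fmt.Println(feasibleNodes(nodes, []string{"node-b", "node-a"}))
}
```

A per-replica claim template avoids this situation because each pod gets its own backup volume instead of sharing one node-bound PVC.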
f41gh7 commented 2 years ago

I think a new setting, volumeClaimTemplates, could be introduced for the vmstorage and vmselect components. It should solve the issue with locally mounted folders for a StatefulSet.

f41gh7 commented 2 years ago

This was added in v0.27.0. Feel free to reopen if it is not working.

amao-code commented 2 years ago

I installed the latest version and ran into the following problem. For the claimTemplates entry in my Helm values, metadata.name always ends up empty in the VMCluster resource, regardless of whether I set it. My Helm values:

vmcluster:
  enabled: true
  annotations: {}
  spec:
    retentionPeriod: "1"
    replicationFactor: 2
    vmstorage:
      image:
        tag: v1.81.1-cluster
      replicaCount: 2
      storageDataPath: "/vm-data"
      extraArgs:
        dedup.minScrapeInterval: 30s
      claimTemplates:
      - metadata:
          name: vmbackup-dir
        spec:
          resources:
            requests:
              storage: 10Gi
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 10Gi   

or

vmcluster:
  enabled: true
  annotations: {}
  spec:
    retentionPeriod: "1"
    replicationFactor: 2
    vmstorage:
      image:
        tag: v1.81.1-cluster
      replicaCount: 2
      storageDataPath: "/vm-data"
      extraArgs:
        dedup.minScrapeInterval: 30s
      claimTemplates:
      - spec:
          resources:
            requests:
              storage: 10Gi        
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 10Gi  

The metadata field of claimTemplates in the VMCluster resource installed with Helm is always {}:

helm --kubeconfig /tmp/k8s-kubeconfig upgrade --install monitor -f vm-k8s-values.yaml victoria-metrics-k8s-stack/ -n monitoring
kubectl  -n monitoring  get vmclusters.operator.victoriametrics.com -oyaml
...
    vmstorage:
      claimTemplates:
      - metadata: {}
        spec:
          resources:
            requests:
              storage: 10Gi
        status: {}
      extraArgs:
        dedup.minScrapeInterval: 30s
      image:
        tag: v1.81.1-cluster
      replicaCount: 2
      resources:
        limits:
          cpu: "2"
          memory: 2000Mi
        requests:
          cpu: 500m
          memory: 500Mi
      storage:
        volumeClaimTemplate:
          metadata: {}
          spec:
            resources:
              requests:
                storage: 10Gi
          status: {}
      storageDataPath: /vm-data
...

As a result, the second entry of volumeClaimTemplates in the vmstorage StatefulSet has no metadata.name:

kubectl  -n monitoring  get sts vmstorage-monitor-victoria-metrics-k8s-stack -o yaml
...
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: vmstorage-db
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      volumeMode: Filesystem
    status:
      phase: Pending
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
    spec:
      resources:
        requests:
          storage: 10Gi
      volumeMode: Filesystem
    status:
      phase: Pending
...

vmstorage then fails to create the PVC:

kubectl  -n monitoring  describe sts vmstorage-monitor-victoria-metrics-k8s-stack
...
Events:
  Type     Reason        Age                   From                    Message
  ----     ------        ----                  ----                    -------
  Warning  FailedCreate  5m39s (x19 over 27m)  statefulset-controller  create Claim -vmstorage-monitor-victoria-metrics-k8s-stack-0 for Pod vmstorage-monitor-victoria-metrics-k8s-stack-0 in StatefulSet vmstorage-monitor-victoria-metrics-k8s-stack failed error: PersistentVolumeClaim "-vmstorage-monitor-victoria-metrics-k8s-stack-0" is invalid: [metadata.name: Invalid value: "-vmstorage-monitor-victoria-metrics-k8s-stack-0": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), spec.accessModes: Required value: at least 1 access mode is required]
  Warning  FailedCreate  5m39s (x19 over 27m)  statefulset-controller  create Pod vmstorage-monitor-victoria-metrics-k8s-stack-0 in StatefulSet vmstorage-monitor-victoria-metrics-k8s-stack failed error: failed to create PVC -vmstorage-monitor-victoria-metrics-k8s-stack-0: PersistentVolumeClaim "-vmstorage-monitor-victoria-metrics-k8s-stack-0" is invalid: [metadata.name: Invalid value: "-vmstorage-monitor-victoria-metrics-k8s-stack-0": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), spec.accessModes: Required value: at least 1 access mode is required]
...
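
The invalid name in these events follows from how the claim name is assembled: the template name is joined to the pod name with "-", so an empty template name leaves a leading "-". The Go sketch below reproduces that with the validation regex quoted in the error message (anchored with ^ and $). claimName is a hypothetical helper that mirrors the names seen in this thread; it is not the controller's own code.

```go
package main

import (
	"fmt"
	"regexp"
)

// rfc1123Subdomain is the regex quoted in the error message, anchored so
// that the whole name must match.
var rfc1123Subdomain = regexp.MustCompile(
	`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`)

// claimName joins a volume claim template name and a pod name with "-",
// matching the PVC names that appear in the events above.
func claimName(templateName, podName string) string {
	return templateName + "-" + podName
}

func main() {
	pod := "vmstorage-monitor-victoria-metrics-k8s-stack-0"

	// The last entry is the template whose metadata.name was dropped.
	for _, tmpl := range []string{"vmstorage-db", "vmbackup-dir", ""} {
		name := claimName(tmpl, pod)
		fmt.Printf("%q valid=%v\n", name, rfc1123Subdomain.MatchString(name))
	}
}
```

Only the entry with the empty template name fails the check, which matches the metadata.name part of the error. The spec.accessModes part is a separate problem: the additional template does not set accessModes at all.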

My chart, Kubernetes, and CRD versions:

## chart version
NAME                NAMESPACE   REVISION    UPDATED                                 STATUS      CHART                               APP VERSION
monitor             monitoring  1           2022-09-09 14:22:05.626494 +0800 CST    deployed    victoria-metrics-k8s-stack-0.12.1   1.81.1
## kubernetes version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.10", GitCommit:"a7a32748b5c60445c4c7ee904caf01b91f2dbb71", GitTreeState:"clean", BuildDate:"2022-02-16T11:18:16Z", GoVersion:"go1.16.14", Compiler:"gc", Platform:"linux/amd64"}
## CRDs
kubectl  explain vmcluster.spec.vmstorage.claimTemplates
KIND:     VMCluster
VERSION:  operator.victoriametrics.com/v1beta1

RESOURCE: claimTemplates <[]Object>

DESCRIPTION:
     ClaimTemplates allows adding additional VolumeClaimTemplates for
     StatefulSet

     PersistentVolumeClaim is a user's request for and claim to a persistent
     volume

FIELDS:
   apiVersion   <string>
     APIVersion defines the versioned schema of this representation of an
     object. Servers should convert recognized schemas to the latest internal
     value, and may reject unrecognized values. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources

   kind <string>
     Kind is a string value representing the REST resource this object
     represents. Servers may infer this from the endpoint the client submits
     requests to. Cannot be updated. In CamelCase. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds

   metadata <map[string]>
     Standard object's metadata. More info:
     https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata

   spec <Object>
     Spec defines the desired characteristics of a volume requested by a pod
     author. More info:
     https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims

   status   <Object>
     Status represents the current information/status of a persistent volume
     claim. Read-only. More info:
     https://kubernetes.io/docs/concepts/storage/persistent-volumes#persistentvolumeclaims
f41gh7 commented 2 years ago

Partly fixed in v0.28.5. claimTemplates is now supported, but the operator does not update the StatefulSet when something changes in these templates, and it does not resize PVCs out of the box.

f41gh7 commented 1 year ago

There is a function, wasCreatedSTS, where persistent volume changes are tracked: needRecreateOnStorageChange.

https://github.com/VictoriaMetrics/operator/blob/master/controllers/factory/k8stools/expansion.go#L86

It would be great to add a check for whether anything changed in any of the Spec.VolumeClaimTemplates.

Simple marshaling and a byte comparison should probably be enough. It would also be great to add tests for this case. cc @Amper @Haleygo
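
A rough sketch of the marshal-and-compare idea, in Go with only the standard library. claimTemplate is a hypothetical, trimmed-down stand-in for the real PersistentVolumeClaim type, and templatesChanged is an illustrative name, not a function from the operator.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// claimTemplate is a hypothetical, simplified stand-in for a
// PersistentVolumeClaim template; it exists only for this illustration.
type claimTemplate struct {
	Name             string `json:"name"`
	StorageClassName string `json:"storageClassName,omitempty"`
	StorageRequest   string `json:"storageRequest"`
}

// templatesChanged reports whether two template lists differ by marshaling
// both to JSON and comparing the resulting bytes.
func templatesChanged(existing, desired []claimTemplate) (bool, error) {
	a, err := json.Marshal(existing)
	if err != nil {
		return false, err
	}
	b, err := json.Marshal(desired)
	if err != nil {
		return false, err
	}
	return !bytes.Equal(a, b), nil
}

func main() {
	existing := []claimTemplate{
		{Name: "vmstorage-db", StorageRequest: "10Gi"},
		{Name: "vmbackup-dir", StorageRequest: "10Gi"},
	}
	desired := []claimTemplate{
		{Name: "vmstorage-db", StorageRequest: "10Gi"},
		{Name: "vmbackup-dir", StorageRequest: "20Gi"},
	}

	// Comparing only the number of templates misses this change.
	fmt.Println("length differs:", len(existing) != len(desired))

	changed, err := templatesChanged(existing, desired)
	if err != nil {
		panic(err)
	}
	fmt.Println("content differs:", changed)
}
```

One caveat with this approach: a nil slice marshals to null while an empty slice marshals to [], so the two sides should be normalized the same way before comparing.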

Haleygo commented 1 year ago

It would be great to add a check for whether anything changed in any of the Spec.VolumeClaimTemplates.

Okay, I see there is a length check now; I will add the full check. https://github.com/VictoriaMetrics/operator/blob/622cf84b164584525289cc5c1f180511852c73cb/controllers/factory/k8stools/expansion.go#L118-L122

Amper commented 1 year ago

Done in the v0.36.0 release.