sorintlab / stolon


Add flag to avoid lock on data dir #814

Open alessandro-sorint opened 3 years ago

alessandro-sorint commented 3 years ago

What would you like to be added: We would like to add a flag to avoid taking the lock on the data dir, because the F_SETLK syscall doesn't work.
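For reference, here is a minimal Go sketch (not stolon's actual code; the path and names are illustrative) of the kind of exclusive F_SETLK lock the keeper takes on its lock file. Running something like this inside the pod against the mounted volume shows whether the filesystem supports POSIX locks; on an NFS mount without lock support the fcntl call fails, matching the error further below.

// lockcheck.go: illustrative only, not stolon's implementation.
package main

import (
	"fmt"
	"os"
	"syscall"
)

func lockFile(path string) (*os.File, error) {
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0600)
	if err != nil {
		return nil, err
	}
	lk := syscall.Flock_t{
		Type:   syscall.F_WRLCK, // exclusive (write) lock
		Whence: 0,               // SEEK_SET: offset relative to start of file
		Start:  0,
		Len:    0, // 0 means "lock the whole file"
	}
	// F_SETLK fails immediately (instead of blocking) if the lock cannot be taken.
	if err := syscall.FcntlFlock(f.Fd(), syscall.F_SETLK, &lk); err != nil {
		f.Close()
		return nil, fmt.Errorf("cannot take exclusive lock on %q: %v", path, err)
	}
	// The file must stay open for as long as the lock is needed.
	return f, nil
}

func main() {
	lockf, err := lockFile("/stolon-data/lock")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer lockf.Close()
	fmt.Println("lock acquired")
}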

Why is this needed: We have tried to deploy a pod with the stolon keeper defined as follows:

# apiVersion: apps/v1alpha1
# kind: PetSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: stolon-keeper
  namespace: default
spec:
  serviceName: "stolon-keeper"
  replicas: 3
  selector:
    matchLabels:
      component: stolon-keeper
      stolon-cluster: stolon-cluster-default
  template:
    metadata:
      labels:
        component: stolon-keeper
        stolon-cluster: stolon-cluster-default
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: stolon-keeper
          image: sorintlab/stolon:v0.16.0-pg12
          command:
            - "/bin/bash"
            - "-ec"
            - |
              # Generate our keeper uid using the pod index
              IFS='-' read -ra ADDR <<< "$(hostname)"
              export STKEEPER_UID="keeper${ADDR[-1]}"
              export POD_IP=$(hostname -i)
              export STKEEPER_PG_LISTEN_ADDRESS=$POD_IP
              export STOLON_DATA=/stolon-data
              chown stolon:stolon $STOLON_DATA
              exec gosu stolon stolon-keeper --data-dir $STOLON_DATA
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: STKEEPER_CLUSTER_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.labels['stolon-cluster']
            - name: STKEEPER_STORE_BACKEND
              value: "kubernetes"
            - name: STKEEPER_KUBE_RESOURCE_KIND
              value: "configmap"
            - name: STKEEPER_PG_REPL_USERNAME
              value: "repluser"
              # Or use a password file like for the superuser password below
            - name: STKEEPER_PG_REPL_PASSWORD
              value: "replpassword"
            - name: STKEEPER_PG_SU_USERNAME
              value: "stolon"
            - name: STKEEPER_PG_SU_PASSWORDFILE
              value: "/etc/secrets/stolon/password"
            - name: STKEEPER_METRICS_LISTEN_ADDRESS
              value: "0.0.0.0:8080"
            # Uncomment this to enable debug logs
            #- name: STKEEPER_DEBUG
            #  value: "true"
          ports:
            - containerPort: 5432
            - containerPort: 8080
          volumeMounts:
            - mountPath: /stolon-data
              name: stolon-persistent-storage
            - mountPath: /etc/secrets/stolon
              name: stolon
      volumes:
        - name: stolon
          secret:
            secretName: stolon
  # Define your own volumeClaimTemplate. This example uses dynamic PV provisioning with a storage class named "standard" (so it will work by default with minikube)
  # In production you should use your own defined storage-class and configure your persistent volumes (statically or dynamically using a provisioner, see related k8s doc).
  volumeClaimTemplates:
  - metadata:
      name: stolon-persistent-storage
    spec:
      storageClassName: managed-nfs-storage
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 50Gi

We got these errors in the logs:

$ kubectl logs stolon-keeper-2 
2020-12-24T14:36:39.201Z        WARN    cmd/keeper.go:182       password file permissions are too open. This file should only be readable to the user executing stolon! Continuing...    {"file": "/etc/secrets/stolon/password", "mode": "01000000777"}
2020-12-24T14:36:39.210Z        FATAL   cmd/keeper.go:2036      cannot take exclusive lock on data dir "/stolon-data/lock": input/output error
alessandro-sorint commented 3 years ago

We are working on an implementation: https://github.com/sorintlab/stolon/pull/813

sgotti commented 3 years ago

@alessandro-sorint you should investigate why your NFS server/client doesn't support locking. NFSv4 should have it enabled by default.

We could add a flag to disable locking, but it should be marked as dangerous in its description, since it's only a workaround for underlying storage issues and it'll cause data corruption if two keepers are running concurrently with the same data dir.
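A minimal sketch of how such an opt-out flag could look, with the danger spelled out in its description (the flag name and wording here are assumptions for illustration, not what any PR actually adds):

package main

import (
	"flag"
	"fmt"
)

func main() {
	// Hypothetical flag name; the description carries the "dangerous" warning.
	disableLocking := flag.Bool("disable-data-dir-locking", false,
		"DANGEROUS: skip the exclusive lock on the data dir. Only a workaround for "+
			"storage that doesn't support locking; two keepers running concurrently "+
			"on the same data dir will corrupt data.")
	flag.Parse()

	if *disableLocking {
		fmt.Println("data dir locking disabled")
		return
	}
	// Otherwise take the exclusive F_SETLK lock as usual.
	fmt.Println("taking exclusive lock on the data dir")
}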

alessandro-sorint commented 3 years ago

Thanks @sgotti, in our configuration we use a different volume for every keeper, so I think it's not a problem to remove the file locking.

sgotti commented 3 years ago

Thanks @sgotti, in our configuration we use a different volume for every keeper, so I think it's not a problem to remove the file locking.

There's always the possibility that two keepers will run on the same data dir for multiple reasons (wrong configuration, user error, etc.). The real solution is to fix the filesystem locking issues, but if that's not possible I'm ok with adding an option, with a big warning as explained above.

alessandro-sorint commented 3 years ago

We did it! https://github.com/sorintlab/stolon/pull/817

Is the warning message clear enough? Should we also add a warning log? Thanks
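For the warning log, a minimal sketch of what the keeper could emit at startup when locking is disabled (the log output above looks zap-based, so zap is assumed here; the variable name and wording are illustrative):

package main

import "go.uber.org/zap"

func main() {
	logger, _ := zap.NewProduction()
	defer logger.Sync()
	log := logger.Sugar()

	disableDataDirLocking := true // would come from the command-line flag
	if disableDataDirLocking {
		// Emit a loud warning at startup so the risk is visible in the logs.
		log.Warnw("data dir locking is disabled, this is unsafe",
			"detail", "two keepers running concurrently on the same data dir will cause data corruption")
	}
}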

hezise commented 3 years ago

I had the same error when one of the nodes was disconnected and reconnected.