samba-in-kubernetes / samba-operator

An operator for Samba as a service on PVCs in Kubernetes
Apache License 2.0

NT_STATUS_LOGON_FAILURE when rollout restart a Samba Statefulset with CTDB feature #262

Open FTS152 opened 1 year ago

FTS152 commented 1 year ago

Hi, I am trying to deploy a Samba service with CTDB support on a bare-metal k3s cluster, here is my environment:

NAME        STATUS   ROLES                       AGE   VERSION
k8s-test2   Ready    control-plane,etcd,master   20d   v1.24.4+k3s1
k8s-test3   Ready    control-plane,etcd,master   20d   v1.24.4+k3s1
k8s-test4   Ready    control-plane,etcd,master   20d   v1.24.4+k3s1

I use rook-ceph as my backend storage and metalLB as my load balancer:

NAME                   PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path           Delete          WaitForFirstConsumer   false                  20d
rook-cephfs            rook-ceph.cephfs.csi.ceph.com   Delete          Immediate              true                   20d

NAMESPACE               NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
samba-operator-system   smbshare1-pvc     Bound    pvc-738b3752-952d-47b5-a25a-8504ab36c744   100Gi      RWX            rook-cephfs    5d22h
samba-operator-system   smbshare1-state   Bound    pvc-649423f5-60cf-4505-baf1-2a5bdc5476ae   10Gi       RWX            rook-cephfs    10m

I deploy a Samba statefulset with minClusterSize=3 and so far so good:

NAME                                                 READY   STATUS    RESTARTS        AGE
samba-operator-controller-manager-677c5f7c47-6m4hv   2/2     Running   9 (4d22h ago)   20d
smbshare1-0                                          4/4     Running   1 (13m ago)     13m
smbshare1-1                                          4/4     Running   0               13m
smbshare1-2                                          4/4     Running   1 (12m ago)     12m

NAME                                                TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)         AGE
samba-operator-controller-manager-metrics-service   ClusterIP      10.43.12.21     <none>         8443/TCP        20d
smbshare1                                           LoadBalancer   10.43.27.1      10.20.91.127   445:31108/TCP   14m
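
For reference, the SmbShare resource behind this deployment looks roughly like the following. This is a sketch, not a dump from my cluster, and the spec field names are assumed from the operator's documentation:

kubectl apply -f - <<EOF
apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbShare
metadata:
  name: smbshare1
  namespace: samba-operator-system
spec:
  storage:
    pvc:
      name: smbshare1-pvc        # assumed: reuses the existing RWX PVC shown above
  scaling:
    availabilityMode: clustered  # assumed field names; enables the CTDB feature
    minClusterSize: 3
EOF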

However, when I restart Samba with kubectl rollout restart without making any changes, I can no longer log in to the Samba server after the rolling update is done.
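
The restart was along these lines (StatefulSet name and namespace as above):

kubectl rollout restart statefulset/smbshare1 -n samba-operator-system

After the rolling update completes, the login attempt fails: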

root@k8s-test4:~/samba-service# smbclient '//10.20.91.127/smbshare1' -U smbuser1 -p 445
Enter WORKGROUP\smbuser1's password:
session setup failed: NT_STATUS_LOGON_FAILURE

What I found is that account_policy.tdb, group_mapping.tdb, and registry.tdb disappeared from /var/lib/samba in the pod, and there are error messages in the samba container's log file:

2022-10-11 03:47:44,961: INFO: Enabling ctdb in samba config file
Failed to initialize the registry: WERR_FILE_NOT_FOUND
Can't load /etc/samba/smb.conf - run testparm to debug it
smbd - Failed to load config file!
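
A quick way to confirm the missing files in place, as a sketch (this assumes the smbd container is named samba, per the StatefulSet later in this thread):

kubectl exec -n samba-operator-system smbshare1-0 -c samba -- ls -l /var/lib/samba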

The Samba share information is also lost from the registry in the pod. Any ideas? Thanks.

phlogistonjohn commented 1 year ago

Thanks for reporting this! I personally don't have any first-hand experience with kubectl rollout restart, and it's not part of our testing suite, so I suspect it's a large part of the cause of the problem. The approach it takes to restarting the pods may be triggering an issue with the order of operations used to bring up the pods, or perhaps there's an additional health check we need to prevent it from acting upon pods.

That's all speculation so far. First question: can we assume things were working OK prior to running the restart command? Next, the persistent tdb files should be stored on a PV. Can you help us confirm that the pods are mounting a persistent volume for the relevant paths under /var/lib? A full kubectl get -o yaml of the pods should suffice there.

Finally, it would be helpful to see some of the logs from the init containers, from both before and after kubectl rollout restart is run.
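
Concretely, something along these lines (pod, container, and namespace names taken from your report):

kubectl get pod smbshare1-0 -o yaml -n samba-operator-system
kubectl logs smbshare1-0 -c init -n samba-operator-system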

Thanks!

FTS152 commented 1 year ago

Thanks for the reply! I'm still researching what causes the tdb files to disappear after the rollout restart. It seems this problem only happens when restarting a Samba server with the CTDB feature enabled; maybe the difference in how rollout restart behaves for a Deployment versus a StatefulSet causes it. The Samba server works fine before kubectl rollout restart: one can connect to the server and read/write data from/to the share normally:

root@fts152-QV99:~# smbclient '//10.20.91.127/smbshare1' -U smbuser1 -p 445
Enter WORKGROUP\smbuser1's password:
Try "help" to get a list of possible commands.
smb: \> ls
  .                                   D        0  Tue Oct 11 16:03:32 2022
  ..                                  D        0  Thu Oct 13 10:07:11 2022

        104857600 blocks of size 1024. 104857600 blocks available
smb: \> mkdir test
smb: \> cd test
smb: \test\> put /root/myfile myfile
putting file /root/myfile as \test\myfile (30684.3 kb/s) (average 30684.3 kb/s)
smb: \test\> ls
  .                                   D        0  Thu Oct 13 10:19:15 2022
  ..                                  D        0  Thu Oct 13 10:19:02 2022
  myfile                              A 1073741824  Thu Oct 13 10:19:49 2022

        104857600 blocks of size 1024. 104857600 blocks available
smb: \test\>

I checked the samba container with kubectl describe and found that /var/lib/samba is mounted from samba-state-dir, but samba-state-dir is not a PV (it is an EmptyDir, as shown below). Could this be the problem?

   samba:
    Image:      quay.io/samba.org/samba-server:latest
    Port:       445/TCP
    Host Port:  0/TCP
    Command:
      samba-container
    Args:
      run
      smbd
      --setup=users
      --setup=smb_ctdb
    Liveness:   tcp-socket :445 delay=0s timeout=1s period=10s #success=1 #failure=3
    Readiness:  tcp-socket :445 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      SAMBA_CONTAINER_ID:   smbshare1
      SAMBACC_CONFIG:       /etc/container-config/config.json:/etc/container-users/users.json
      SAMBA_POD_NAME:        (v1:metadata.name)
      SAMBA_POD_NAMESPACE:   (v1:metadata.namespace)
    Mounts:
(...)
      /var/lib/samba from samba-state-dir (rw)
      /var/run/ctdb from ctdb-sockets (rw)
(...)
  Volumes:
   smbshare1-pvc-smb:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  smbshare1-pvc
    ReadOnly:   false
   samba-container-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      smbshare1
    Optional:  false
   samba-state-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>

The full YAML for the Samba StatefulSet is as follows:

root@k8s-test4:~/rook# kubectl get statefulset smbshare1 -o yaml -n samba-operator-system
apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2022-10-13T02:04:07Z"
  generation: 1
  labels:
    app: samba
    app.kubernetes.io/component: smbd
    app.kubernetes.io/instance: samba-smbshare1
    app.kubernetes.io/managed-by: samba-operator
    app.kubernetes.io/name: samba
    app.kubernetes.io/part-of: samba
    samba-operator.samba.org/service: smbshare1
  name: smbshare1
  namespace: samba-operator-system
  ownerReferences:
  - apiVersion: samba-operator.samba.org/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: SmbShare
    name: smbshare1
    uid: 94a42780-a8f8-4ae1-b9b1-debac6bd5520
  resourceVersion: "11256600"
  uid: f3d02b0c-3cfa-47d6-bb19-cd2ad946f251
spec:
  podManagementPolicy: OrderedReady
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: samba
      app.kubernetes.io/component: smbd
      app.kubernetes.io/instance: samba-smbshare1
      app.kubernetes.io/managed-by: samba-operator
      app.kubernetes.io/name: samba
      app.kubernetes.io/part-of: samba
      samba-operator.samba.org/service: smbshare1
  serviceName: ""
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: samba
        kubectl.kubernetes.io/default-logs-container: samba
      creationTimestamp: null
      labels:
        app: samba
        app.kubernetes.io/component: smbd
        app.kubernetes.io/instance: samba-smbshare1
        app.kubernetes.io/managed-by: samba-operator
        app.kubernetes.io/name: samba
        app.kubernetes.io/part-of: samba
        samba-operator.samba.org/service: smbshare1
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: samba-operator.samba.org/service
                operator: In
                values:
                - smbshare1
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - run
        - ctdbd
        - --setup=smb_ctdb
        - --setup=ctdb_config
        - --setup=ctdb_etc
        - --setup=ctdb_nodes
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBACC_CTDB
          value: ctdb-is-experimental
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: ctdb
        readinessProbe:
          exec:
            command:
            - samba-container
            - check
            - ctdb-nodestatus
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /etc/ctdb
          name: ctdb-config
        - mountPath: /var/lib/ctdb/persistent
          name: ctdb-persistent
        - mountPath: /var/lib/ctdb/volatile
          name: ctdb-volatile
        - mountPath: /var/run/ctdb
          name: ctdb-sockets
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
      - args:
        - ctdb-manage-nodes
        - --hostname=$(HOSTNAME)
        - --take-node-number-from-hostname=after-last-dash
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBACC_CTDB
          value: ctdb-is-experimental
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: ctdb-manage-nodes
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /etc/ctdb
          name: ctdb-config
        - mountPath: /var/run/ctdb
          name: ctdb-sockets
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
      - args:
        - run
        - smbd
        - --setup=users
        - --setup=smb_ctdb
        command:
        - samba-container
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 445
          timeoutSeconds: 1
        name: samba
        ports:
        - containerPort: 445
          name: smb
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 445
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /mnt/94a42780-a8f8-4ae1-b9b1-debac6bd5520
          name: smbshare1-pvc-smb
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /etc/ctdb
          name: ctdb-config
        - mountPath: /var/lib/ctdb/persistent
          name: ctdb-persistent
        - mountPath: /var/lib/ctdb/volatile
          name: ctdb-volatile
        - mountPath: /var/run/ctdb
          name: ctdb-sockets
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
        - mountPath: /etc/container-users
          name: users-config
      - args:
        - update-config
        - --watch
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: Always
        name: watch-update-config
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /mnt/94a42780-a8f8-4ae1-b9b1-debac6bd5520
          name: smbshare1-pvc-smb
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /etc/ctdb
          name: ctdb-config
        - mountPath: /var/lib/ctdb/persistent
          name: ctdb-persistent
        - mountPath: /var/lib/ctdb/volatile
          name: ctdb-volatile
        - mountPath: /var/run/ctdb
          name: ctdb-sockets
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
        - mountPath: /etc/container-users
          name: users-config
      dnsPolicy: ClusterFirst
      initContainers:
      - args:
        - --skip-if-file=/var/lib/ctdb/shared/nodes
        - init
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: init
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
      - args:
        - ensure-share-paths
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: ensure-share-paths
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /mnt/94a42780-a8f8-4ae1-b9b1-debac6bd5520
          name: smbshare1-pvc-smb
      - args:
        - ctdb-migrate
        - --dest-dir=/var/lib/ctdb/persistent
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBACC_CTDB
          value: ctdb-is-experimental
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: ctdb-migrate
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
        - mountPath: /etc/ctdb
          name: ctdb-config
        - mountPath: /var/lib/ctdb/persistent
          name: ctdb-persistent
      - args:
        - ctdb-set-node
        - --hostname=$(HOSTNAME)
        - --take-node-number-from-hostname=after-last-dash
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBACC_CTDB
          value: ctdb-is-experimental
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: ctdb-set-node
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
        - mountPath: /etc/ctdb
          name: ctdb-config
      - args:
        - ctdb-must-have-node
        - --hostname=$(HOSTNAME)
        - --take-node-number-from-hostname=after-last-dash
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBACC_CTDB
          value: ctdb-is-experimental
        image: quay.io/samba.org/samba-server:latest
        imagePullPolicy: IfNotPresent
        name: ctdb-must-have-node
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /var/lib/ctdb/shared
          name: smbshare1-state-ctdb
        - mountPath: /etc/ctdb
          name: ctdb-config
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      shareProcessNamespace: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: smbshare1-pvc-smb
        persistentVolumeClaim:
          claimName: smbshare1-pvc
      - configMap:
          defaultMode: 420
          name: smbshare1
        name: samba-container-config
      - emptyDir: {}
        name: samba-state-dir
      - emptyDir:
          medium: Memory
        name: ctdb-config
      - emptyDir:
          medium: Memory
        name: ctdb-persistent
      - emptyDir:
          medium: Memory
        name: ctdb-volatile
      - emptyDir:
          medium: Memory
        name: ctdb-sockets
      - name: smbshare1-state-ctdb
        persistentVolumeClaim:
          claimName: smbshare1-state
      - name: users-config
        secret:
          defaultMode: 420
          items:
          - key: demousers
            path: users.json
          secretName: users1
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
status:
  availableReplicas: 3
  collisionCount: 0
  currentReplicas: 3
  currentRevision: smbshare1-689c69d6f4
  observedGeneration: 1
  readyReplicas: 3
  replicas: 3
  updateRevision: smbshare1-689c69d6f4
  updatedReplicas: 3

Logs of init containers before kubectl rollout restart:

root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c init -n samba-operator-system
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ctdb-migrate -n samba-operator-system
2022-10-13 03:42:48,590: INFO: Checking for /var/lib/samba/account_policy.tdb
2022-10-13 03:42:48,590: INFO: Converting /var/lib/samba/account_policy.tdb to /var/lib/ctdb/persistent/account_policy.tdb.0 ...
2022-10-13 03:42:48,592: INFO: Checking for /var/lib/samba/private/account_policy.tdb
2022-10-13 03:42:48,592: INFO: Checking for /var/lib/samba/group_mapping.tdb
2022-10-13 03:42:48,592: INFO: Converting /var/lib/samba/group_mapping.tdb to /var/lib/ctdb/persistent/group_mapping.tdb.0 ...
2022-10-13 03:42:48,593: INFO: Checking for /var/lib/samba/private/group_mapping.tdb
2022-10-13 03:42:48,593: INFO: Checking for /var/lib/samba/passdb.tdb
2022-10-13 03:42:48,593: INFO: Checking for /var/lib/samba/private/passdb.tdb
2022-10-13 03:42:48,593: INFO: Converting /var/lib/samba/private/passdb.tdb to /var/lib/ctdb/persistent/passdb.tdb.0 ...
2022-10-13 03:42:48,595: INFO: Checking for /var/lib/samba/registry.tdb
2022-10-13 03:42:48,595: INFO: Converting /var/lib/samba/registry.tdb to /var/lib/ctdb/persistent/registry.tdb.0 ...
2022-10-13 03:42:48,596: INFO: Checking for /var/lib/samba/private/registry.tdb
2022-10-13 03:42:48,597: INFO: Checking for /var/lib/samba/secrets.tdb
2022-10-13 03:42:48,597: INFO: Checking for /var/lib/samba/private/secrets.tdb
2022-10-13 03:42:48,597: INFO: Converting /var/lib/samba/private/secrets.tdb to /var/lib/ctdb/persistent/secrets.tdb.0 ...
2022-10-13 03:42:48,598: INFO: Checking for /var/lib/samba/share_info.td
2022-10-13 03:42:48,598: INFO: Checking for /var/lib/samba/private/share_info.td
2022-10-13 03:42:48,598: INFO: Checking for /var/lib/samba/winbindd_idmap.tdb
2022-10-13 03:42:48,598: INFO: Checking for /var/lib/samba/private/winbindd_idmap.tdb
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ctdb-set-node -n samba-operator-system
2022-10-13 03:42:49,644: INFO: Determined address for smbshare1-0: 10.42.1.173
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ctdb-must-have-node -n samba-operator-system
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ensure-share-paths -n samba-operator-system
2022-10-13 03:42:47,578: INFO: Ensuring share path: /mnt/cfa64779-58dc-4278-9883-2b6477a3f56a
2022-10-13 03:42:47,578: INFO: Updating permissions if needed: /mnt/cfa64779-58dc-4278-9883-2b6477a3f56a
2022-10-13 03:42:47,578: INFO: Using initializing posix permissions handler

Logs of init containers after kubectl rollout restart:

root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c init -n samba-operator-system
Action skipped: skip-if-file: /var/lib/ctdb/shared/nodes exists
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ctdb-migrate -n samba-operator-system
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/account_policy.tdb
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/private/account_policy.tdb
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/group_mapping.tdb
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/private/group_mapping.tdb
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/passdb.tdb
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/private/passdb.tdb
2022-10-13 03:57:42,866: INFO: Checking for /var/lib/samba/registry.tdb
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/private/registry.tdb
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/secrets.tdb
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/private/secrets.tdb
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/share_info.td
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/private/share_info.td
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/winbindd_idmap.tdb
2022-10-13 03:57:42,867: INFO: Checking for /var/lib/samba/private/winbindd_idmap.tdb
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ctdb-set-node -n samba-operator-system
2022-10-13 03:57:43,809: INFO: Determined address for smbshare1-0: 10.42.1.174
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ctdb-must-have-node -n samba-operator-system
2022-10-13 03:57:49,162: INFO: node not yet ready
root@k8s-test4:~/samba-service# kubectl logs smbshare1-0 -c ensure-share-paths -n samba-operator-system
2022-10-13 03:57:41,871: INFO: Ensuring share path: /mnt/cfa64779-58dc-4278-9883-2b6477a3f56a
2022-10-13 03:57:41,871: INFO: Updating permissions if needed: /mnt/cfa64779-58dc-4278-9883-2b6477a3f56a
2022-10-13 03:57:41,871: INFO: Using initializing posix permissions handler

FTS152 commented 1 year ago

UPDATE: It looks like the user information is mounted successfully in the pod:

[root@smbshare1-1 /]# cat /etc/container-users/users.json
{
  "samba-container-config": "v0",
  "users": {
    "all_entries": [
      {
        "name": "smbuser1",
        "password": "samba"
      },
      {
        "name": "smbuser2",
        "password": "samba"
      }
    ]
  }
}

And when I execute samba-container import and samba-container import-users in every pod after the rollout restart, I can log in to the Samba server normally. Is this expected behavior, or did I just misunderstand something?
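
For reference, the manual workaround as a sketch (pod and container names as above):

for pod in smbshare1-0 smbshare1-1 smbshare1-2; do
  kubectl exec -n samba-operator-system "$pod" -c samba -- samba-container import
  kubectl exec -n samba-operator-system "$pod" -c samba -- samba-container import-users
done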

phlogistonjohn commented 1 year ago

And when I execute samba-container import and samba-container import-users in every pod after the rollout restart, I can log in to the Samba server normally. Is this expected behavior, or did I just misunderstand something?

Very interesting! No, this is not intended behavior. I would consider this a bug.

I think what might be happening is that when all the pods restart, they do not re-import all of the configuration data: some data is expected to persist in the Samba db files, but other data, like the users/groups stored under /etc, is kept in system files, and those files are "reset" every time the container is started. The code may be failing to distinguish between configuration data that needs to be recreated in every container and configuration data that persists, and so it simply doesn't bother trying to recreate the former type of configuration data.
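
If that hypothesis holds, a quick check after a restart would be whether the users are missing from the flat files, e.g.:

kubectl exec -n samba-operator-system smbshare1-0 -c samba -- grep smbuser1 /etc/passwd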

In addition, when we test with AD enabled we might miss this, because in the AD case the users/groups data is not stored in the containers. One of my teammates, @anoopcs9, suggested that we should incorporate the rolling restart into one of our test cases. I agree. We should try to fix this issue and create an integration test to verify the fix!

phlogistonjohn commented 1 year ago

@anoopcs9 this issue may also lie more in sambacc, if we tightly couple the creation of /etc/passwd and /etc/group with the population of the samba passdb. We may need to make sure that the passdb can be populated independently of the /etc/{passwd,group} files, and maybe have an init container that creates the latter every time the pod is started? At the very least it's a good first place to look if we can reproduce the error.
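
A rough sketch of what such an init container could look like, mirroring the structure of the existing entries in the StatefulSet above (hypothetical, not an agreed design; note the absence of a --skip-if-file guard, so it would run on every pod start):

      initContainers:
      - name: import-users              # hypothetical container name
        image: quay.io/samba.org/samba-server:latest
        args:
        - import-users                  # recreate /etc/passwd and /etc/group each start
        env:
        - name: SAMBA_CONTAINER_ID
          value: smbshare1
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json:/etc/container-users/users.json
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /etc/container-users
          name: users-config
        - mountPath: /var/lib/samba
          name: samba-state-dir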

FTS152 commented 11 months ago

Hi, is there any news regarding this issue?

phlogistonjohn commented 11 months ago

Unfortunately, no. I don't think anyone on the team has had the time to work on the CTDB support in a while. It's still on our radar though!

FTS152 commented 5 months ago

@phlogistonjohn I think I figured out where the problem is. In the Samba StatefulSet, there is an init container that does some initialization work:

  initContainers:
  - args:
    - --skip-if-file=/var/lib/ctdb/shared/nodes
    - init

This init command includes importing the Samba config and user data into the registry. When performing a rollout restart, the nodes file is still present on the state PVC, so the init container "wrongly" concludes that the initialization process has already been done and skips it.
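
This can be cross-checked by confirming that the nodes file survives the restart (it also matches the "Action skipped: skip-if-file" line in the post-restart init logs above):

kubectl exec -n samba-operator-system smbshare1-0 -c samba -- cat /var/lib/ctdb/shared/nodes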

phlogistonjohn commented 5 months ago

Ohh, interesting. That could be it. Can you see if there are other unique states we could key off of to tell when the rollout is occurring vs. a typical bringup?

FTS152 commented 5 months ago

@phlogistonjohn To my understanding, there isn't a specific field or status condition in the k8s API that indicates whether a resource is undergoing a rollout restart. If that approach is not feasible, an alternative (but less convenient) way may be to manually add custom labels/annotations to the StatefulSet before the rollout.
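
For example, something along these lines before a planned restart (the annotation key here is purely illustrative):

# hypothetical marker annotation; the key is illustrative only
kubectl annotate statefulset/smbshare1 -n samba-operator-system \
  samba-operator.samba.org/restart-in-progress=true --overwrite
kubectl rollout restart statefulset/smbshare1 -n samba-operator-system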