samba-in-kubernetes / samba-operator

An operator for a Samba as a service on PVCs in kubernetes
Apache License 2.0

AD share is not able to fetch own SID #220

Open turricum opened 2 years ago

turricum commented 2 years ago

I installed the Samba Operator 0.2 on an OpenShift 4.8 bare-metal cluster. I created some AD shares.

1) The created share export pod starts.
2) In AD (Samba 4.12.2) the computer object is created.
3) The pod goes into CrashLoopBackOff because the wb container cannot start:

winbindd version 4.15.7 started.
Copyright Andrew Tridgell and the Samba Team 1992-2021
initialize_winbindd_cache: clearing cache and re-creating with version number 2
Could not fetch our SID - did we join?
unable to initialize domain list

yamls:

apiVersion: v1
kind: Secret
metadata:
  name: join1
  namespace: samba-shares
type: Opaque
stringData:
  join.json: |
    {"username": "samba-container-join", "password": ":-)"}
---
apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbSecurityConfig
metadata:
  name: addomain
  namespace: samba-shares
spec:
  mode: active-directory
  realm: ad.domain.com
  joinSources:
  - userJoin:
      secret: join1
      key: join.json
---
apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbCommonConfig
metadata:
  name: freigabe
  namespace: samba-shares
spec:
  network:
    publish: external
---
apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbShare
metadata:
  name: testshare
  namespace: samba-shares
spec:
  commonConfig: freigabe
  securityConfig: addomain
  readOnly: false
  storage:
    pvc:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi

samba-tool on the AD server shows that the entry was created:

# samba-tool computer show TESTSHARE 
dn: CN=TESTSHARE,OU=Containers,OU=Domain Computers,DC=ad,DC=domain,DC=com
objectClass: top
objectClass: person
objectClass: organizationalPerson
objectClass: user
objectClass: computer
cn: TESTSHARE
instanceType: 4
whenCreated: 20220615103058.0Z
uSNCreated: 144306
name: TESTSHARE
objectGUID: 3adabc17-a938-47fa-843c-1e864b86e19e
badPwdCount: 0
codePage: 0
countryCode: 0
badPasswordTime: 0
lastLogoff: 0
primaryGroupID: 515
objectSid: S-1-5-21-2358220382-4025805735-3930986455-1375
accountExpires: 9223372036854775807
sAMAccountName: TESTSHARE$
sAMAccountType: 805306369
servicePrincipalName: HOST/TESTSHARE.ad.domain.com
servicePrincipalName: RestrictedKrbHost/TESTSHARE.ad.domain.com
servicePrincipalName: HOST/TESTSHARE
servicePrincipalName: RestrictedKrbHost/TESTSHARE
objectCategory: CN=Computer,CN=Schema,CN=Configuration,DC=ad,DC=domain,DC=com
isCriticalSystemObject: FALSE
dNSHostName: testshare.ad.domain.com
lastLogonTimestamp: 132997626582395210
msDS-SupportedEncryptionTypes: 31
pwdLastSet: 132997630161230470
userAccountControl: 4096
lastLogon: 132997630162023640
logonCount: 6
whenChanged: 20220615104727.0Z
uSNChanged: 144314
distinguishedName: CN=TESTSHARE,OU=Containers,OU=Domain Computers,DC=ad,DC=domain,DC=com
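
As a side note, attributes such as pwdLastSet and lastLogonTimestamp in the output above are Windows FILETIME values, that is, counts of 100-nanosecond intervals since 1601-01-01 UTC. The following is a minimal Python sketch for converting them, which may help when checking whether the machine password was set around the time of the join. The helper name filetime_to_datetime is only illustrative and not part of any Samba tooling.

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME epoch: 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(value: int) -> datetime:
    """Convert a FILETIME (100 ns ticks since 1601-01-01 UTC) to a datetime."""
    # Ten 100 ns ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=value // 10)


if __name__ == "__main__":
    # Values copied from the samba-tool output above.
    for attr, raw in [
        ("lastLogonTimestamp", 132997626582395210),
        ("pwdLastSet", 132997630161230470),
    ]:
        print(attr, filetime_to_datetime(raw).isoformat())
```

For the values above, lastLogonTimestamp converts to 2022-06-15 10:30:58 UTC, which matches the whenCreated value 20220615103058.0Z.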

4) Debugging the pod / wb container:

# oc get pods
NAME                                   READY   STATUS             RESTARTS   AGE
testshare-testshare-5986c96565-92gx9   1/2     CrashLoopBackOff   12         41m
# oc logs testshare-5986c96565-92gx9 -c wb
winbindd version 4.15.7 started.
Copyright Andrew Tridgell and the Samba Team 1992-2021
initialize_winbindd_cache: clearing cache and re-creating with version number 2
Could not fetch our SID - did we join?
unable to initialize domain list
sh-5.1# samba-container 
[global]
    disable spoolss = yes
    fileid:algorithm = fsid
    load printers = no
    printcap name = /dev/null
    printing = bsd
    smb ports = 445
    vfs objects = fileid
    idmap config * : backend = autorid
    idmap config * : range = 2000-9999999
    realm = AD.DOMAIN.COM
    security = ads
    workgroup = AD
    netbios name = testshare

[testshare]
    path = /mnt/75067755-fe82-4f3c-841f-1ad7df34b5c8
    read only = no

and the same when I start debugging ...

[root@testshare-5986c96565-92gx9-debug /]# samba-container run winbindd
winbindd version 4.15.7 started.
Copyright Andrew Tridgell and the Samba Team 1992-2021
initialize_winbindd_cache: clearing cache and re-creating with version number 2
Could not fetch our SID - did we join?
unable to initialize domain list

So there is a SID and AD accepts the join, yet the pod cannot fetch its own SID.

phlogistonjohn commented 2 years ago

Can you confirm that the 'must-join' init container executed successfully? If so, what does net ads testjoin -d10, run within the smbd pod, report?

ibotty commented 2 years ago

I am working on the same cluster. Yes, the init container must-join runs successfully.

This container's log is

Password for [AD\samba-container-join]:dos charset 'CP850' unavailable - using ASCII
dos charset 'CP850' unavailable - using ASCII

Using short domain name -- REDACTED
Joined 'TESTSHARE' to dns domain 'ad.redacted.tld'
2022-06-15 12:17:58,653: INFO: successful join

I attached two logs with debug level 10, though I plan to delete them soon: samba-container run winbindd.log and net ads testjoin.log

phlogistonjohn commented 2 years ago

Thank you. I've downloaded the logs locally and will look at them soon.

phlogistonjohn commented 2 years ago

Nothing is jumping out at me. I've asked a few of the team members to also look at the logs.

synarete commented 2 years ago

@turricum and @ibotty I built a custom samba-operator with the patches from #216 and tested it on my OpenShift 4.8 deployment. An image is available at: quay.io/ssharon/sink:openshift. Could you give it a try? Please note that currently, when you deploy on OpenShift, you also need to create a (minimal) SmbCommonConfig:

apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbCommonConfig
metadata:
  name: smbcommonconfig
spec:

turricum commented 2 years ago

There is an SmbCommonConfig. I need it for external access ...

Here it is again:

---
apiVersion: samba-operator.samba.org/v1alpha1
kind: SmbCommonConfig
metadata:
  name: freigabe
  namespace: samba-shares
spec:
  network:
    publish: external
---

synarete commented 2 years ago

You are correct -- missed it.

synarete commented 2 years ago

@turricum Could you please retry your config using namespace: samba-operator-system for your resources? (That is, configure SmbSecurityConfig, SmbCommonConfig and SmbShare within the same namespace as the operator.)

turricum commented 2 years ago

@synarete I removed the old configuration and created the same resources in the samba-operator-system namespace. The result is 100% the same.

phlogistonjohn commented 2 years ago

After talking to another member of our team, we'd also like to confirm that the machine account is valid. This is stored in a tdb file that needs to be accessible across multiple containers in the pod. One command to try is net ads status -P -d10

turricum commented 2 years ago

sh-5.1# net ads status -P -d10
...............
Processing section "[global]"
doing parameter disable spoolss = yes
doing parameter fileid:algorithm = fsid
doing parameter load printers = no
doing parameter printcap name = /dev/null
doing parameter printing = bsd
doing parameter smb ports = 445
doing parameter vfs objects = fileid
doing parameter idmap config * : backend = autorid
doing parameter idmap config * : range = 2000-9999999
doing parameter realm = AD.DOMAIN.COM
doing parameter security = ads
doing parameter workgroup = AD
doing parameter netbios name = testshare
lp_servicenumber: couldn't find homes
added interface eth0 ip=172.20.2.44 bcast=172.20.3.255 netmask=255.255.252.0
ldb: ltdb: tdb(/var/lib/samba/private/secrets.ldb): tdb_open_ex: could not open file /var/lib/samba/private/secrets.ldb: No such file or directory

ldb: Unable to open tdb '/var/lib/samba/private/secrets.ldb': No such file or directory
ldb: Failed to connect to '/var/lib/samba/private/secrets.ldb' with backend 'tdb': Unable to open tdb '/var/lib/samba/private/secrets.ldb': No such file or directory
Could not find machine account in secrets database: Failed to fetch machine account password for AD from both secrets.ldb (Could not open secrets.ldb) and from /var/lib/samba/private/secrets.tdb: NT_STATUS_CANT_ACCESS_DOMAIN_INFO
_samba_cmd_set_machine_account_s3: cli_credentials_set_machine_account_db_ctx failed: NT_STATUS_CANT_ACCESS_DOMAIN_INFO
Failed to set machine account: NT_STATUS_CANT_ACCESS_DOMAIN_INFO
sh-5.1# ls -la /var/lib/samba/private/*
-rw-------. 1 root root   8888 Jun 16 18:29 /var/lib/samba/private/netlogon_creds_cli.tdb
-rw-------. 1 root root 421888 Jun 16 18:29 /var/lib/samba/private/passdb.tdb
-rw-------. 1 root root 430080 Jun 16 18:23 /var/lib/samba/private/secrets.tdb
sh-5.1# smbclient --version
Version 4.15.7

phlogistonjohn commented 2 years ago

The log line(s) stating that fetching the machine account password failed makes me wonder if there's a problem sharing the persistent storage between the must-join container and the smbd/winbind containers. Can you get full YAML dumps for the pod in question so I can see what volumes and mounts got created for this instance? Include the deployment too just for the sake of completeness. Thanks!

turricum commented 2 years ago

here is the complete yaml dump of the pod ...

# oc get pod
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.ovn.org/pod-networks: '{"default":{"ip_addresses":["172.20.2.44/22"],"mac_address":"0a:58:ac:14:02:2c","gateway_ips":["172.20.0.1"],"ip_address":"172.20.2.44/22","gateway_ip":"172.20.0.1"}}'
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "ovn-kubernetes",
          "interface": "eth0",
          "ips": [
              "172.20.2.44"
          ],
          "mac": "0a:58:ac:14:02:2c",
          "default": true,
          "dns": {}
      }]
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "ovn-kubernetes",
          "interface": "eth0",
          "ips": [
              "172.20.2.44"
          ],
          "mac": "0a:58:ac:14:02:2c",
          "default": true,
          "dns": {}
      }]
    kubectl.kubernetes.io/default-container: samba
    kubectl.kubernetes.io/default-logs-container: samba
    openshift.io/scc: anyuid
  creationTimestamp: "2022-06-16T18:23:24Z"
  generateName: testshare-676ddff74f-
  labels:
    app: samba
    app.kubernetes.io/component: smbd
    app.kubernetes.io/instance: samba-testshare
    app.kubernetes.io/managed-by: samba-operator
    app.kubernetes.io/name: samba
    app.kubernetes.io/part-of: samba
    pod-template-hash: 676ddff74f
    samba-operator.samba.org/service: testshare
  name: testshare-676ddff74f-rn6l8
  namespace: samba-operator-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: testshare-676ddff74f
    uid: b5f562df-07da-4760-95b6-e29fd18ee193
  resourceVersion: "201914097"
  uid: 9be98c8d-79b2-47d8-9b05-cde6ac07e2b1
spec:
  containers:
  - args:
    - run
    - smbd
    command:
    - samba-container
    env:
    - name: SAMBA_CONTAINER_ID
      value: testshare
    - name: SAMBACC_CONFIG
      value: /etc/container-config/config.json
    - name: SAMBA_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: SAMBA_POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: quay.io/samba.org/samba-server:v0.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 3
      periodSeconds: 10
      successThreshold: 1
      tcpSocket:
        port: 445
      timeoutSeconds: 1
    name: samba
    ports:
    - containerPort: 445
      name: smb
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      periodSeconds: 10
      successThreshold: 1
      tcpSocket:
        port: 445
      timeoutSeconds: 1
    resources: {}
    securityContext:
      capabilities:
        drop:
        - MKNOD
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/container-config
      name: samba-container-config
    - mountPath: /var/lib/samba
      name: samba-state-dir
    - mountPath: /run/samba/winbindd
      name: samba-wb-sockets-dir
    - mountPath: /mnt/0080b8bf-21ee-4225-b02b-e0c315c688f2
      name: testshare-pvc-smb
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-q9wjh
      readOnly: true
  - args:
    - run
    - winbindd
    env:
    - name: SAMBA_CONTAINER_ID
      value: testshare
    - name: SAMBACC_CONFIG
      value: /etc/container-config/config.json
    - name: SAMBA_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: SAMBA_POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: quay.io/samba.org/samba-server:v0.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      exec:
        command:
        - samba-container
        - check
        - winbind
      failureThreshold: 3
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    name: wb
    resources: {}
    securityContext:
      capabilities:
        drop:
        - MKNOD
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/container-config
      name: samba-container-config
    - mountPath: /var/lib/samba
      name: samba-state-dir
    - mountPath: /run/samba/winbindd
      name: samba-wb-sockets-dir
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-q9wjh
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  imagePullSecrets:
  - name: samba-dockercfg-2drvj
  initContainers:
  - args:
    - init
    env:
    - name: SAMBA_CONTAINER_ID
      value: testshare
    - name: SAMBACC_CONFIG
      value: /etc/container-config/config.json
    - name: SAMBA_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: SAMBA_POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: quay.io/samba.org/samba-server:v0.2
    imagePullPolicy: IfNotPresent
    name: init
    resources: {}
    securityContext:
      capabilities:
        drop:
        - MKNOD
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/container-config
      name: samba-container-config
    - mountPath: /var/lib/samba
      name: samba-state-dir
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-q9wjh
      readOnly: true
  - args:
    - ensure-share-paths
    env:
    - name: SAMBA_CONTAINER_ID
      value: testshare
    - name: SAMBACC_CONFIG
      value: /etc/container-config/config.json
    - name: SAMBA_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: SAMBA_POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    image: quay.io/samba.org/samba-server:v0.2
    imagePullPolicy: IfNotPresent
    name: ensure-share-paths
    resources: {}
    securityContext:
      capabilities:
        drop:
        - MKNOD
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/container-config
      name: samba-container-config
    - mountPath: /var/lib/samba
      name: samba-state-dir
    - mountPath: /run/samba/winbindd
      name: samba-wb-sockets-dir
    - mountPath: /mnt/0080b8bf-21ee-4225-b02b-e0c315c688f2
      name: testshare-pvc-smb
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-q9wjh
      readOnly: true
  - args:
    - must-join
    env:
    - name: SAMBA_CONTAINER_ID
      value: testshare
    - name: SAMBACC_CONFIG
      value: /etc/container-config/config.json
    - name: SAMBA_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: SAMBA_POD_NAMESPACE
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.namespace
    - name: SAMBACC_JOIN_FILES
      value: /var/tmp/join/0/join.json
    image: quay.io/samba.org/samba-server:v0.2
    imagePullPolicy: IfNotPresent
    name: must-join
    resources: {}
    securityContext:
      capabilities:
        drop:
        - MKNOD
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/container-config
      name: samba-container-config
    - mountPath: /var/lib/samba
      name: samba-state-dir
    - mountPath: /var/tmp/join/0
      name: join-data-0
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-q9wjh
      readOnly: true
  nodeName: node-004.domain.com
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seLinuxOptions:
      level: s0:c28,c7
  serviceAccount: samba
  serviceAccountName: samba
  shareProcessNamespace: true
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      name: testshare
    name: samba-container-config
  - emptyDir: {}
    name: samba-state-dir
  - emptyDir:
      medium: Memory
    name: samba-wb-sockets-dir
  - name: testshare-pvc-smb
    persistentVolumeClaim:
      claimName: testshare-pvc
  - name: join-data-0
    secret:
      defaultMode: 420
      items:
      - key: join.json
        path: join.json
      secretName: join1
  - name: kube-api-access-q9wjh
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-06-16T18:23:46Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-06-20T07:45:24Z"
    message: 'containers with unready status: [wb]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-06-20T07:45:24Z"
    message: 'containers with unready status: [wb]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-06-16T18:23:24Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://0f4c7963d0ed459789d0b99b3d24ca855eff3c2290ec0a36b3d12932bb1e82d3
    image: quay.io/samba.org/samba-server:v0.2
    imageID: quay.io/samba.org/samba-server@sha256:710af13506f4c19a151dbc865ce8c726550c99df18128fa0ecdce91b7558f40e
    lastState: {}
    name: samba
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-06-16T18:23:46Z"
  - containerID: cri-o://ef47a7736ccd2436049ad4d40b8b9757f4bb5e8fe6581e0dbe51b59980a932f3
    image: quay.io/samba.org/samba-server:v0.2
    imageID: quay.io/samba.org/samba-server@sha256:710af13506f4c19a151dbc865ce8c726550c99df18128fa0ecdce91b7558f40e
    lastState:
      terminated:
        containerID: cri-o://ef47a7736ccd2436049ad4d40b8b9757f4bb5e8fe6581e0dbe51b59980a932f3
        exitCode: 1
        finishedAt: "2022-06-20T10:54:33Z"
        reason: Error
        startedAt: "2022-06-20T10:54:33Z"
    name: wb
    ready: false
    restartCount: 1043
    started: false
    state:
      waiting:
        message: back-off 5m0s restarting failed container=wb pod=testshare-676ddff74f-rn6l8_samba-operator-system(9be98c8d-79b2-47d8-9b05-cde6ac07e2b1)
        reason: CrashLoopBackOff
  hostIP: 192.168.50.222
  initContainerStatuses:
  - containerID: cri-o://f24ed32a843c9eb989d679fdbc919a380eee15ba75a0694978e4c7e2420cc5d6
    image: quay.io/samba.org/samba-server:v0.2
    imageID: quay.io/samba.org/samba-server@sha256:710af13506f4c19a151dbc865ce8c726550c99df18128fa0ecdce91b7558f40e
    lastState: {}
    name: init
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: cri-o://f24ed32a843c9eb989d679fdbc919a380eee15ba75a0694978e4c7e2420cc5d6
        exitCode: 0
        finishedAt: "2022-06-16T18:23:43Z"
        reason: Completed
        startedAt: "2022-06-16T18:23:43Z"
  - containerID: cri-o://b653c43a5db769e24b00ace964c2005fdc3f6ba55c824d472af58433e1ecc0f7
    image: quay.io/samba.org/samba-server:v0.2
    imageID: quay.io/samba.org/samba-server@sha256:710af13506f4c19a151dbc865ce8c726550c99df18128fa0ecdce91b7558f40e
    lastState: {}
    name: ensure-share-paths
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: cri-o://b653c43a5db769e24b00ace964c2005fdc3f6ba55c824d472af58433e1ecc0f7
        exitCode: 0
        finishedAt: "2022-06-16T18:23:44Z"
        reason: Completed
        startedAt: "2022-06-16T18:23:44Z"
  - containerID: cri-o://69fa8c4b84fa6cb4c4863ba951328d5e7a3fdcccf77aa4231784a0ea31bd3664
    image: quay.io/samba.org/samba-server:v0.2
    imageID: quay.io/samba.org/samba-server@sha256:710af13506f4c19a151dbc865ce8c726550c99df18128fa0ecdce91b7558f40e
    lastState: {}
    name: must-join
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: cri-o://69fa8c4b84fa6cb4c4863ba951328d5e7a3fdcccf77aa4231784a0ea31bd3664
        exitCode: 0
        finishedAt: "2022-06-16T18:23:45Z"
        reason: Completed
        startedAt: "2022-06-16T18:23:45Z"
  phase: Running
  podIP: 172.20.2.44
  podIPs:
  - ip: 172.20.2.44
  qosClass: BestEffort
  startTime: "2022-06-16T18:23:24Z"
# oc get deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2022-06-16T18:11:11Z"
  generation: 3
  labels:
    app: samba
    app.kubernetes.io/component: smbd
    app.kubernetes.io/instance: samba-testshare
    app.kubernetes.io/managed-by: samba-operator
    app.kubernetes.io/name: samba
    app.kubernetes.io/part-of: samba
    samba-operator.samba.org/service: testshare
  name: testshare
  namespace: samba-operator-system
  ownerReferences:
  - apiVersion: samba-operator.samba.org/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: SmbShare
    name: testshare
    uid: 0080b8bf-21ee-4225-b02b-e0c315c688f2
  resourceVersion: "201959982"
  uid: d856b0b4-8193-42ca-a0e9-72b13612dc56
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: samba
      app.kubernetes.io/component: smbd
      app.kubernetes.io/instance: samba-testshare
      app.kubernetes.io/managed-by: samba-operator
      app.kubernetes.io/name: samba
      app.kubernetes.io/part-of: samba
      samba-operator.samba.org/service: testshare
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/default-container: samba
        kubectl.kubernetes.io/default-logs-container: samba
      creationTimestamp: null
      labels:
        app: samba
        app.kubernetes.io/component: smbd
        app.kubernetes.io/instance: samba-testshare
        app.kubernetes.io/managed-by: samba-operator
        app.kubernetes.io/name: samba
        app.kubernetes.io/part-of: samba
        samba-operator.samba.org/service: testshare
    spec:
      containers:
      - args:
        - run
        - smbd
        command:
        - samba-container
        env:
        - name: SAMBA_CONTAINER_ID
          value: testshare
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:v0.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 445
          timeoutSeconds: 1
        name: samba
        ports:
        - containerPort: 445
          name: smb
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 445
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /run/samba/winbindd
          name: samba-wb-sockets-dir
        - mountPath: /mnt/0080b8bf-21ee-4225-b02b-e0c315c688f2
          name: testshare-pvc-smb
      - args:
        - run
        - winbindd
        env:
        - name: SAMBA_CONTAINER_ID
          value: testshare
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:v0.2
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - samba-container
            - check
            - winbind
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        name: wb
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /run/samba/winbindd
          name: samba-wb-sockets-dir
      dnsPolicy: ClusterFirst
      initContainers:
      - args:
        - init
        env:
        - name: SAMBA_CONTAINER_ID
          value: testshare
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:v0.2
        imagePullPolicy: IfNotPresent
        name: init
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
      - args:
        - ensure-share-paths
        env:
        - name: SAMBA_CONTAINER_ID
          value: testshare
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        image: quay.io/samba.org/samba-server:v0.2
        imagePullPolicy: IfNotPresent
        name: ensure-share-paths
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /run/samba/winbindd
          name: samba-wb-sockets-dir
        - mountPath: /mnt/0080b8bf-21ee-4225-b02b-e0c315c688f2
          name: testshare-pvc-smb
      - args:
        - must-join
        env:
        - name: SAMBA_CONTAINER_ID
          value: testshare
        - name: SAMBACC_CONFIG
          value: /etc/container-config/config.json
        - name: SAMBA_POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: SAMBA_POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: SAMBACC_JOIN_FILES
          value: /var/tmp/join/0/join.json
        image: quay.io/samba.org/samba-server:v0.2
        imagePullPolicy: IfNotPresent
        name: must-join
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /etc/container-config
          name: samba-container-config
        - mountPath: /var/lib/samba
          name: samba-state-dir
        - mountPath: /var/tmp/join/0
          name: join-data-0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: samba
      serviceAccountName: samba
      shareProcessNamespace: true
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          name: testshare
        name: samba-container-config
      - emptyDir: {}
        name: samba-state-dir
      - emptyDir:
          medium: Memory
        name: samba-wb-sockets-dir
      - name: testshare-pvc-smb
        persistentVolumeClaim:
          claimName: testshare-pvc
      - name: join-data-0
        secret:
          defaultMode: 420
          items:
          - key: join.json
            path: join.json
          secretName: join1
status:
  conditions:
  - lastTransitionTime: "2022-06-16T18:22:42Z"
    lastUpdateTime: "2022-06-16T18:22:42Z"
    message: ReplicaSet "testshare-676ddff74f" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2022-06-20T12:11:35Z"
    lastUpdateTime: "2022-06-20T12:11:35Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  observedGeneration: 3
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1

synarete commented 2 years ago

@turricum you are using ovn-kubernetes as your CNI. Could you please try with openshift-sdn? Please note that the OpenShift 4.8 documentation clearly states that OVN-Kubernetes has its limitations: https://docs.openshift.com/container-platform/4.8/networking/ovn_kubernetes_network_provider/about-ovn-kubernetes.html

ibotty commented 2 years ago

Switching to openshift-sdn is not easy. This is a production cluster.

Reading the limitations, I don't see how they are relevant here. The configured share does not meet them:

Am I missing anything?

We plan to update to 4.10 so there are even fewer limitations though.

synarete commented 2 years ago

You are using SmbCommonConfig.spec.network.publish: external. This, in turn, affects the ServiceType of the Service fronting the SMB port. However, I am not sure if it may cause different behavior in case of ovn-kubernetes.

phlogistonjohn commented 2 years ago

I've been reading through the YAML dumps. I wanted to ensure that the samba-state-dir (/var/lib/samba) was correctly being shared across the must-join init container and the smbd & winbind containers. It appears that it is, so I don't think this issue is caused by the must-join container and the server containers not sharing the samba tdb files.

It would have been an unfortunate bug, but it would have also been a simple explanation. But I guess it's not that simple.
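For anyone who wants to repeat that check on their own pod, here is a minimal sketch. It assumes the pod spec has been loaded into a Python dict (for example from the "spec" field of `oc get pod <pod> -o json`). The function names are made up for illustration, and the container names passed in are whatever your pod actually uses.

```python
def mounts_by_container(pod_spec, mount_path):
    """Map container name -> volume name mounted at mount_path.

    Looks at both initContainers and regular containers.
    """
    found = {}
    containers = pod_spec.get("initContainers", []) + pod_spec.get("containers", [])
    for container in containers:
        for mount in container.get("volumeMounts", []):
            if mount.get("mountPath") == mount_path:
                found[container["name"]] = mount["name"]
    return found


def shares_one_volume(pod_spec, mount_path, container_names):
    """True if every named container mounts the same volume at mount_path."""
    mounts = mounts_by_container(pod_spec, mount_path)
    volumes = {mounts.get(name) for name in container_names}
    # None in the set means at least one container has no mount at that path.
    return None not in volumes and len(volumes) == 1
```

Calling `shares_one_volume(spec, "/var/lib/samba", [...])` with the init container and the two server containers should return True if the tdb files written during the join are visible to smbd and winbind.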

Since I'm focused on trying to understand why the machine account password doesn't seem to be present even though the must-join pod reports success, I'm going to ask a few more q's:

This is one of those issues that may be easier if I had a reproducer locally, but right now I'm still guessing a bit. With some of the details you provide, maybe @synarete and I can create an OpenShift-based setup that looks more like yours and see if we can reproduce the problem you see...

phlogistonjohn commented 2 years ago

I shouldn't assume you are familiar with the samba net command. In my comment above, the example command I'm suggesting would be something like: net ads join -U <DOMAIN_User> --no-dns-updates. The user name should match the name you provided in the join secret. Thanks.
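To avoid a mismatch between the user typed by hand and the one in the secret, the command can be built from the join file itself. This is only a sketch: it assumes the secret is mounted at /var/tmp/join/0/join.json (the mountPath and item path shown in the deployment dump) and has the {"username": ..., "password": ...} layout from the Secret above. `join_command` and `JOIN_FILE` are illustrative names, not part of the operator.

```python
import json

# Assumed location: volume mountPath + item path from the deployment dump.
JOIN_FILE = "/var/tmp/join/0/join.json"


def join_command(join_json_text):
    """Build the `net ads join` argv for the user named in join.json.

    The password is deliberately left out; `net` prompts for it.
    """
    creds = json.loads(join_json_text)
    return ["net", "ads", "join", "-U", creds["username"], "--no-dns-updates"]


def join_command_from_file(path=JOIN_FILE):
    """Same as join_command, reading the JSON from the mounted secret."""
    with open(path, encoding="utf-8") as handle:
        return join_command(handle.read())
```

Printing `" ".join(join_command_from_file())` inside the container gives the exact command line to run.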

synarete commented 2 years ago

@phlogistonjohn I use (almost) the same yamls as @turricum provided above, but failed to reproduce. The differences: 1) I deploy OpenShift 4.9 over AWS with openshift-sdn 2) I do not set SmbCommonConfig.spec.network.publish to external.

phlogistonjohn commented 2 years ago

@synarete I could possibly see 1 somehow interacting weird with AD but I'd be very surprised if 2 was a cause. Let's also find out more about how their AD is set up.

turricum commented 2 years ago

sorry for taking so long and thank you for your help!

[openshift-sdn vs ovn-kubernetes]

I tested on an OKD 4.10 cluster with openshift-sdn cni. The same problem happens.

How long does it take for the pod to exhibit join issues? Do the errors happen immediately as winbind (or smbd) start or does it take some time?

it crashes immediately after creation of the pod. It seems there is no timeout involved.

Are you using a windows based AD (or Samba based), what version?

We are using a Samba 4.12 Domain Controller. It is working fine for Windows 10 and Fedora 36 computers.

bash-5.0# smbstatus

Samba version 4.12.0
PID     Username     Group        Machine                                   Protocol Version  Encryption           Signing              
----------------------------------------------------------------------------------------------------------------------------------------
6505    3000196      3000016      10.0.2.70 (ipv4:10.0.2.70:64037)          SMB3_11           -                    AES-128-CMAC         
6506    3000325      3000016      10.0.1.24 (ipv4:10.0.1.24:57720)          SMB3_11           -                    AES-128-CMAC         

Service      pid     Machine       Connected at                     Encryption   Signing     
---------------------------------------------------------------------------------------------
IPC$         6506    10.0.1.24     Fri Jun 24 12:15:29 2022 UTC     -            AES-128-CMAC
IPC$         6505    10.0.2.70     Fri Jun 24 12:15:26 2022 UTC     -            AES-128-CMAC

No locked files

what happens if you try to manually join the AD by running net ads join

if I rsh to the pod's samba-container container (which is working, the wb container is crashing)

sh-5.1# net ads join -U domain-join --no-dns-updates
Password for [AD\domain-join]:
dos charset 'CP850' unavailable - using ASCII
dos charset 'CP850' unavailable - using ASCII
Using short domain name -- DOM
Joined 'TESTSHARE' to dns domain 'ad.domain.tld'

Does this issue reproduce on every share (and thus pod) you deploy or just some?

It happens with every AD smbshare created. User mode shares created with an older samba-operator version are working fine.

phlogistonjohn commented 2 years ago

sorry for taking so long and thank you for your help!

I'm the one who should be saying that. :-) Unfortunately I've been unable to reproduce the issue so far, but I certainly have not forgotten. I have been talking to some people about ways I might get more resources to more closely replicate the setup you have described. In the meantime, please bear with us and excuse the slowness.

if I rsh to the pod's samba-container container (which is working, the wb container is crashing)

I forgot to ask earlier: if you do this and the join is successful, does the wb container continue crashing? Does the result of net ads testjoin -P change between running it immediately after the join and running it again a few minutes later?

turricum commented 2 years ago

# oc rsh testshare-7bc954446b-h8867 
# net ads testjoin -P
Failed to set machine account: NT_STATUS_CANT_ACCESS_DOMAIN_INFO

#sh-5.1# net ads join -U domain-join --no-dns-updates
Password for [AD\domain-join]:
dos charset 'CP850' unavailable - using ASCII
dos charset 'CP850' unavailable - using ASCII
Using short domain name -- DMN
Joined 'TESTSHARE' to dns domain 'ad.domain.tld'

# net ads testjoin -P
Join is OK

# sleep 600

# net ads testjoin -P
Failed to set machine account: NT_STATUS_CANT_ACCESS_DOMAIN_INFO

phlogistonjohn commented 2 years ago

Very interesting, thanks! I'll be sharing that one with the other members of my team.

ibotty commented 1 year ago

Anything else we can investigate?

synarete commented 1 year ago

@ibotty Does this problem reproduce with the latest version as well? That is, when using images with Samba 4.16.5?