scylladb / scylla-operator

The Kubernetes Operator for ScyllaDB
https://operator.docs.scylladb.com/
Apache License 2.0
NodeConfig should be in a degraded state when mount already exists and is not identical #1909

Open rzetelskik opened 1 month ago

rzetelskik commented 1 month ago

What happened?

If you define a NodeConfig that mounts the RAID array at a mount point where a different mount already exists, the operator does not report an error and the NodeConfig does not end up in a degraded state.

$ systemctl status mnt-test.mount
● mnt-test.mount - Managed mount by Scylla Operator
     Loaded: loaded (/proc/self/mountinfo; enabled; vendor preset: enabled)
     Active: active (mounted) since Wed 2024-05-01 20:04:50 UTC; 5min ago
      Where: /mnt/test
       What: /dev/mapper/ubuntu--vg-ubuntu--lv
      Tasks: 0 (limit: 4557)
     Memory: 48.0K
        CPU: 5ms
     CGroup: /system.slice/mnt-test.mount

May 01 20:04:50 ubuntu systemd[1]: Mounting Managed mount by Scylla Operator...
May 01 20:04:50 ubuntu systemd[1]: Mounted Managed mount by Scylla Operator.
$ kubectl logs -n scylla-operator-node-tuning daemonsets/cluster-node-setup
...
I0501 20:15:08.529961       1 nodesetup/sync_raids.go:86] "RAID0 array has been created" RAIDName="r6bc9qjw" Devices="/dev/loop10"
I0501 20:15:08.530029       1 record/event.go:376] "Event occurred" object="cluster" fieldPath="" kind="NodeConfig" apiVersion="scylla.scylladb.com/v1alpha1" type="Normal" reason="RAIDCreated" message="RAID0 array \"r6bc9qjw\" using /dev/loop10 devices has been created"
I0501 20:15:08.541078       1 nodesetup/sync_filesystems.go:56] "Filesystem has been created" Device="r6bc9qjw" Filesystem="xfs"
I0501 20:15:08.541188       1 nodesetup/sync_mounts.go:46] "Mount unit has been generated and queued for apply." Name="mnt-test.mount" Device="r6bc9qjw" MountPoint="/mnt/test"
I0501 20:15:08.541259       1 systemd/unit.go:108] "Checking if units need pruning" Existing=1 Desired=1
I0501 20:15:08.541294       1 record/event.go:376] "Event occurred" object="cluster" fieldPath="" kind="NodeConfig" apiVersion="scylla.scylladb.com/v1alpha1" type="Normal" reason="FilesystemCreated" message="xfs filesystem has been created device r6bc9qjw"
I0501 20:15:09.037629       1 systemd/unit.go:169] "Ensuring unit" Name="mnt-test.mount"
I0501 20:15:09.467564       1 systemd/unit.go:206] "Enabling and starting unit" Name="mnt-test.mount"
I0501 20:15:09.471249       1 nodesetup/status.go:44] "Updating status" NodeConfig="cluster" Node="ubuntu"

/kind bug

What did you expect to happen?

NodeConfig should end up in a degraded state and the controller should log an error.

$ systemctl status mnt-test.mount
● mnt-test.mount - Managed mount by Scylla Operator
     Loaded: loaded (/proc/self/mountinfo; enabled; vendor preset: enabled)
     Active: active (mounted) since Wed 2024-05-01 20:04:50 UTC; 22s ago
      Where: /mnt/test
       What: /dev/md124
      Tasks: 0 (limit: 4557)
     Memory: 1.7M
        CPU: 5ms
     CGroup: /system.slice/mnt-test.mount

May 01 20:04:50 ubuntu systemd[1]: Mounting Managed mount by Scylla Operator...
May 01 20:04:50 ubuntu systemd[1]: Mounted Managed mount by Scylla Operator.
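
A rough sketch of the kind of check that could catch this, for discussion only. It is not the operator's actual code: `mountEntry`, `parseMountInfo` and `checkMountConflict` are hypothetical names, and it assumes the desired device has already been resolved to the path that shows up in `/proc/self/mountinfo` (for example `/dev/md124`). It looks up the topmost existing mount at the mount point and returns an error when its source or filesystem type differs from the desired one, which the controller could then turn into a degraded condition.

```go
package main

import (
	"bufio"
	"fmt"
	"path/filepath"
	"strings"
)

// mountEntry holds the mountinfo fields relevant to the comparison.
// Hypothetical type, not part of scylla-operator.
type mountEntry struct {
	MountPoint string
	FSType     string
	Source     string
}

// parseMountInfo parses the content of /proc/self/mountinfo.
// Line layout: ID parentID major:minor root mountPoint options [optional...] - fsType source superOptions
// Octal escapes in paths (e.g. \040 for a space) are not decoded in this sketch.
func parseMountInfo(content string) ([]mountEntry, error) {
	var entries []mountEntry
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue
		}
		fields := strings.Fields(line)
		// The "-" separator follows a variable number of optional fields.
		sep := -1
		for i := 6; i < len(fields); i++ {
			if fields[i] == "-" {
				sep = i
				break
			}
		}
		if sep == -1 || len(fields) < sep+3 {
			return nil, fmt.Errorf("malformed mountinfo line: %q", line)
		}
		entries = append(entries, mountEntry{
			MountPoint: fields[4],
			FSType:     fields[sep+1],
			Source:     fields[sep+2],
		})
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return entries, nil
}

// checkMountConflict returns an error when something is already mounted at
// mountPoint and it is not identical to the desired source and fsType.
// Later mountinfo entries shadow earlier ones, so the last match is used.
func checkMountConflict(entries []mountEntry, mountPoint, wantSource, wantFSType string) error {
	target := filepath.Clean(mountPoint)
	var top *mountEntry
	for i := range entries {
		if filepath.Clean(entries[i].MountPoint) == target {
			top = &entries[i]
		}
	}
	if top == nil {
		return nil // nothing mounted there yet
	}
	if top.Source != wantSource || top.FSType != wantFSType {
		return fmt.Errorf("mount point %q already has %q (%s) mounted, want %q (%s)",
			target, top.Source, top.FSType, wantSource, wantFSType)
	}
	return nil
}

func main() {
	// Illustrative line only: the IDs and the temp dir name are made up.
	const sample = "36 25 253:0 /tmp/tmp.abc /mnt/test rw,relatime shared:1 - ext4 /dev/mapper/ubuntu--vg-ubuntu--lv rw\n"
	entries, err := parseMountInfo(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	if err := checkMountConflict(entries, "/mnt/test", "/dev/md124", "xfs"); err != nil {
		fmt.Println("degraded:", err)
	}
}
```

Comparing against the topmost entry matters because a bind mount, as in the reproduction steps, can sit on top of another mount at the same path. A real implementation would also have to decide whether mount options count towards "identical".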

How can we reproduce it (as minimally and precisely as possible)?

On host:

$ mount --bind $( mktemp -d ) /mnt/test/

Create a NodeConfig:

apiVersion: scylla.scylladb.com/v1alpha1
kind: NodeConfig
metadata:
  name: cluster
spec:
  disableOptimizations: false
  localDiskSetup:
    filesystems:
    - device: lhwnjl5q
      type: xfs
    loopDevices:
    - imagePath: /mnt/disk.img
      name: disk
      size: 32M
    mounts:
    - device: lhwnjl5q
      fsType: xfs
      mountPoint: /mnt/test
      unsupportedOptions:
      - prjquota
    raids:
    - RAID0:
        devices:
          modelRegex: .*
          nameRegex: ^/dev/loops/disk$
      name: lhwnjl5q
      type: RAID0
  placement:
    nodeSelector:
      scylla.scylladb.com/node-type: scylla

Scylla Operator version

master

Kubernetes platform name and version

n/a

Please attach the must-gather archive.

n/a

Anything else we need to know?

No response

rzetelskik commented 1 month ago

Somewhat related to https://github.com/scylladb/scylla-operator/issues/1334, although the status here is active (mounted).