rook / rook

Storage Orchestration for Kubernetes
https://rook.io
Apache License 2.0

OSD status created without actual OSD created #13289

Closed shintiger closed 10 months ago

shintiger commented 10 months ago

Is this a bug report or feature request?

Deviation from expected behavior:

Expected behavior:

How to reproduce it (minimal and precise):

File(s) to submit:

Logs to submit: osd-prepare-job-updated.txt

Cluster Status to submit:

  cluster:
    id:     da80f118-0452-483b-8b13-fffeb04ec0aa
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            mon a is low on available space
            1 osds down
            5 osds exist in the crush map but not in the osdmap
            Reduced data availability: 36 pgs inactive, 36 pgs peering, 43 pgs stale
            Degraded data redundancy: 25170/112530 objects degraded (22.367%), 39 pgs degraded, 38 pgs undersized
            70 pgs not deep-scrubbed in time
            70 pgs not scrubbed in time
            203 daemons have recently crashed
            14 slow ops, oldest one blocked for 293 sec, daemons [osd.1,osd.10] have slow ops.

  services:
    mon: 2 daemons, quorum a,b (age 2d)
    mgr: b(active, since 42h), standbys: a
    mds: 1/1 daemons up, 1 hot standby
    osd: 9 osds: 2 up (since 30h), 6 in (since 7m); 4 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 81 pgs
    objects: 48.54k objects, 129 GiB
    usage:   98 GiB used, 2.9 TiB / 3.0 TiB avail
    pgs:     44.444% pgs not active
             25170/112530 objects degraded (22.367%)
             1087/112530 objects misplaced (0.966%)
             33 stale+peering
             27 active+undersized+degraded
             10 stale+active+undersized+degraded
             3  active+clean
             3  remapped+peering
             2  active+clean+laggy
             1  active+recovering+degraded+remapped
             1  active+clean+remapped
             1  active+undersized+degraded+laggy
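
For reference, a status like this can be captured from the Rook toolbox (assuming the rook-ceph-tools deployment from the Rook examples is running):

  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status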

Environment:

I have done the following:

The important errors I found in the log were:

  1. stderr: got monmap epoch 2

  2. stderr: 2023-11-28T07:05:29.267+0000 7f29176d23c0 -1 asok(0x557295990000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.13.asok': (13) Permission denied
     stderr: 2023-11-28T07:05:29.267+0000 7f29176d23c0 -1 bluestore(/var/lib/ceph/osd/ceph-13/) _read_fsid unparsable uuid

I found similar existing issues for "_read_fsid unparsable uuid", but none of them involve Permission denied. The logs suggest the error is not actually breaking the job, since the OSD profile is still created (per the log messages). Resource limits also seem unrelated: my version already has limits disabled, and the node should have sufficient resources (16 GB of RAM).
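
One quick check for the (13) Permission denied is to compare the numeric ownership of the directory backing the admin sockets across nodes (a sketch; /var/lib/rook/exporter is the host path identified later in this thread, and Ceph containers run as uid/gid 167):

  # Run on each host; a working node should show 167:167 (ceph:ceph).
  ls -ln /var/lib/rook/exporter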

kubectl get -n rook-ceph cm/rook-ceph-osd-eq12-status -o yaml

apiVersion: v1
data:
  status: '{"osds":[{"id":4,"cluster":"ceph","uuid":"7052383a-92eb-4f19-83cd-de59f9c5a20d","device-part-uuid":"","device-class":"ssd","lv-path":"/dev/mapper/ubuntu--vg-ceph","metadata-path":"","wal-path":"","skip-lv-release":true,"location":"root=default
    host=eq12","lv-backed-pv":false,"lv-mode":"raw","store":"bluestore","topologyAffinity":"","encrypted":false,"exportService":false}],"status":"completed","pvc-backed-osd":false,"message":""}'
kind: ConfigMap
metadata:
  creationTimestamp: "2023-11-29T05:54:23Z"
  labels:
    app: rook-ceph-osd
    node: eq12
    status: provisioning
  name: rook-ceph-osd-eq12-status
  namespace: rook-ceph
  ownerReferences:
  - apiVersion: ceph.rook.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: CephCluster
    name: rook-ceph
    uid: 256c5fe4-9677-4cc6-9c46-8b695dbae090
  resourceVersion: "52857676"
  uid: 96d51fd2-0bc5-4288-84e2-5b6daaa18289
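
As an aside, note that the data field above says "status":"completed" while the label still reads status: provisioning. To pull just the JSON status for a node, something like this works (assuming jq is available for pretty-printing):

  kubectl -n rook-ceph get cm rook-ceph-osd-eq12-status -o jsonpath='{.data.status}' | jq .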
travisn commented 10 months ago

Some questions about the status:

  • Usually 3 mons are recommended. Is it expected you only have two?
  • Only 2 OSDs are up, with 9 total created. Were all 9 OSDs previously up, and now they are not healthy? Or did you just attempt to create 7 more OSDs and they aren't coming up?
  • 7 OSDs have been provisioned but are not running. Are they all getting that admin socket error?

Before trying to add more OSDs, you'll need to troubleshoot why those OSDs are not starting.
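
A starting point for that troubleshooting might look like this (a sketch; OSD 4 is taken from the prepare status above as an example):

  # List the OSD pods and check which ones are crash-looping.
  kubectl -n rook-ceph get pods -l app=rook-ceph-osd

  # Pull logs from one failing OSD, e.g. osd.4.
  kubectl -n rook-ceph logs deploy/rook-ceph-osd-4 --all-containers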

shintiger commented 10 months ago

2 mons was my cluster setting at the beginning, because I only had 2 worker nodes at that time.

Later I had 1 control plane + 3 workers. One of the workers has a disk failure about once a month, so I reinstall Ubuntu 22 with the same hostname and IP address and rejoin it to the cluster. Every time I went through this process I just ignored the old OSD, because I wasn't sure how to remove it safely and everything was otherwise fine.

But this time another worker node's OSD won't come up: the osd container in the pod keeps restarting. In the end I simply wiped the LVM volume Ceph was using on that node and restarted the Rook operator, and that's when this issue came up.
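
For reference, "wiped the LVM" here would be roughly the disk-cleanup steps from the Rook teardown docs, something like the sketch below. The LV path comes from the osd-prepare status above; the backing device /dev/sdb is an assumption, not taken from this cluster.

  # Remove the Ceph logical volume (path from the osd-prepare status above).
  lvremove -y /dev/mapper/ubuntu--vg-ceph

  # Zap the backing device so a new OSD can be provisioned
  # (device name is an assumption; substitute the real one).
  wipefs --all /dev/sdb
  sgdisk --zap-all /dev/sdb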


travisn commented 10 months ago

To remove an OSD, see the OSD management guide so you can get rid of those old OSDs. You'll need to fully purge an OSD before it can be re-created after you wipe the LVM. If you can purge all the old OSDs that you don't need, it will be clearer where to look for the issue.
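
The manual purge flow from that guide looks roughly like this (a sketch using OSD 13 from the logs above, and assuming the rook-ceph-tools toolbox is deployed):

  # Stop the OSD deployment so Ceph marks the OSD down.
  kubectl -n rook-ceph scale deployment rook-ceph-osd-13 --replicas=0

  # Purge it from the cluster (removes it from the CRUSH map and the osdmap).
  kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph osd purge 13 --yes-i-really-mean-it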

shintiger commented 10 months ago

I got rid of this issue now. The Permission denied does matter: I checked /var/lib/rook/exporter on the host, which stores all the .asok files. On the working hosts the owner is 167:167 (ceph:ceph), but on the broken node it was root:root. After I also raised the mon count from 2 to 3, the OSDs spawned. I don't know how these are related, but I can confirm the workaround is to manually change the owner of /var/lib/rook/exporter to 167 on the host.
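
Concretely, the workaround described above amounts to the following on the affected host (together with raising spec.mon.count from 2 to 3 in the CephCluster CR):

  # Hand the socket/exporter directory to the ceph user (uid/gid 167),
  # matching the ownership seen on the working nodes.
  chown -R 167:167 /var/lib/rook/exporter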

I also couldn't find any documentation saying that /var/run/ceph inside the osd-prepare pod is backed by /var/lib/rook/exporter on the host.
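
One way to verify that mapping would be to list the hostPath volumes on a prepare pod while it exists (a sketch; app=rook-ceph-osd-prepare is the label Rook puts on these jobs):

  kubectl -n rook-ceph get pod -l app=rook-ceph-osd-prepare \
    -o jsonpath='{range .items[*].spec.volumes[*]}{.name}{"\t"}{.hostPath.path}{"\n"}{end}'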