lerminou closed this issue 1 year ago
The drive symbol may change due to disk replacement or re-plugging, etc., but the path id
of the same disk will not change. Is it possible to use path id
instead of drive symbol when executing ceph commands? @satoru-takeuchi
Or maybe force the symlink or remove the old one, if not possible in the ceph command?
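The "force the symlink" idea above can be sketched in shell. This is only an illustration under stated assumptions, not what Rook or Ceph actually does: `repoint_block_link` is a hypothetical helper, the paths passed to it are placeholders, it relies on `readlink -f` as provided by GNU coreutils, and it assumes the OSD is stopped while the link is changed.

```shell
# Hypothetical helper (not part of Rook or Ceph): repoint a stale "block"
# symlink at whatever device a persistent path currently resolves to.
# Usage: repoint_block_link <block symlink> <persistent path, e.g. under /dev/disk/by-path>
repoint_block_link() {
  local link="$1" persistent="$2" target
  # Resolve the persistent name to the current kernel device name.
  target="$(readlink -f -- "$persistent")" || return 1
  # Refuse to create a new dangling link.
  [ -e "$target" ] || return 1
  # -f replaces an existing link; -n avoids following a link to a directory.
  ln -sfn -- "$target" "$link"
}
```

Because the link is rebuilt from the persistent path each time, it no longer matters which sdX letter the kernel assigned after a reboot. Whether this is safe against the race mentioned in the thread is not established here.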
@microyahoo Although I don't recall the reason now, we should use kernel name here. I'll investigate how to resolve/mitigate your issue.
@lerminou Thank you for your hint. I'll check whether your suggestion works. It might cause a kind of race.
I'm still investigating this issue. This problem might be in ceph...
In addition to finding the root cause, I'm trying to find a workaround.
Sorry for the delay, I didn't have enough time to work on this issue.
You can recover from this problem after hitting it with the following steps:
kubectl scale deploy rook-ceph-operator --replicas=0
kubectl scale deploy rook-ceph-osd-<osd ID> --replicas=0
Remove /var/lib/rook/rook-ceph/<osd id>/block on the node where the OSD runs.
kubectl scale deploy rook-ceph-osd-<osd ID> --replicas=1
kubectl scale deploy rook-ceph-operator --replicas=1
Then the new osd pod will create the correct symlink.
Hi @satoru-takeuchi, Yes this is my actual workaround, but the cluster is unavailable during the detection/fix frame
Yes this is my actual workaround,
Great.
but the cluster is unavailable during the detection/fix frame
Of course, I'm trying to create a PR to fix this problem.
The logic in which this bug exists is a bit complicated. Please wait for a while.
This problem was introduced by my commit.
@satoru-takeuchi Do you have more thoughts about how common this issue might be? Since your commit was a while ago, perhaps it is not a common case?
@travisn
I guess it is not so common in small clusters, and the likelihood gets higher in large clusters. The problem seems to happen if and only if the target of /var/lib/ceph/ceph-<n>/block is a nonexistent block device file.
Here is an example with two scratch devices, B and C, bound to the device files "sdb" and "sdc".
I verified that this problem actually happened in my test env. In addition, I verified that it did not happen when the device names were flipped (e.g. device B bound to "sdc" and device C bound to "sdb").
The key factor is that the number of devices decreases and one of the ".../block" files becomes a dangling symlink.
This problem might also affect OSDs on PVCs, but I haven't confirmed that yet.
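The triggering condition described here (a ".../block" link whose target no longer exists) can be checked on a node with a small shell function. This is a minimal sketch, not a tool from Rook: `find_dangling_block_links` is a hypothetical name, and the `<data dir>/<osd dir>/block` layout is assumed from the paths quoted in this thread.

```shell
# Hypothetical helper: print every ".../block" symlink under a data directory
# whose target no longer exists (a dangling symlink).
# Usage: find_dangling_block_links <data dir>
find_dangling_block_links() {
  local dir="$1" link
  for link in "$dir"/*/block; do
    # -L: the path is a symlink; ! -e: following it leads nowhere.
    if [ -L "$link" ] && [ ! -e "$link" ]; then
      printf '%s -> %s\n' "$link" "$(readlink -- "$link")"
    fi
  done
}
```

For example, running `find_dangling_block_links /var/lib/rook/rook-ceph` on the affected node would list the OSD directories whose block link points at a device file that has disappeared. It prints nothing when every link resolves.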
My next actions are...
Does my plan make sense?
Thanks for the explanation, sounds like a good plan. When ceph-volume creates the OSD, I thought ceph would start using a symlink with the path name instead of the original device name. I am forgetting the details, but my memory doesn't match what you are describing, so I don't trust my memory.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
I just hit this as well -- and I've seen it a few times in the past, just didn't find a solution or have time to try to track it down. thanks for the efforts to fix it!
@satoru-takeuchi How is the investigation on this issue? Thanks!
@travisn I'm testing #11567, which resolves this issue. There are several remaining tests. I'll finish them today.
It is taking a long time because I have little spare time and there are many test cases.
Thanks a lot for the fix, I'm just waiting for the next release :)
v1.10.12 is out with this fix!
Is this a bug report or feature request?
Deviation from expected behavior: I'm using Rook Ceph with specific devices, identified by IDs.
The Linux disk letter sdX can change on reboot, and this should not break the application. Currently, when the OSD starts, the init container activate detects the correct new disk, but a symlink to the old one is already present.
Expected behavior: Rook Ceph detects the correct disk when the node reboots. Even if the sdX letter changes, the symlink should be recreated.
How to reproduce it (minimal and precise):
File(s) to submit: cluster.yaml, if necessary
Logs to submit: Crashing pod(s) logs, if necessary
To get logs, use kubectl -n <namespace> logs <pod name>
When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI. Read GitHub documentation if you need help.
Cluster Status to submit:
Environment:
NAME="Red Hat Enterprise Linux" VERSION="8.6 (Ootpa)"
uname -a: Linux vm-kube-slave-6 4.18.0-372.19.1.el8_6.x86_64 #1 SMP Mon Jul 18 11:14:02 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
rook version (inside of a Rook Pod): 1.9.7
ceph -v: filesystem
kubectl version: 1.23
ceph health (in the Rook Ceph toolbox):