mynktl opened this issue 3 months ago
cc @brandond
This sounds like a duplicate of https://github.com/rancher/rke2/issues/6557
@brandond As we have removed the `do_unmount_and_remove` call for the rke2 directory in https://github.com/vitorsavian/rke2/blob/587fb7f22469a4827ea9040b36bfb23d14f9d0c5/bundle/bin/rke2-killall.sh, it will fix the problem for rke2-related directories. But do you see any need to add error handling or `set -e` in `do_unmount_and_remove`?
We are unmounting the directory and then removing that directory to clear the mount point. But if the unmount fails, the script still removes the contents of the directory, which may result in data loss.
> if unmount fails then it will remove the content of directory which may result into data loss
That shouldn't be the case; we use `rm -rf --one-file-system ${MOUNTS}`, which would not traverse across the filesystem boundary into the path that failed to unmount. If you believe you're having problems with this not working as intended, please provide steps to reproduce.
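For context, a small illustration of what `--one-file-system` is meant to protect against. This is a hypothetical example with made-up paths, not output from this thread:

```sh
# /data is on the root filesystem; /data/sub is a separate, still-mounted tmpfs.
sudo mkdir -p /data/sub
sudo mount -t tmpfs tmpfs /data/sub
echo keep | sudo tee /data/sub/file.txt

sudo rm -rf --one-file-system /data
# rm removes files on the root filesystem under /data, but refuses to
# descend into /data/sub because it is on a different filesystem:
#   rm: skipping '/data/sub', since it's on a different device
```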
Relevant output below:
This was run with the script version from before /var/lib/rancher/rke2 was removed.
But it seems like a good idea to do error checking in case something happens that you don't expect. You never know.
I don't think any changes are necessary. Please test on the release that no longer cleans up mounts under /var/lib/rancher/rke2.
Why do you think adding error checking is unnecessary? Is there some concern, or do you have some coding standard or style guide you are adhering to?
The issue was not really `rm -rf --one-file-system ${MOUNTS}`. In your code you execute:

```sh
MOUNTS=
while read ignore mount ignore; do
    MOUNTS="${mount}\n${MOUNTS}"
done </proc/self/mounts
MOUNTS=$(printf ${MOUNTS} | grep "^$1" | sort -r)
```
The last command results in `MOUNTS` being a collection of directories. I believe the purpose of that command is to make sure you remove the mounts in order (deepest paths first, thanks to `sort -r`).

```
$ echo $MOUNTS
/var/lib/rancher/rke2/server/db /var/lib/rancher/rke2
```
Therefore when rm is executed it is running `rm -rf --one-file-system /var/lib/rancher/rke2/server/db`, which removes the contents of the db directory, since the prior umount command failed. `--one-file-system` does not help here: rm starts *at* the still-mounted path, so the flag only prevents descending into other filesystems below it, not deleting files on the filesystem it starts in.
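A minimal sketch of the kind of guard being suggested, shaped like the script's `do_unmount_and_remove` helper; this is hypothetical code, not the actual rke2-killall.sh implementation:

```sh
do_unmount_and_remove() {
    # List mount points under the given prefix, deepest first.
    awk -v path="$1" '$2 ~ ("^" path) { print $2 }' /proc/self/mounts | sort -r |
    while read -r mount; do
        # Only remove the mount point if the umount actually succeeded.
        # Otherwise rm would start inside the still-mounted filesystem,
        # and --one-file-system cannot protect the directory rm starts in.
        if umount "${mount}"; then
            rm -rf --one-file-system "${mount}"
        else
            echo "Failed to unmount ${mount}; not removing it" >&2
        fi
    done
}
```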
We found this issue because a customer ran it and it deleted their etcd db on the node. They were shutting down all nodes, but luckily this only happened on the one node. If the script had error checking to begin with, this would never have happened.
In the current iteration, no commands catch my eye as unsafe, but I am always surprised by what I miss. And perhaps in the future something bad will get introduced.
I checked some style guides online and this seems to be a best practice:
@brandond there are still problems with the new script. It was updated to not remove `"${RKE2_DATA_DIR}"`.
But we use topolvm and it mounts different resources under /var/lib/kubelet/pods. (Not sure if other CSI plugins do this.)
In the same way we saw the etcd data accidentally get deleted, we see the contents of some of our StatefulSet PVCs get deleted:
```
rm: cannot remove '/var/lib/kubelet/pods/f891db50-3a58-4ba2-a1e7-cb7c90a42741/volumes/kubernetes.io~local-volume/insights-lookerdir-pv-autosuitea': Device or resource busy
rm: cannot remove '/var/lib/kubelet/pods/f891db50-3a58-4ba2-a1e7-cb7c90a42741/volumes/kubernetes.io~local-volume/insights-looker-datadir-pv-autosuitea': Device or resource busy
```
The content of these PVs was fully deleted.
Simply adding an error check to the whole operation would fix this. I would rather not see this happen in the wild again like we saw with the etcd data getting deleted.
I'll reopen this for follow-up.
If you can provide any steps for our QA team to reproduce the issue that would be appreciated. What are you using that fails to unmount, but does allow files to be removed?
@brandond I am not sure what originally caused the issue with etcd. My guess is that they have some sort of security scanning going on.
I think you could simulate this by making a host mount in a container, navigating to that directory under /var/lib/pod, and then running rke2-killall.sh. Because the working directory of the shell is one of the folders being unmounted, the umount command fails. If that does not work, then you would need to install a CSI plugin like topolvm.
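A rough sketch of that simulation on a plain Linux host; the tmpfs and paths here are illustrative assumptions, not the reporter's setup:

```sh
# Terminal 1: create a mount and keep a shell's working directory inside it.
sudo mkdir -p /mnt/busy-demo
sudo mount -t tmpfs tmpfs /mnt/busy-demo
echo "important data" | sudo tee /mnt/busy-demo/data.txt
cd /mnt/busy-demo            # this shell now keeps the mount busy

# Terminal 2: mimic what the cleanup does.
sudo umount /mnt/busy-demo   # fails: umount: /mnt/busy-demo: target is busy.
sudo rm -rf --one-file-system /mnt/busy-demo
# rm starts at the still-mounted tmpfs and deletes data.txt, even though
# the mount point itself cannot be removed ("Device or resource busy").
```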
We have not seen one of the PVs get wiped in a customer environment. But as part of our RCA we ask questions like "Can this happen again?" and "What could have prevented this?"
Since error checking was not added to the script, it was our conclusion that yes, this could happen again, and we probably cannot predict all the edge cases where it could happen. But losing data in this manner is unacceptable and preventable. Just adding `set -e` at the start would prevent this, and as far as I am aware adding an error check is common best practice.
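For reference, a minimal sketch of the strict-mode idiom being suggested; the commands shown are a simplified stand-in for the script, not a patch against it:

```sh
#!/bin/sh
# Exit immediately if any command fails, so a failed umount stops the
# script before it reaches the destructive rm that follows.
set -e

umount /var/lib/rancher/rke2/server/db
rm -rf --one-file-system /var/lib/rancher/rke2/server/db
```

Note that `set -e` only reacts to the exit status of the final command in a pipeline, so per-mount checks such as `umount "$m" && rm -rf --one-file-system "$m"` are generally the more robust variant.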
@mynktl are you capable of testing this script change in your unique environment? My preliminary tests aren't yielding the same issues that you encountered and I'm wondering if it's related to the CSI you mentioned deploying.
Environmental Info:
RKE2 Version:
rke2 version v1.30.1+rke2r1 (e7f87c6dd56fdd76a7dab58900aeea8946b2c008) go version go1.22.2 X:boringcrypto
Node(s) CPU architecture, OS, and Version:
Linux server0 4.18.0-372.41.1.el8_6.x86_64 #1 SMP Thu Jan 5 13:56:06 EST 2023 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
1 server
Describe the bug:
For RKE2, we are using two mount points, one for the etcd database and the other for rke2. The structure of the /var/lib/rancher directory is as below:

and the mount configuration is as below:

To mount /var/lib/rancher/rke2/server/db automatically, we have added a dependency of this db mount to rke2-server.service. So whenever `systemctl start rke2-server` is executed, it will perform the db mount and start the rke2 server.
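Such a dependency could look roughly like the following systemd drop-in; this is an assumed illustration based on the description above, not the reporter's actual unit file:

```ini
# /etc/systemd/system/rke2-server.service.d/db-mount.conf (hypothetical)
[Unit]
# Pull in and order after the mount unit(s) needed for this path,
# so the db filesystem is mounted before rke2-server starts.
RequiresMountsFor=/var/lib/rancher/rke2/server/db
```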
Issue: When we execute the rke2-killall.sh script, it unmounts /var/lib/rancher/rke2/server/db and deletes this directory. As we don't have a mount point specific to /var/lib/rancher/rke2, `do_unmount_and_remove` performs the action on /var/lib/rancher/rke2/server/db because of the condition `grep "^$1"` in the `do_unmount_and_remove` function. After this execution, starting rke2 via `systemctl` fails with the below error:
This behaviour is with selinux enabled.
As we have a separate mount point for the etcd db, we have one more risk of the etcd data getting deleted. In the rke2-killall.sh script, `umount` will always go through as this is a local mount point, but `set -e` is missing. If `umount` fails then `rm -rf --one-file-system ${MOUNTS}` will delete the content of the directory, which is unexpected.

Steps To Reproduce:
Expected behavior:
Actual behavior:
Additional context / logs: