rancher / rke2

https://docs.rke2.io/
Apache License 2.0

Errors when uninstalling rke2 from windows agent #5778

Closed by manuelbuil 1 month ago

manuelbuil commented 3 months ago

Environmental Info:

RKE2 Version:

Node(s) CPU architecture, OS, and Version:

Cluster Configuration:

Mixed cluster with Windows and Linux nodes

Describe the bug:

Running .\rke2-uninstall.ps1 sometimes produces the following error:

time="2024-04-15T14:49:31Z" level=error msg="unable to delete k8s.io" error="namespace \"k8s.io\" must be empty, but it still has blobs, snapshots on \"windows\" snapshotter: failed precondition"
ctr: unable to delete k8s.io: namespace "k8s.io" must be empty, but it still has blobs, snapshots on "windows" snapshotter: failed precondition

After that, the rke2 uninstall usually fails while deleting the /var directory, because some containerd files were not properly removed. Those files cannot be deleted afterwards either, because Windows reports them as locked.
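The error text suggests the k8s.io namespace was removed while blobs and snapshots still existed. One generic way to avoid that kind of race is to poll for an "empty" condition up to a deadline before deleting. The sketch below is only an illustration: Wait-Until is a hypothetical helper name, the 180-second default mirrors the timeout printed in the validated run further down, and the stand-in condition replaces whatever real check the script would use.

```powershell
# Hypothetical helper (not part of rke2-uninstall.ps1): retry a check until it
# returns $true or the timeout expires.
function Wait-Until {
    param(
        [Parameter(Mandatory)] [scriptblock] $Condition,
        [int] $TimeoutSeconds = 180,
        [int] $IntervalMilliseconds = 1000
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        # Run the caller-supplied check; stop as soon as it passes.
        if (& $Condition) { return $true }
        Start-Sleep -Milliseconds $IntervalMilliseconds
    } while ((Get-Date) -lt $deadline)
    return $false
}

# Stand-in condition for demonstration: passes on the third attempt.
# A real check would instead inspect containerd (assumption, not shown here).
$script:attempts = 0
$ok = Wait-Until -TimeoutSeconds 5 -IntervalMilliseconds 10 -Condition {
    $script:attempts++
    $script:attempts -ge 3
}
if (-not $ok) { Write-Warning 'Resources were not released before the timeout' }
```

Returning a boolean instead of throwing lets the caller decide whether a timeout should abort the uninstall or only be logged.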

Steps To Reproduce:

Expected behavior:

rke2-uninstall.ps1 removes everything installed by rke2

Actual behavior:

rke2-uninstall.ps1 sometimes fails to remove everything installed by rke2

Additional context / logs:

mdrahman-suse commented 1 month ago

Validated on master branch with commit 9eae9199

Environment

1 Linux server, 1 Linux agent, and 1 Windows agent (Windows Server 2019)

Testing

Replication

PS C:\Users\Administrator> C:\usr\local\bin\rke2.exe -v
rke2.exe version v1.30.1+rke2r1 (e7f87c6dd56fdd76a7dab58900aeea8946b2c008)
go version go1.22.2
PS C:\Users\Administrator> c:/usr/local/bin/rke2-uninstall.ps1
Beginning the uninstall process
docker.io/mbuilsuse/pstools:v0.2.0
docker.io/mbuilsuse/pstools@sha256:2f7abee3e5ecc3c672cb1054ac053dab4e149ad2e51ab3143c43106c9c6fe335
docker.io/phillipsj/pstools:v0.2.0
docker.io/phillipsj/pstools@sha256:2f7abee3e5ecc3c672cb1054ac053dab4e149ad2e51ab3143c43106c9c6fe335
docker.io/rancher/mirrored-pause:3.6
docker.io/rancher/mirrored-pause@sha256:74c4244427b7312c5b901fe0f67cbc53683d06f4f24c6faee65d4182bf0fa893
sha256:2a67292b6e8ba4e1c56ccccb591a5c75ed50daa015d16c13032110e718e1e520
sha256:9aa6faee59d33943b3a48966970f7e21382c511f705a714a7465a1bfa7f8d57f
time="2024-05-30T20:50:48Z" level=error msg="unable to delete k8s.io" error="namespace \"k8s.io\" must be empty, but it still has blobs, snapshots on \"windows\" snapshotter: failed precondition"
ctr: unable to delete k8s.io: namespace "k8s.io" must be empty, but it still has blobs, snapshots on "windows" snapshotter: failed precondition
INFO: Checking if rke2 process exists
INFO: rke2 process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: rke2(4112)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if kube-proxy process exists
INFO: kube-proxy process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: kube-proxy(6080)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if kubelet process exists
INFO: kubelet process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: kubelet(5272)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if containerd process exists
INFO: containerd process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: containerd(4664)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if wins process exists
INFO: Checking if calico-node process exists
INFO: calico-node process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: calico-node(5920)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if flanneld process exists
INFO: Checking if rke2 service exists
INFO: rke2 service found, stopping now
INFO: rke2 service has stopped. Removing the rke2 service ...
[SC] DeleteService SUCCESS
INFO: Checking if wins service exists
INFO: Cleaning c:/usr...
INFO: Cleaning c:/etc...
INFO: Cleaning c:/run...
INFO: c:/run is empty, moving on
INFO: Cleaning c:/var...
ForEach-Object : Exception calling "Delete" with "0" argument(s): "Access to the path 'Users' is denied."
At C:\usr\local\bin\rke2-uninstall.ps1:183 char:78
+ ... ir -Recurse -Attributes ReparsePoint | ForEach-Object { $_.Delete() }
+                                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [ForEach-Object], MethodInvocationException
    + FullyQualifiedErrorId : IOException,Microsoft.PowerShell.Commands.ForEachObjectCommand
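The transcript ends at a failing Delete() call on a reparse point. As a rough sketch of how such a loop could keep going and report what it could not remove, the example below wraps each deletion in try/catch. Remove-ReparsePoints is a hypothetical name, and this is not the code at line 183 of rke2-uninstall.ps1, which is truncated in the trace and not reconstructed here.

```powershell
# Hypothetical helper: delete reparse points under a directory, collecting
# failures instead of stopping at the first one.
function Remove-ReparsePoints {
    param([Parameter(Mandatory)] [string] $Path)

    $failed = @()
    # -Attributes ReparsePoint restricts the listing to items such as
    # junctions and symbolic links.
    $links = Get-ChildItem -LiteralPath $Path -Recurse -Force `
        -Attributes ReparsePoint -ErrorAction SilentlyContinue
    foreach ($link in $links) {
        try {
            $link.Delete()
        }
        catch {
            # Inside catch, $_ is the error record, so use $link for the path.
            $failed += $link.FullName
            Write-Warning "Could not delete $($link.FullName): $($_.Exception.Message)"
        }
    }
    # The leading comma keeps an empty or single-item result as an array.
    return ,$failed
}

# Example call (commented out because it deletes links under the given path):
# $leftovers = Remove-ReparsePoints -Path 'C:\var'
```

Collecting the failed paths makes it possible to retry them later, for example after the processes holding them have exited.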

Validation

PS C:\Users\Administrator> C:\usr\local\bin\rke2.exe -v
rke2.exe version v1.30.1+dev.9eae9199 (9eae91996cfde1693d4048d7767a913734537d4d)
go version go1.22.2
PS C:\Users\Administrator> C:\usr\local\bin\rke2-uninstall.ps1
Beginning the uninstall process

    Directory: C:\var\lib\rancher\rke2\bin

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----         6/6/2024   3:09 PM              0 rke2-uninstall.lock

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: kubelet(3088)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
docker.io/mbuilsuse/pstools:v0.2.0
docker.io/mbuilsuse/pstools@sha256:2f7abee3e5ecc3c672cb1054ac053dab4e149ad2e51ab3143c43106c9c6fe335
docker.io/phillipsj/pstools:v0.2.0
docker.io/phillipsj/pstools@sha256:2f7abee3e5ecc3c672cb1054ac053dab4e149ad2e51ab3143c43106c9c6fe335
docker.io/rancher/mirrored-pause:3.6
docker.io/rancher/mirrored-pause@sha256:74c4244427b7312c5b901fe0f67cbc53683d06f4f24c6faee65d4182bf0fa893
docker.io/rancher/rke2-runtime:v1.30.1-dev.9eae9199-windows-amd64
sha256:9af01c3cdd366f3d0c604dad4a53c0761bf3755ee46dc11647d50030508582bf
sha256:df474b07075ac3e5ef2bce8bd4d7fdc1d9329422906b44054f88681357caeaca
sha256:f7022dded3264c0a3954250c8214fe3f37c4ea52cb6c11f15b51d1112dcc6099
Tasks, containers, images and snapshots are being deleted. This may take a while (timeout 180s)
k8s.io
All containerd resources have been deleted
INFO: Checking if rke2 process exists
INFO: rke2 process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: rke2(1896)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if kube-proxy process exists
INFO: kube-proxy process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: kube-proxy(3700)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if kubelet process exists
INFO: Checking if containerd process exists
INFO: containerd process found, stopping now

Confirm
Are you sure you want to perform the Stop-Process operation on the following item: containerd(3740)?
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"):
INFO: Checking if wins process exists
INFO: Checking if calico-node process exists
INFO: Checking if flanneld process exists
INFO: Checking if rke2 service exists
INFO: rke2 service found, stopping now
INFO: rke2 service has stopped. Removing the rke2 service ...
[SC] DeleteService SUCCESS
INFO: Checking if wins service exists
INFO: Cleaning c:/usr...
INFO: Cleaning c:/etc...
INFO: Cleaning c:/run...
INFO: c:/run is empty, moving on
INFO: Cleaning c:/var...
INFO: Cleaning ...
INFO: Cleaning ...
INFO: Cleaning ...
INFO: Cleaning Temp Install Directory...
INFO: Cleaning RKE2 Environment Variables
INFO: Cleaning RKE2 Machine Environment Variables
INFO: Cleaning CATTLE_AGENT_BIN_PREFIX
INFO: HNS will be cleaned next, temporary network disruption may occur. HNS cleanup is the final step.
INFO: Cleaning up HnsNetwork nat ...
INFO: Cleaning up HnsNetwork Calico ...
INFO: Cleaning up HnsNetwork External ...
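
The Confirm prompts in both transcripts come from Stop-Process, which asks for confirmation before stopping a process that is not owned by the current user unless -Force is given. A minimal sketch of a non-interactive stop follows. Stop-IfRunning is a hypothetical helper name, and whether the uninstall script should suppress these prompts is not something this issue decides.

```powershell
# Hypothetical helper: stop all processes with the given name without
# prompting, and report whether anything was found.
function Stop-IfRunning {
    param([Parameter(Mandatory)] [string] $Name)

    # SilentlyContinue avoids an error when no such process exists.
    $procs = Get-Process -Name $Name -ErrorAction SilentlyContinue
    if (-not $procs) { return $false }

    # -Force skips the confirmation prompt seen in the transcripts.
    $procs | Stop-Process -Force
    return $true
}

# Example call (commented out because it terminates real processes):
# foreach ($name in 'rke2', 'kube-proxy', 'kubelet', 'containerd') {
#     Stop-IfRunning -Name $name | Out-Null
# }
```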