bandak2 opened this issue 1 year ago
Can someone please take a look at this?
Our CSI driver uses the kubelet path /var/lib/kubelet.
The upgrade was done by changing kubernetes_version in cluster.yaml and then running `rke up --config ./cluster.yaml`.
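For completeness, the upgrade step looks roughly like this (the exact version string below is illustrative, not the one from our cluster):

```sh
# Rough sketch of the upgrade step described above (version string is illustrative):
# 1. Edit cluster.yaml and bump the kubernetes_version field, e.g.
#      kubernetes_version: "v1.24.9-rancher1-1"
# 2. Re-run RKE against the same cluster file:
rke up --config ./cluster.yaml
```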
If any additional info is required, please let us know.
Hi, can someone please help with the issue?
This is in relation to our Dell CSI storage driver: an RKE upgrade leads to kubelet spamming mount errors for StatefulSets. Our CSI driver is in sync with the CSI spec, and we would like to understand what, during the upgrade process, is making mount calls for already-existing mounts. This results in FailedMount errors spamming the kubelet logs. If you need additional info, please let us know.
Hi @bandak2, I'm working on the internal ticket for this. Could you please attach information on the csi version and any configuration details that might help us reproduce?
Thank you, TJ
Hi TJ,
Our Dell csi-unity storage driver implements CSI spec 1.5. The behavior can be reproduced by upgrading the RKE k8s version from 1.22.17 to 1.24.9 with our storage driver installed and StatefulSets running with dynamically provisioned PVCs. We were running RKE v1.3.18 for these upgrades.
Error: MountVolume.MountDevice failed for volume "csivol-d8ed8f41a7" (UniqueName: "kubernetes.io/csi/csi-unity.dellemc.com^csivol-d8ed8f41a7-iSCSI-storagearray-sv_268814") pod "web-0" (UID: "98a7514b-414b-4ddf-ab23-572e606c7244") : rpc error: code = Internal desc = runid=34 device already in use and mounted elsewhere. Cannot do private mount
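The same failure also shows up as pod events on the cluster, e.g. (a quick sketch; the names are taken from the error above):

```sh
# Hedged sketch: surface the FailedMount events and the attachment for the affected volume.
kubectl describe pod web-0 | grep -A 3 FailedMount
kubectl get volumeattachment | grep csivol-d8ed8f41a7
```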
Sample StatefulSet used:
`--- apiVersion: v1 kind: Service metadata: name: nginx labels: app: nginx spec: ports:
apiVersion: apps/v1 kind: StatefulSet metadata: name: web labels: app: nginx spec: serviceName: "nginx" selector: matchLabels: app: nginx replicas: 2 minReadySeconds: 10 template: metadata: labels: app: nginx spec: containers:
Hi TJ, do you have an update on this, or do you need anything from us?
Hi Keerthi, my apologies for the lapse in updates. Our engineer is still reviewing the issue and working to reproduce it. To give you an idea of what we're doing: he's checking whether this can be reproduced in environments with different situations/versions (such as environments upgraded 1.22->1.24, environments deployed directly on 1.24.9, and even a newer version of 1.24 which we are preparing for release). Hopefully this will turn up some information on what might be different on our end, or whether there's a common denominator to the failure states.
I'll see if there is any information we can provide him to speed up this part of the process. If anything comes up that you may think is relevant, please don't hesitate to let us know.
I've been reviewing things, and it may speed up our reproduction attempts if we double-check the formatting of the StatefulSet you provided, so that no misunderstanding on our part propagates through our tests. Please review my edits below for accuracy:
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
- ports:
port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
labels:
app: nginx
spec:
serviceName: "nginx"
selector:
matchLabels:
app: nginx
replicas: 2
minReadySeconds: 10
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: registry.k8s.io/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 10Gi
storageClassName: unity-iscsi
(Took me a sec to figure out, but it looks like the code block syntax for GitHub is three backticks (```).)
Thanks for the response @Tejeev. We use volumeClaimTemplates in the StatefulSet spec for dynamic PVC generation for the StatefulSet pods. This generates a PVC via our csi-unity driver, which is then used for the volume mount. I think either way would work for the test; there's also a quick check sketched after the snippet below.
```yaml
volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
      storageClassName: unity-iscsi
```
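A quick way we confirm the dynamic provisioning worked (a sketch; the csivol- prefix is the one from the mount error earlier in this thread):

```sh
# Hedged sketch: PVCs created from the claim template are named
# <claimTemplateName>-<statefulSetName>-<ordinal>, e.g. www-web-0.
kubectl get pvc www-web-0 www-web-1
kubectl get pv | grep csivol-
```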
Is this correct?
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
- ports:
port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
labels:
app: nginx
spec:
serviceName: "nginx"
selector:
matchLabels:
app: nginx
replicas: 2
minReadySeconds: 10
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: registry.k8s.io/nginx-slim:0.8
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 10Gi
storageClassName: unity-iscsi
Seems like the indentation is a little off for the claim template. You can use the following, which works for me (there's also a quick apply/verify sketch after the manifest):
```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
  labels:
    app: nginx
spec:
  serviceName: "nginx"
  selector:
    matchLabels:
      app: nginx
  replicas: 2
  minReadySeconds: 10
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.8
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
        storageClassName: unity-iscsi
```
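For reference, this is how we exercise it on our side (a quick sketch; the file name is arbitrary):

```sh
# Hedged sketch: apply the manifest above and wait for both replicas and their PVCs.
kubectl apply -f web-statefulset.yaml
kubectl rollout status statefulset/web
kubectl get pvc | grep www-web   # expect www-web-0 and www-web-1 Bound via unity-iscsi
```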
Thanks!
I'm told we're having some difficulty securing access to the specific server hardware (and, therefore also the CSI driver) that is being used here. Can you please help us investigate by trying the following:
The key here is to try the above steps on the same hardware with the same CSI drivers, as I think trying it on different hardware with different drivers wouldn't produce useful results for this particular issue.
Apologies for the delayed response. It was a long weekend. Response to the points you raised:
We'll get back to you on point 1.
We tried the upgrade scenario with RKE v1.3.20 from 1.22 through 1.24, and we hit the issue once we landed on 1.24. It seems the staging path used for NodeStageVolume is different on 1.24 compared to k8s 1.22 and 1.23, tested with the same version of our driver. Here are the findings:
RKE v1.3.20 with k8s v1.22.17-rancher1-2, driver v2.4.0. The NODE_STAGE_VOLUME path at this stage:
StagingTargetPath=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/csivol-50d0f5ffdd/globalmount
After upgrading to v1.23.16-rancher2-2, the NODE_STAGE_VOLUME path remained the same:
StagingTargetPath:/var/lib/kubelet/plugins/kubernetes.io/csi/pv/csivol-50d0f5ffdd/globalmount
...
...
time="2023-05-17T13:17:22Z" level=debug arrayid=apm00213404195 runid=25 msg="volume already published to target" func="github.com/dell/csi-unity/service.publishVolume()" file="/go/src/csi-unity/service/mount.go:364"
After upgrading from v1.23.16-rancher2-2 to v1.24.13-rancher2-1, the path changed:
StagingTargetPath=/var/lib/kubelet/plugins/kubernetes.io/csi/csi-unity.dellemc.com/faf27cb97b007ad80dd62a9ffe53c84c03f9ded373e527a41654ae32a31c14c7/globalmount
time="2023-05-17T13:35:41Z" level=info msg="/csi.v1.Node/NodeStageVolume: REP 0073: rpc error: code = Internal desc = runid=73 device already in use and mounted elsewhere. Cannot do private mount"
We are not sure why the staging path changes once we hit k8s 1.24.
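One observation: the 64-character hex segment in the new path looks like a SHA-256 digest, presumably of the CSI volume handle. A quick way to test that assumption on a node (the handle below is the one from the earlier error message, used purely as an example):

```sh
# Hedged sketch: compare a SHA-256 of the volume handle against the directory name
# kubelet uses under the per-driver staging path on 1.24 (an assumption, not confirmed).
VOLUME_HANDLE='csivol-d8ed8f41a7-iSCSI-storagearray-sv_268814'
printf '%s' "$VOLUME_HANDLE" | sha256sum
ls /var/lib/kubelet/plugins/kubernetes.io/csi/csi-unity.dellemc.com/
```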
Hi @bandak2,
I've been looking into this issue. As @Tejeev mentioned, it's difficult for us to reproduce because we don't have the Dell hardware required. From going through the code, I understand that this error is happening when NodeStageVolume is called on your CSI (unity) driver. I believe it's kubelet that is making this call and passing an unexpected value as the StagingTargetPath. Would you be able to check the kubelet logs to see what it's doing differently in 1.24 vs 1.22/1.23?
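Something along these lines is what I have in mind for pulling the relevant kubelet lines (a sketch; it assumes kubelet runs as a Docker container named kubelet, as it usually does on RKE1 nodes):

```sh
# Hedged sketch: filter kubelet output for staging/mount activity on an RKE1 node.
docker logs kubelet 2>&1 | grep -E 'MountVolume.MountDevice|NodeStageVolume|globalmount'
# On a non-RKE (e.g. kubeadm) node the equivalent would typically be:
# journalctl -u kubelet | grep -E 'MountVolume.MountDevice|NodeStageVolume|globalmount'
```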
We're trying this on vanilla k8s and will update our findings from our end.
@bandak2 , just curious if there is any update on it?
Hi @snasovich, thanks for checking. Yes, we've tried this on vanilla k8s (one deployed using kubeadm), upgrading 1.22 -> 1.23 -> 1.24 with our driver installed, and the issue is not reproducible there. The orchestrator correctly calls NodeUnstageVolume after the upgrade to 1.24 and then remounts the volume at the new, updated path via NodeStageVolume calls. I'll share the logs later today after sanitizing them. Just a note: during a kubeadm-style k8s upgrade we cordon and drain the nodes; we follow the procedure in the Kubernetes guide to upgrade one version at a time: https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
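The per-node sequence we follow is essentially the one from that guide (a sketch; node names and package steps are placeholders):

```sh
# Hedged sketch of the per-node kubeadm upgrade flow described above (placeholders in <>):
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
# on the node: upgrade the kubeadm/kubelet/kubectl packages to the next minor version,
# run `kubeadm upgrade node`, then restart kubelet
kubectl uncordon <node-name>
```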
There are too many logs to sanitize given that we are including the driver logs. We're reaching out to your team via a different channel where we can share those securely, without having to go through sanitizing all of these.
The logs for this have been provided through backend channels. Once you have some insights please let us know.
Hi @bandak2. Thanks for the logs. I've spent some time digging through them, and the only meaningful thing I've found so far is (as you mentioned before) the new mount calls to a different path.
Before v1.24, all the mounts use the mount point /var/lib/kubelet/plugins/kubernetes.io/csi/pv/krv11506-d05ec878ad/globalmount. E.g.:
mount -t ext4 -o defaults /dev/dm-0 /var/lib/kubelet/plugins/kubernetes.io/csi/pv/krv11506-d05ec878ad/globalmount
After v1.24, I see that there are still mount calls that use the same mount point, but there are also new ones that use /var/lib/kubelet/plugins/kubernetes.io/csi/csi-unity.dellemc.com/4bb10b984d5ca1bf16a8c79733866361f7061d665cba703eef3307a3426d3b21/globalmount. E.g.:
mount -t ext4 -o defaults /dev/dm-1 /var/lib/kubelet/plugins/kubernetes.io/csi/csi-unity.dellemc.com/4bb10b984d5ca1bf16a8c79733866361f7061d665cba703eef3307a3426d3b21/globalmount
I'm not sure if this is a red herring or not, though, because these mounts all seem to succeed eventually. I did, however, notice that multipathd seems to be running on your nodes, and I'm wondering if you might be experiencing an issue similar to this one. Is this something you'd be able to check?
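In case it helps, this is the kind of check I mean (a sketch; the device names are the ones from the mount commands above):

```sh
# Hedged sketch: see which dm-* device backs each staging path and what multipathd reports.
findmnt -o TARGET,SOURCE,FSTYPE | grep globalmount
multipath -ll                      # lists the multipath maps multipathd has claimed
lsblk /dev/dm-0 /dev/dm-1          # dm-0 / dm-1 as seen in the mount calls above
```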
Thanks @bfbachmann, I'll check the linked issue to see if anything there stands out from our end. As for multipath, our multipath config lives at /etc/multipath.conf on the node and has the following contents for Kubernetes or RKE:
defaults {
user_friendly_names yes
find_multipaths yes
}
While we're at it: as you can see in the logs, during the kubeadm k8s upgrade to 1.24 the older path was unpublished and then re-published with the new path; I believe the CO (container orchestrator) does that. These actions don't seem to happen while upgrading RKE, and the older path stays as the StagingPath. I'll check whether there are any hints on the multipath side.
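To double-check which staging directories are actually left mounted on an RKE node after the upgrade, I'll run something like this (a sketch):

```sh
# Hedged sketch: both path styles appear in the logs, so list what exists and what is mounted.
ls /var/lib/kubelet/plugins/kubernetes.io/csi/pv/ 2>/dev/null
ls /var/lib/kubelet/plugins/kubernetes.io/csi/csi-unity.dellemc.com/ 2>/dev/null
mount | grep globalmount
```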
Hi @bandak2, did you find anything interesting on the multipath front?
RKE version: v1.3.18
Docker version: 20.10.17-ce
Operating system and kernel: SUSE Linux Enterprise Server 15 SP4, kernel 5.14.21-150400.22-default
Type/provider of hosts: VMware VM
cluster.yml file:
Steps to Reproduce:
Results:
SURE-6124