Open connde opened 4 years ago
I just tested this on DOKS by directly deleting the droplet hosting a PVC-using pod (managed by a StatefulSet). After the node removal was detected (by our cloud-controller-manager component), the Node object was removed and the workload transferred to a different node, along with the PVC.
To clarify: did you delete the droplet or just the Node object in the cluster?
Hi @timoreimann, I'm NOT using DOKS; I'm using RancherOS and deploying the nodes to droplets.
I deleted the node from the Rancher UI. It was deleted and recreated as expected, but the PVC did not get attached.
@connde thanks. Understood you're not on DOKS -- the behavior should be identical though: as soon as the control plane detects that a node is gone, the workload should be moved elsewhere, including volumes.
To troubleshoot this further, we'll need the logs from your Controller and Node services. Could you share those?
@timoreimann I have a cluster dump. Is that enough?
That's perfect, thank you @connde. I'll need a bit of time to work through it, will report back once I'm done.
It's ok, no problem if it takes some time.
I don't expect that the node will restart, I was just testing what would happen if a node failed.
The issue also occurs when updating the cluster. It's really frustrating that it takes a very long time for Kubernetes to notice that a volume is no longer attached.
Has anything changed on this issue?
@dlebee do you experience the issue when you upgrade using DOKS or a self-hosted Kubernetes?
DOKS. I have multiple clusters and it always occurs: the PVC stays attached to an old node, and on each k8s upgrade I have to manually unmount the volume and wait quite some time, which takes the systems down.
@dlebee that's definitely not expected. What kind of workload do you use to reference the PVCs? Is it StatefulSets?
Regular Deployments bear the risk of getting to a situation where two replicas are trying to come up, which cannot work when volumes are associated. Just double-checking this isn't the case for you here.
Yes, they are StatefulSets such as MongoDB/MariaDB, created by Helm charts.
@dlebee got it. Is it a particular set of DOKS/Kubernetes versions where you saw this happening, or across the board? How old was the oldest version?
Not really. I've had this issue when migrating; via the web you can only go up one version at a time, and it happened every single time I upgraded a version.
I can give you the details of the cluster. Maybe upgrading Kubernetes does not update the CSI driver?
davidlebee@Davids-MacBook-Pro ~ % kubectl get CSIDriver -o yaml
apiVersion: v1
items:
- apiVersion: storage.k8s.io/v1
  kind: CSIDriver
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"storage.k8s.io/v1beta1","kind":"CSIDriver","metadata":{"annotations":{},"name":"dobs.csi.digitalocean.com"},"spec":{"attachRequired":true,"podInfoOnMount":true}}
    creationTimestamp: "2021-02-19T16:26:43Z"
    name: dobs.csi.digitalocean.com
    resourceVersion: "316"
    uid: 9754dccf-b1df-4986-ad6c-a63c228261f8
  spec:
    attachRequired: true
    fsGroupPolicy: ReadWriteOnceWithFSType
    podInfoOnMount: true
    volumeLifecycleModes:
    - Persistent
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
@dlebee the CSI components should definitely get upgraded as well. It might be more of an issue of how the upgrade proceeds in your case.
I'll run some extra tests. Appreciate any additional details you may be able to provide (either here, via mail, or on the Kubernetes Slack).
I have to update a cluster soon; it is currently running 16.16.6-do-2. I'll let you know how it goes.
If you have any questions, I'll be following the thread.
I upgraded a cluster today and did not have the same issue. Is the CSI driver updated automatically on upgrades, or is it a manual operation that needs to be done if the cluster is older?
@dlebee all components are always upgraded automatically, including the CSI driver. You don't have to upgrade or install any of the managed components yourself.
What's worth pointing out is that older CSI driver and Kubernetes versions still contained certain bugs that got addressed in more recent versions. Chances are you are now past the point where those affect you.
I'll upgrade that cluster specifically and let you know as soon as I can whether the issue is still present.
@timoreimann So the older cluster had pods stuck in Terminating for a long time without moving.
I think the reason is that the cluster is actually weaker in resources, so it takes longer. I got impatient and terminated the pods with --force, which probably did not alert the CSI driver that the PVC is no longer bound.
Is there a way to make k8s release a PVC faster when a pod is killed by force?
Also, another PVC that I did not force close was not reattached during the upgrade. It only attached once I released the volume on the website.
Thank you, David.
@dlebee if I had to guess, I wouldn't think that it's a resource problem: bringing pods down should happen fairly quickly. How long was "long time" for you?
If anything, --force should speed up the detachment process: the CSI driver cannot detach volumes if pods using it are still up (including the terminating state). By removing the pod, the CSI-/volume-related controllers should notice that the volume user has gone away and move forward with detaching.
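One way to watch whether detachment moves forward after a pod is removed is to list the VolumeAttachment objects that still reference the old node. The following is a minimal sketch, assuming `kubectl` access to the cluster. `attachments_on_node` is a hypothetical helper name; the columns use the `spec.nodeName`, `spec.source.persistentVolumeName`, and `status.attached` fields of the `storage.k8s.io/v1` VolumeAttachment API.

```shell
#!/bin/sh
# Hypothetical helper: print the VolumeAttachments that still point at a node.
# Output columns: attachment name, PV name, node name, attached flag.
attachments_on_node() {
  node="$1"
  kubectl get volumeattachment --no-headers \
    -o custom-columns='NAME:.metadata.name,PV:.spec.source.persistentVolumeName,NODE:.spec.nodeName,ATTACHED:.status.attached' \
    | awk -v n="$node" '$3 == n'   # keep only rows whose NODE column matches
}
```

Calling `attachments_on_node <old node name>` repeatedly should eventually print nothing once the volumes have been detached from that node; rows that persist for a long time are a hint that detachment is stuck.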
What would be ideal to have if this happens again is all the events that occurred (kubectl -n <involved namespace> get events), the current node state (kubectl get nodes -o yaml), the involved PVCs / PVs (kubectl -n <involved namespace> get pvc -o yaml / kubectl get pv -o yaml), and the current volume attachments (kubectl get volumeattachment -o yaml); all at the time the pods are stuck.
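The commands requested above can be wrapped into one helper so that everything is captured at the same moment while the pods are stuck. This is only a sketch: the function name `collect_csi_diagnostics`, the default output directory, and the file names are hypothetical, and the `kubectl` invocations are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: dump events, nodes, PVCs, PVs and VolumeAttachments.
# $1 = involved namespace, $2 = output directory (optional, assumed default).
collect_csi_diagnostics() {
  ns="$1"
  out="${2:-csi-diagnostics}"
  mkdir -p "$out" || return 1

  # Namespaced resources: events and PVCs of the affected workload.
  kubectl -n "$ns" get events > "$out/events.txt"
  kubectl -n "$ns" get pvc -o yaml > "$out/pvc.yaml"

  # Cluster-scoped resources: nodes, PVs and volume attachments.
  kubectl get nodes -o yaml > "$out/nodes.yaml"
  kubectl get pv -o yaml > "$out/pv.yaml"
  kubectl get volumeattachment -o yaml > "$out/volumeattachments.yaml"
}
```

For example, `collect_csi_diagnostics <involved namespace> ./dump` would write five files into `./dump`, which can then be shared in a private gist or by mail.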
What did you do? (required. The issue will be closed when not provided.)
Deleted a random node in my Rancher cluster to see how Percona Xtradb cluster behaved
What did you expect to happen?
Node to be recreated and Percona node attached to existing PVC
Configuration (MUST fill this out):
Please provide the following logs:
This will output everything from your cluster. Please use a private gist via https://gist.github.com/ to share this dump with us.
Not able to create a gist; the site is generating an error, but I'm happy to send the dump by email if needed.
Please provide the total set of manifests that are needed to reproduce the issue. Just providing the pvc is not helpful. If you cannot provide it due to privacy concerns, please try creating a reproducible case.
CSI Version: https://github.com/digitalocean/csi-digitalocean/tree/master/deploy/kubernetes/releases/csi-digitalocean-latest
Kubernetes Version: 1.18.3
Cloud provider/framework version, if applicable (such as Rancher): RancherOS 2.4.5 -> DigitalOcean -> 3 nodes Not using DOKS.
Normal Scheduled 76s default-scheduler Successfully assigned my-percona-xtradb-cluster-operator/cluster-01-pxc-2 to worker-pool2
Normal SuccessfulAttachVolume 76s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-e801f45f-3ac1-4d5e-8ce5-dc2a79191992"
Warning FailedMount 28s (x7 over 60s) kubelet, worker-pool2 MountVolume.MountDevice failed for volume "pvc-e801f45f-3ac1-4d5e-8ce5-dc2a79191992" : rpc error: code = Internal desc = formatting disk failed: exit status 1 cmd: 'mkfs.ext4 -F /dev/disk/by-id/scsi-0DO_Volume_pvc-e801f45f-3ac1-4d5e-8ce5-dc2a79191992' output: "mke2fs 1.45.5 (07-Jan-2020)\nThe file /dev/disk/by-id/scsi-0DO_Volume_pvc-e801f45f-3ac1-4d5e-8ce5-dc2a79191992 does not exist and no size was specified.\n"
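The FailedMount event reports that the by-id device path does not exist on the node, which may mean the volume is not actually attached to the droplet even though the attach was reported as succeeded. A minimal sketch for checking this directly on the affected node, assuming shell access to it: both function names are hypothetical, and the path pattern is simply copied from the event above.

```shell
#!/bin/sh
# Hypothetical helper: build the by-id device path for a volume name,
# following the pattern seen in the FailedMount event.
do_volume_device_path() {
  printf '/dev/disk/by-id/scsi-0DO_Volume_%s\n' "$1"
}

# Hypothetical helper: report whether that device exists on this node.
# Prints "present: <path>" or "missing: <path>"; returns non-zero if missing.
check_do_volume_device() {
  dev="$(do_volume_device_path "$1")"
  if [ -e "$dev" ]; then
    echo "present: $dev"
  else
    echo "missing: $dev"
    return 1
  fi
}
```

Running `check_do_volume_device pvc-e801f45f-3ac1-4d5e-8ce5-dc2a79191992` on worker-pool2 would show whether the block device is visible to the node at the time the mount fails.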
Hi, to reproduce, create a Percona operator, then a CR with 3 nodes. After the cluster is running, delete a node manually and wait for it to be recreated. The volume will not bind correctly.
If I manually attach the volume in the DO dashboard and terminate the pod, the new pod gets created correctly.
Any help is appreciated.