Closed: dannert closed this issue 5 years ago.
I have also seen this same behavior in a recent PoC with a customer using ICP 3.1.1 and the PowerVC FlexVolume driver.
I looked into this in more detail. I had earlier forgotten why we were not also running the udevadm commands when a volume is detached.
This is a limitation rooted in the FlexVolume driver design. The volume detachment process in the FlexVolume driver happens as follows:
Below are the relevant logs. On the worker:

2019-02-25T21:03:09.73-06:00 main : DEBUG : The args to main are unmount [unmount /var/lib/kubelet/pods/ed3cfe61-3972-11e9-b711-fadd2e279820/volumes/ibm~power-k8s-volume-flex/nginx-cinder-vol-1]
2019-02-25T21:03:09.73-06:00 unmount : INFO : unmount called with /var/lib/kubelet/pods/ed3cfe61-3972-11e9-b711-fadd2e279820/volumes/ibm~power-k8s-volume-flex/nginx-cinder-vol-1
2019-02-25T21:03:09.779-06:00 main : INFO : Returning response {"status":"Success","message":"Unmounted volume directory /var/lib/kubelet/pods/ed3cfe61-3972-11e9-b711-fadd2e279820/volumes/ibm~power-k8s-volume-flex/nginx-cinder-vol-1"}
2019-02-25T21:03:09.833-06:00 main : DEBUG : The args to main are unmountdevice [unmountdevice /var/lib/kubelet/plugins/kubernetes.io/flexvolume/ibm/power-k8s-volume-flex/mounts/nginx-cinder-vol-1]
2019-02-25T21:03:09.833-06:00 unmountDevice : INFO : unmountDevice called with /var/lib/kubelet/plugins/kubernetes.io/flexvolume/ibm/power-k8s-volume-flex/mounts/nginx-cinder-vol-1
2019-02-25T21:03:09.93-06:00 main : INFO : Returning response {"status":"Success","message":"Operation Success"}
After unmount() and unmountDevice() succeed, detach() is called on the controller:

2019-02-26T03:03:11.991Z main : DEBUG : The args to main are detach [detach nginx-cinder-vol-1 ..
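To make the sequence above concrete, here is a minimal sketch of how a FlexVolume driver binary dispatches these calls and returns the `{"status":...,"message":...}` JSON seen in the logs. The function names and messages mirror the log lines, but this is an illustration, not the actual FVD implementation; the real driver performs the unmount, device teardown, and cloud detach inside each case.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// response mirrors the {"status":...,"message":...} JSON in the driver logs.
type response struct {
	Status  string `json:"status"`
	Message string `json:"message"`
}

// handle is a stand-in for the driver's dispatch in main.go.
func handle(op, target string) response {
	switch op {
	case "unmount":
		return response{"Success", "Unmounted volume directory " + target}
	case "unmountdevice", "detach":
		return response{"Success", "Operation Success"}
	default:
		return response{"Not supported", "Unknown operation " + op}
	}
}

func main() {
	// The sequence observed in the logs: unmount and unmountdevice run on
	// the worker, then detach runs on the controller.
	calls := [][2]string{
		{"unmount", "/var/lib/kubelet/pods/<pod-uid>/volumes/ibm~power-k8s-volume-flex/nginx-cinder-vol-1"},
		{"unmountdevice", "/var/lib/kubelet/plugins/kubernetes.io/flexvolume/ibm/power-k8s-volume-flex/mounts/nginx-cinder-vol-1"},
		{"detach", "nginx-cinder-vol-1"},
	}
	for _, c := range calls {
		out, _ := json.Marshal(handle(c[0], c[1]))
		fmt.Println(string(out))
	}
}
```

The key point for this bug is the ordering: by the time detach() runs on the controller, the worker-side unmountdevice has already returned, so the worker never gets another callback in which it could run the udevadm/multipath cleanup.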
@gautpras Just thinking broadly for a moment - would it be possible to consider a periodic task of sorts that cleans these up vs. waiting for the volume event (which may never come in)?
We have added a new API in main.go that runs the udevadm commands periodically. setup-power-openstack-k8s-volume-flex.sh invokes this API every 24 hours to clean up after a LUN/persistent volume is deleted. It takes the same lock that the waitForAttach API uses before running the udevadm commands.
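The shape of that periodic task can be sketched as below. This is a dry-run illustration, not the FVD code: the lock name, the ticker interval, and the specific udevadm invocations are assumptions, and a real implementation would exec.Command each command instead of printing it.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// devLock stands in for the lock shared with waitForAttach, so the periodic
// cleanup never runs udevadm concurrently with an attach in progress.
var devLock sync.Mutex

// cleanupCommands returns the udevadm invocations the periodic task would
// run (illustrative; the real command list lives in the driver).
func cleanupCommands() []string {
	return []string{
		"udevadm trigger --subsystem-match=block",
		"udevadm settle",
	}
}

func runCleanup() {
	devLock.Lock()
	defer devLock.Unlock()
	for _, c := range cleanupCommands() {
		fmt.Println("would run:", c) // dry run; real code would exec this
	}
}

func main() {
	// setup-power-openstack-k8s-volume-flex.sh schedules this every 24h;
	// a short ticker here just demonstrates the loop shape.
	t := time.NewTicker(10 * time.Millisecond)
	defer t.Stop()
	for i := 0; i < 2; i++ {
		<-t.C
		runCleanup()
	}
}
```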
The above does not work: the FVD pod does not have a view of its host's OS device map, so the cleanup code is not able to remove the multipath mappings on the host OS.
We considered two approaches for cleaning up the multipath maps.
Based on the above, it does not seem possible to clean up the multipath maps with a periodic task. The best FVD can do is clean up the maps on the next attach-volume attempt.
We are still working on this approach, as it requires more functional testing, so tagging this defect for the next FVD release, 1.0.2.
The above commit cleans up the block devices during the unmountdevice call itself. Earlier, the devices were not cleaned up because the volume was still attached at that point. The fix clears the devices anyway, because the next call after unmountdevice is Kubernetes' detach, which will eventually detach the volume. In other words, this is an eager cleanup.
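The eager cleanup amounts to tearing down the multipath map and its SCSI path devices while the volume is still nominally attached. A minimal sketch of the command plan, assuming the device names are already resolved from the mount path (the `eagerCleanupPlan` helper and the dry-run form are mine, not the commit's code):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// eagerCleanupPlan returns the commands needed to tear down a multipath map
// and its SCSI path devices during unmountDevice, before Kubernetes calls
// detach. Device names are illustrative; the real driver resolves them.
func eagerCleanupPlan(mpath string, paths []string) []string {
	cmds := []string{
		// flush the multipath map (removes the /dev/mapper entry)
		"multipath -f " + mpath,
	}
	for _, p := range paths {
		// delete each SCSI path device so the kernel forgets the LUN paths
		cmds = append(cmds, "echo 1 > "+filepath.Join("/sys/block", p, "device/delete"))
	}
	// let udev finish processing the removals
	cmds = append(cmds, "udevadm settle")
	return cmds
}

func main() {
	// mpathh with paths sdc/sdf/sdi, matching the devices in the bug report.
	for _, c := range eagerCleanupPlan("mpathh", []string{"sdc", "sdf", "sdi"}) {
		fmt.Println(c)
	}
}
```

This ordering matters: the map must be flushed before the underlying sdX devices are deleted, otherwise device-mapper keeps references to the now-gone paths, which is exactly the stale state the bug describes.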
When a LUN is unmapped from or deleted on a worker node, the FlexVolume driver does not correctly remove the corresponding mounts and multipath devices.
The expectation is that after a LUN is moved to another worker node or deleted, the worker node it was originally attached to is "clean": its entries in /dev/mapper (multipath -l output) and any container mounts are fully removed.
In my test, moving a LUN to another worker node left the device and mount in place, and /var/log/messages shows these errors continuously:

Jan 30 16:37:05 aop93cl124 hyperkube: E0130 16:37:05.071283 4536 kubelet_volumes.go:140] Orphaned pod "a0a236ad-23df-11e9-b738-fa99c511ef20" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.
Jan 30 16:37:05 aop93cl124 multipathd: mpathh: sdc - tur checker reports path is down
Jan 30 16:37:06 aop93cl124 multipathd: mpathh: sdx - tur checker reports path is down
Jan 30 16:37:07 aop93cl124 hyperkube: E0130 16:37:07.062533 4536 kubelet_volumes.go:140] Orphaned pod "a0a236ad-23df-11e9-b738-fa99c511ef20" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.
Jan 30 16:37:07 aop93cl124 multipathd: mpathh: sdo - tur checker reports path is down
Jan 30 16:37:09 aop93cl124 hyperkube: E0130 16:37:09.068412 4536 kubelet_volumes.go:140] Orphaned pod "a0a236ad-23df-11e9-b738-fa99c511ef20" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.
Jan 30 16:37:09 aop93cl124 multipathd: mpathh: sdf - tur checker reports path is down
Jan 30 16:37:09 aop93cl124 multipathd: mpathh: sdi - tur checker reports path is down
Jan 30 16:37:09 aop93cl124 multipathd: mpathh: sdl - tur checker reports path is down
Jan 30 16:37:09 aop93cl124 multipathd: mpathh: sdr - tur checker reports path is down
Jan 30 16:37:09 aop93cl124 multipathd: mpathh: sdu - tur checker reports path is down
Jan 30 16:37:10 aop93cl124 multipathd: mpathh: sdc - tur checker reports path is down
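The kubelet's "Orphaned pod ... volume paths are still present on disk" error fires when mounts for a deleted pod linger under /var/lib/kubelet/pods. A small sketch of how one could detect that leftover state from /proc/mounts data (the `staleMounts` helper is hypothetical, not part of FVD or kubelet):

```go
package main

import (
	"fmt"
	"strings"
)

// staleMounts returns the mount lines (in /proc/mounts format) whose mount
// point lies under a given pod's volume directory. A non-empty result for a
// deleted pod is the condition behind kubelet's "Orphaned pod" error.
func staleMounts(procMounts, podUID string) []string {
	prefix := "/var/lib/kubelet/pods/" + podUID + "/volumes/"
	var stale []string
	for _, line := range strings.Split(procMounts, "\n") {
		fields := strings.Fields(line)
		// /proc/mounts format: device mountpoint fstype options dump pass
		if len(fields) >= 2 && strings.HasPrefix(fields[1], prefix) {
			stale = append(stale, line)
		}
	}
	return stale
}

func main() {
	// Sample /proc/mounts content with one leftover FlexVolume mount for the
	// pod UID from the log excerpt above.
	sample := "/dev/mapper/mpathh /var/lib/kubelet/pods/a0a236ad-23df-11e9-b738-fa99c511ef20/volumes/ibm~power-k8s-volume-flex/nginx-cinder-vol-1 ext4 rw 0 0\n" +
		"tmpfs /run tmpfs rw 0 0"
	for _, m := range staleMounts(sample, "a0a236ad-23df-11e9-b738-fa99c511ef20") {
		fmt.Println("stale:", m)
	}
}
```

On a genuinely clean node, this returns nothing for the deleted pod's UID, multipath -l shows no leftover map, and the kubelet error stops repeating.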