porcupie / rbd-docker-plugin

Ceph RBD docker volume driver plugin.
MIT License

rbd-docker-plugin shouldn't unmount and remove the lock when the device has an exclusive lock #5

Closed: CBR09 closed this issue 7 months ago

CBR09 commented 8 years ago

Hi @porcupie and everyone. I created a container with the rbd image `test`, and rbd-docker-plugin created an exclusive lock for that image. When I tried to create another container with the same rbd `test` image, Docker told me "Unable to get Exlusive Lock". When I tried again, Docker told me "Unable to mount device." Everything seemed fine, but I then saw that rbd-docker-plugin had also unmounted the device and removed the exclusive lock. Is this a bug?

First, create a container with the rbd `test` image, then check the lock list and mount point:

```
root@ceph:~# docker run -it --volume-driver=rbd --volume test:/test ceph/base bash
root@ceph:~# rbd lock list test
There is 1 exclusive lock on this image.
Locker       ID    Address
client.14507 ceph  10.76.0.79:0/2029373395
root@ceph:~# cat /proc/self/mountinfo | grep rbd
56 22 251:0 / /var/lib/docker-volumes/rbd/rbd/test rw,relatime - xfs /dev/rbd4 rw,attr2,inode64,sunit=8192,swidth=8192,noquota
```

Switching to another terminal, I tried twice to create a container with the same rbd `test` image and got two different errors, which was confusing:

```
root@ceph:~# docker run -it --volume-driver=rbd --volume test:/test ceph/base bash
docker: Error response from daemon: VolumeDriver.Mount: Unable to get Exlusive Lock.
root@ceph:~# docker run -it --volume-driver=rbd --volume test:/test ceph/base bash
docker: Error response from daemon: VolumeDriver.Mount: Unable to mount device.
```

After that, the exclusive lock and the mount were gone:

```
root@ceph:~# rbd lock list test
root@ceph:~# cat /proc/self/mountinfo | grep rbd
root@ceph:~#
```

rbd-docker-plugin log:

```
rbd-volume-plugin: 2016/06/06 14:33:02 driver.go:722: DEBUG: parseImagePoolNameSize: "test": ["test" "" "" "test" "" ""]
rbd-volume-plugin: 2016/06/06 14:33:02 driver.go:530: INFO: API Path request(test) => /var/lib/docker-volumes/rbd/rbd/test
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:348: INFO: API Mount(test)
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:722: DEBUG: parseImagePoolNameSize: "test": ["test" "" "" "test" "" ""]
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "info" "test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "lock" "add" "test" "ceph"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "map" "test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/sbin/blkid" ["blkid" "-o" "value" "-s" "TYPE" "/dev/rbd/rbd/test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/bin/mount" ["mount" "-t" "xfs" "/dev/rbd/rbd/test" "/var/lib/docker-volumes/rbd/rbd/test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:429: ERROR: mounting device(/dev/rbd/rbd/test) to directory(/var/lib/docker-volumes/rbd/rbd/test): exit status 32
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:973: INFO: unlockImage(rbd/test, ceph)
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "lock" "list" "test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:994: DEBUG: found lines matching ceph: [client.14527 ceph 10.76.0.79:0/2929348065]
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "lock" "rm" "test" "ceph" "client.14527"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "unmap" "/dev/rbd/rbd/test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:549: INFO: API Unmount(test)
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:722: DEBUG: parseImagePoolNameSize: "test": ["test" "" "" "test" "" ""]
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:577: WARN: Volume is not in known mounts: will attempt limited Unmount: rbd/test
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/bin/umount" ["umount" "/dev/rbd/rbd/test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:592: ERROR: unmounting device(/dev/rbd/rbd/test): exit status 1
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "unmap" "/dev/rbd/rbd/test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:600: ERROR: unmapping image device(/dev/rbd/rbd/test): exit status 16
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:973: INFO: unlockImage(rbd/test, ceph)
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:1158: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "admin" "lock" "list" "test"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
rbd-volume-plugin: 2016/06/06 14:33:03 driver.go:986: ERROR: image not locked or ceph rbd error: %!s()
```

I think rbd-docker-plugin should check whether the device already has an exclusive lock, or, when it gets the "locking RBD Image(test): exit status 16" error, it should stop and report the failure to the Docker daemon instead of cleaning up.

Thanks

porcupie commented 8 years ago

OK - will take a look. I have not yet had this experience and have always gotten the (misspelled) "Unable to get Exlusive Lock" error. The code does try some cleanups after certain operations; perhaps it is being too aggressive in cleaning up the mount/lock after a failed operation, or there could be other issues or a regression.

porcupie commented 8 years ago

@CBR09 Thanks for the report; I was able to reproduce it. Looking into why the cleanup is so aggressive and into a possible fix.

porcupie commented 8 years ago

It appears that the docker daemon (or possibly the volume helper lib) is calling API Unmount after a failed Mount. I'm not sure when this change occurred, but I will have to change the way Unmount operates, since it currently assumes the caller really wants to unlock and unmount whenever that method is called.

CBR09 commented 8 years ago

Yes, maybe you could add a check in the Unmount function: if the rbd image has an exclusive lock, do nothing. I really want that, if an rbd image is in use by a container, no one else can use that image.

porcupie commented 8 years ago

@CBR09 unfortunately it is not that simple.

The Unmount API method is being called - this means the docker engine wants me to unmount, unmap and unlock the image in question. It is called even when the lock is exclusive, as is always the case.

I'm not sure how to tell this case apart from a legitimate call to Unmount, so I will have to do more research into a possible fix. I think the reason I have never run into this is that 1) I never ran the test you provided (three containers on the same node trying to use the same device), since 2) I run in a cluster with hundreds of nodes and rarely is a job spun up on the same node right away.

CBR09 commented 8 years ago

@porcupie Yes, thanks for all you've done. I've tried running multiple containers that use the same device (on the same or different nodes) to check the lock mechanism. I want to be safe when using rbd-docker-plugin in my environment.

When I ran containerA and containerB with the same device on the same node, I noticed that although rbd-docker-plugin unmounted the device and unlocked the image, containerA seemed to keep working normally (reading/writing the device). I'm not sure how this issue affects that situation?

Unrelated question: have you used rbd-docker-plugin in production? I'm looking for a suitable plugin for my environment.

Thanks

porcupie commented 8 years ago

Looks like Docker 1.10 has added an ID parameter to Mount and Unmount API calls, so I should be able to use those to keep track of mapping mount and unmount per rbd image and per-client usage (?)

porcupie commented 8 years ago

Unrelated question: have you used rbd-docker-plugin in production? I'm looking for a suitable plugin for my environment.

@CBR09 We are using this plugin for some Proof of Concept workloads (small low-use databases, Dockins Jenkins servers that need to save storage config to disk, etc) but nothing that could be considered production scale.

We do always explicitly stop the singleton containers that use RBD before spinning them up again, that is simply part of the Operations playbook for those services. The locking is kept as an emergency fallback just in case, but we try not to rely on this. We had issue previously with go-ceph implementation of locking via the C api, so we have always been a bit wary of this feature.

CBR09 commented 8 years ago

Looks like Docker 1.10 has added an ID parameter to Mount and Unmount API calls, so I should be able to use those to keep track of mapping mount and unmount per rbd image.

Glad to hear that. I hope you can implement and release it!

We are using this plugin for some Proof of Concept workloads (small low-use databases, Dockins Jenkins servers that need to save storage config to disk, etc) but nothing that could be considered production scale.

We do always explicitly stop the singleton containers that use RBD before spinning them up again, that is simply part of the Operations playbook for those services.

Thanks for sharing.

porcupie commented 8 years ago

@CBR09 Hello - I added a small check to see whether rbd unmap returns 'exit status 16' (tested on CentOS 7.2 with docker 1.9 and docker 1.11) and, if so, to bail out early of the duplicated Unmount call. Perhaps this can be a workaround for now?

Please try v1.2.2 and let me know if this fixes the bug on your side. Thanks

CBR09 commented 8 years ago

@porcupie: Thanks for the release, I will check it as soon as possible.

CBR09 commented 8 years ago

@porcupie Sorry for the late reply. I tried it today and it works. Thank you so much.

CBR09 commented 8 years ago

Hi @porcupie. Today I checked rbd-docker-plugin again, and there is still an issue: the umount still happens.

```
root@ceph:~# rbd showmapped
id pool   image snap device
3  docker foo   -    /dev/rbd3
root@ceph:~# cat /proc/self/mountinfo | grep rbd
57 22 249:0 / /var/lib/docker-volumes/rbd/docker/foo rw,relatime - xfs /dev/rbd3 rw,attr2,inode64,sunit=8192,swidth=8192,noquota
root@ceph:~# docker run -it --volume-driver=rbd --volume foo:/mnt/foo ceph/base bash
docker: Error response from daemon: VolumeDriver.Mount: Unable to get Exclusive Lock.
root@ceph:~# rbd showmapped
id pool   image snap device
3  docker foo   -    /dev/rbd3
root@ceph:~# cat /proc/self/mountinfo | grep rbd
root@ceph:~#
```

Logs

```
2016/06/16 09:59:06 driver.go:349: INFO: API Mount({foo map[]})
2016/06/16 09:59:06 driver.go:735: DEBUG: parseImagePoolNameSize: "foo": ["foo" "" "" "foo" "" ""]
2016/06/16 09:59:06 driver.go:1171: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "docker" "--conf" "/etc/ceph/ceph.conf" "--id" "docker" "info" "foo"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
2016/06/16 09:59:06 driver.go:1171: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--pool" "docker" "--conf" "/etc/ceph/ceph.conf" "--id" "docker" "lock" "add" "foo" "ceph"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
2016/06/16 09:59:06 driver.go:394: ERROR: locking RBD Image(foo): exit status 16
2016/06/16 09:59:07 driver.go:555: INFO: API Unmount({foo map[]})
2016/06/16 09:59:07 driver.go:735: DEBUG: parseImagePoolNameSize: "foo": ["foo" "" "" "foo" "" ""]
2016/06/16 09:59:07 driver.go:1171: DEBUG: sh CMD: &{"/bin/umount" ["umount" "/dev/rbd/docker/foo"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
2016/06/16 09:59:07 driver.go:1171: DEBUG: sh CMD: &{"/usr/bin/rbd" ["rbd" "--conf" "/etc/ceph/ceph.conf" "--id" "docker" "unmap" "/dev/rbd/docker/foo"] [] "" [] %!q(_syscall.SysProcAttr=) %!q(_os.Process=) "" %!q(bool=false) [] [] [] [] %!q(chan error=)}
2016/06/16 09:59:07 driver.go:607: ERROR: unmapping image device(/dev/rbd/docker/foo): exit status 16
2016/06/16 09:59:07 driver.go:611: WARN: unmap failed due to busy device, early exit from this Unmount request.
```

porcupie commented 8 years ago

Yes, as CHANGELOG.md and the code itself mention, this is an unavoidable situation at the moment.

The system umount will return success even if the device is still busy. The only way we were able to hack this workaround is by checking the error from rbd unmap, which necessarily happens after the unmount. We are not necessarily able to re-mount to "fix" things, since we lack the required information at that point. In addition, in my limited tests, the container was still fine and able to read and write the mounted disk, as also seen with df from inside the container.

Will leave issue open in case we find a better fix.

CBR09 commented 8 years ago

Yes, thanks @porcupie, I will leave this open until there is a better solution.