Closed. echa closed this issue 6 years ago.
Hi, I have a similar problem (logical volumes are not mounted when their corresponding docker volumes are used), but it happens even after a clean reboot.
Like echa, I then need to manually reset the counts in /var/lib/docker-lvm-plugin/lvmCountConfig.json for each docker volume and restart the docker-lvm-plugin and docker services (sudo systemctl restart docker-lvm-plugin and sudo systemctl restart docker).
Otherwise I have to manually mount the LV at the path specified in the docker volume. For example, for a docker volume named my-lv:

docker volume inspect my-lv:
[
    {
        "Name": "my-lv",
        "Driver": "lvm",
        "Mountpoint": "/var/lib/docker-lvm-plugin/my-lv",
        "Labels": {},
        "Scope": "local"
    }
]
sudo lvdisplay:
--- Logical volume ---
LV Path /dev/docker-vg/my-lv
LV Name my-lv
...
After a reboot the docker volume is no longer "linked" to the LV (because the LV is not mounted at the path specified by the docker volume); it effectively becomes a host directory mounted as a docker volume: data is stored on the host in the directory /var/lib/docker-lvm-plugin/my-lv.
I need to manually mount the LV at this path to "link" the LV back to the docker volume:
sudo mount /dev/docker-vg/my-lv /var/lib/docker-lvm-plugin/my-lv
From the information I found, the purpose of the count property of lvmDriver (saved to disk for each docker volume/LV in this lvmCountConfig.json file) is to keep count of how many docker containers use the volume, so that a docker volume cannot be removed while at least 1 container is using it (as explained here: https://github.com/docker/docker/issues/17585; a rough sketch of this counting logic follows at the end of this comment).
Docker requires the plugin to provide a volume, given a user specified volume name. This is called once per container start. If the same volume_name is requested more than once, the plugin may need to keep track of each new mount request and provision at the first mount request and deprovision at the last corresponding unmount request.
Source: https://docs.docker.com/engine/extend/plugins_volume/
Volumes are removed explicitly (i.e., docker volume rm) or implicitly via container remove (i.e., docker rm -v). Docker will only send a remove request to a volume driver when the internal reference count drops to zero. Since Docker’s reference counting is not multi-host aware, the volume driver must be.
Source: http://www.blockbridge.com/multi-host-volumes-semantics-with-docker-1-9/
But I don't see how this relates to the mounting problem. Also, I did not really understand this:
"these are files for persisting the state of the memory stores on disk"
Source: https://github.com/shishir-a412ed/docker-lvm-plugin/issues/15
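For illustration, here is a minimal sketch in Go of that reference-counting idea; the driver type, the method names, and the hardcoded paths are assumptions for this sketch, not the plugin's actual code. The first Mount request really mounts the LV, later requests only bump a persisted counter, and Unmount reverses this, only calling umount for the last user:

// Sketch only: a per-volume counter, persisted to a JSON file, decides when
// the LV is really mounted and unmounted.
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

const countFile = "/var/lib/docker-lvm-plugin/lvmCountConfig.json"

type driver struct {
	counts map[string]int // volume name -> number of containers using it
}

// save persists the counters so they survive a plugin restart.
func (d *driver) save() error {
	data, err := json.Marshal(d.counts)
	if err != nil {
		return err
	}
	return os.WriteFile(countFile, data, 0600)
}

// Mount is called once per container start that uses the volume.
func (d *driver) Mount(vg, lv string) error {
	if d.counts[lv] == 0 { // first user: actually mount the LV
		dev := fmt.Sprintf("/dev/%s/%s", vg, lv)
		target := "/var/lib/docker-lvm-plugin/" + lv
		if out, err := exec.Command("mount", dev, target).CombinedOutput(); err != nil {
			return fmt.Errorf("mount %s: %v: %s", dev, err, out)
		}
	}
	d.counts[lv]++
	return d.save()
}

// Unmount is called when a container using the volume stops.
func (d *driver) Unmount(lv string) error {
	if d.counts[lv] == 1 { // last user: really umount
		target := "/var/lib/docker-lvm-plugin/" + lv
		if out, err := exec.Command("umount", target).CombinedOutput(); err != nil {
			return fmt.Errorf("umount %s: %v: %s", target, err, out)
		}
	}
	if d.counts[lv] > 0 {
		d.counts[lv]--
	}
	return d.save()
}

func main() {
	// In real code, Mount/Unmount would be wired into the docker volume plugin API.
	d := &driver{counts: map[string]int{}}
	_ = d
}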
@echa @monkeydri There are multiple ways to solve this. Right now, when the system reboots (clean restart or server crash), the logical volumes are not auto-mounted after the restart. Based on the code this is expected. We can do the following to handle this:
1) If the user has executed systemctl enable docker-lvm-plugin, the plugin would restart automatically on reboot and load the state of the volumes from the config files {lvmCountConfig.json, lvmVolumesConfig.json} back into memory. Based on those states, the plugin would then mount the LVs back (if they are not mounted).
2) Create entries in /etc/fstab so that the mount points are persisted across reboots (an example entry is shown after this list).
3) Drop mount files for each volume in /etc/systemd/system and let systemd take care of the mount state. I don't like (3), since it would result in too many mount files if there are a lot of LVM volumes.
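For option (2), an /etc/fstab entry for the my-lv example above might look like the line below; the ext4 filesystem type and the nofail option are assumptions, not values taken from this thread (nofail keeps the boot from hanging if the LV is missing):

/dev/docker-vg/my-lv  /var/lib/docker-lvm-plugin/my-lv  ext4  defaults,nofail  0  2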
I will discuss this with my tech lead and see how we can resolve this in the most efficient way possible.
Shishir
@echa @monkeydri
I had a discussion with my technical lead, and we think option {1} would be best to solve this. We will let the daemon {docker-lvm-plugin} handle the state of the mounts on reboot.
I will create a PR to add this functionality.
/cc @rhatdan
Shishir
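A rough sketch of what option (1), reloading state and remounting on start-up, could look like; it assumes the keys of lvmVolumesConfig.json are the volume names, uses the docker-vg volume group from the example above, and all helper names are hypothetical. It is not the code from the eventual PR:

package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

const (
	volumesFile = "/var/lib/docker-lvm-plugin/lvmVolumesConfig.json"
	mountRoot   = "/var/lib/docker-lvm-plugin"
	volumeGroup = "docker-vg" // assumption: the VG name from the example above
)

// loadVolumeNames reads the persisted volume config; we assume the JSON keys
// are the volume names and ignore the values.
func loadVolumeNames() ([]string, error) {
	data, err := os.ReadFile(volumesFile)
	if err != nil {
		return nil, err
	}
	raw := map[string]json.RawMessage{}
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(raw))
	for name := range raw {
		names = append(names, name)
	}
	return names, nil
}

// isMounted reports whether target is currently a mountpoint.
func isMounted(target string) bool {
	return exec.Command("findmnt", "--mountpoint", target).Run() == nil
}

func main() {
	names, err := loadVolumeNames()
	if err != nil {
		fmt.Fprintln(os.Stderr, "loading volume state:", err)
		os.Exit(1)
	}
	for _, name := range names {
		target := mountRoot + "/" + name
		if isMounted(target) {
			continue // already mounted, nothing to do
		}
		dev := fmt.Sprintf("/dev/%s/%s", volumeGroup, name)
		if out, err := exec.Command("mount", dev, target).CombinedOutput(); err != nil {
			fmt.Fprintf(os.Stderr, "remounting %s on %s: %v: %s\n", dev, target, err, out)
		}
	}
}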
We are experiencing the same issue. After server restart, the docker-lvm-plugin volumes do not get mounted correctly and therefore do not contain any data.
Is there any news on this issue? Has any solution already been implemented or are there known workarounds? Or is it better to just NOT use this plugin?
My workaround has been, and still is, to reset the counters in lvmCountConfig.json to zero as an extra step in docker-lvm-plugin's systemd launch script. I keep the last good version of the config file (with all volumes registered and counters at zero) around to copy from. A smarter solution would be to have a script parse the JSON and set the counts to zero, so you don't miss changes to the list of volumes.
Thanks for that suggestion. We use Rancher for managing containers and volumes so it creates volumes for us depending on which containers we install. So the mounted volumes may indeed change over time. Your suggestion for parsing the JSON on startup might work in that scenario.
We also notice though that more volumes are listed in that lvmCountConfig.json file than are still in use. When containers are deleted, the volumes are not correctly cleaned up apparently. But that is probably another issue.
@echa @maartenl945 I will try to take a look at this sometime this week.
When containers are deleted, the volumes are not correctly cleaned up apparently. But that is probably another issue.
AFAIK, when a container is deleted, the volume associated with that container is not automatically deleted, unless you do a docker rm -v.
root@shishir-All-Series:~# docker rm --help
Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...]
Remove one or more containers
Options:
-f, --force Force the removal of a running container (uses SIGKILL)
--help Print usage
-l, --link Remove the specified link
-v, --volumes Remove the volumes associated with the container
root@shishir-All-Series:~#
Thanks @shishir-a412ed
Related to the cleaning up of volumes: Rancher automatically cleans up volumes when a stack is deleted. We also see the related directories under /var/lib/docker-lvm-plugin disappearing when the volumes are cleaned up. However, the volume administration in the json files in that directory is not correctly updated. It still lists some of the cleaned-up volumes, even with a usage count (if that is what it is) of 1.
@maartenl945
However, the volume administration in the json files in that directory is not correctly updated. It still lists some of the cleaned-up volumes, even with a usage count (if that is what it is) of 1.
This sounds like a different bug. Ideally this should not happen, and the counts should be updated correctly in the config file when the container (and its associated volumes) are removed.
Let me first fix @echa's original issue. Once that is fixed, we can see if your issue is still happening.
Shishir
Yes I agree, it sounds like a different issue.
As a temporary workaround we are now mounting the LVM volumes ourselves during startup of the server. For that to work properly, we have to wait until the LVM volumes exist and then mount the directories in /var/lib/docker-lvm-plugin.
The method of resetting the counts in lvmCountConfig.json did not seem to work reliably when the container stack was brought down before the system restart.
Unsure how our temporary workaround affects the docker-lvm-plugin and its administration, though.
Hi @shishir-a412ed, do you have any idea when you might have time to take a look at this? It seems like you cannot really use this plugin if volumes are not correctly mounted after a reboot. Regards, Maarten
@maartenl945 I will try to take a look at this issue this weekend. Busy with some other things, sorry for the delay.
@shishir-a412ed No need to apologize, just wanted to see when you might have a look at this. Regards, Maarten
Here's a quick workaround to reset all counters whenever the docker-lvm-plugin service starts. It reads the currently configured list of volumes from /var/lib/docker-lvm-plugin/lvmVolumesConfig.json and overwrites /var/lib/docker-lvm-plugin/lvmCountConfig.json with all volume counters set to zero.
You need to have jq installed and to add a new systemd drop-in file at /etc/systemd/system/docker-lvm-plugin.service.d/reset-counters.conf.
Remember that this is not a complete solution, because neither this little hack nor docker-lvm-plugin checks the actual mount status of your filesystems. Should an LVM volume already be mounted AND the counter be zero, docker-lvm-plugin will blindly trust the counter and try mounting again, which results in a mount error, and the container will not be started. After a server restart, however, this workaround does work:
[Service]
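# Zero every volume's use counter before the plugin starts (requires jq)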
ExecStartPre=/bin/sh -c 'cat /var/lib/docker-lvm-plugin/lvmVolumesConfig.json | jq -c "map_values(.=0)" > /var/lib/docker-lvm-plugin/lvmCountConfig.json'
Thanks for the (workaround) solution @echa !
@maartenl945 I checked this issue, and you are right: lvmCountConfig.json still shows the volume count as 1 after the reboot. The correct value should be 0.
Let's take an example scenario:
1) You have a volume named foobar.
2) You have 3 running containers, c1, c2 and c3. Each of them has foobar mounted at /run.
3) You reboot the system. The containers are going to exit, and will call the plugin to unmount the volume.
4) The plugin is expected to umount only for the last container. For the rest, it just does a count-- and saves lvmCountConfig.json to disk. The reasoning is that the LVM device is only mounted to /var/lib/docker-lvm-plugin/foobar for the first container; for the rest of the containers it's just a bind mount of that location, so umount only needs to happen for the last container.
5) Issue: During reboot, when the plugin tries to umount for the last container, it fails because systemd has already unmounted the device as part of the reboot:
Jun 14 19:03:42 localhost.localdomain docker-lvm-plugin[1125]: Unmount: unmount error: exit status 32 output umount: /var/lib/docker-lvm-plugin/foobar: not mounted
So it never gets to update the count to 0.
I tried to fix this here: https://github.com/projectatomic/docker-lvm-plugin/compare/master...shishir-a412ed:auto_mount_issue
by making docker-lvm-plugin only umount if the device is still mounted. If systemd has already unmounted it, we just update the count to 0.
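A hedged sketch of that approach (hypothetical names, not the code in the linked branch): before umounting for the last container, check whether the mountpoint is still mounted, and if systemd has already unmounted it, skip umount and just reset the count to 0:

package main

import (
	"fmt"
	"os/exec"
)

// stillMounted mirrors the `mount | grep <path>` check mentioned below.
func stillMounted(target string) (bool, error) {
	cmd := exec.Command("sh", "-c", fmt.Sprintf("mount | grep -q ' %s '", target))
	err := cmd.Run()
	if err == nil {
		return true, nil
	}
	if exitErr, ok := err.(*exec.ExitError); ok && exitErr.ExitCode() == 1 {
		return false, nil // grep found nothing: not mounted
	}
	return false, err // the check itself failed (the race described below)
}

// unmountLast is what the plugin would do for the last container using a volume.
func unmountLast(target string, counts map[string]int, name string) error {
	mounted, err := stillMounted(target)
	if err != nil {
		return fmt.Errorf("checking mount state of %s: %v", target, err)
	}
	if mounted {
		if out, err := exec.Command("umount", target).CombinedOutput(); err != nil {
			return fmt.Errorf("umount %s: %v: %s", target, err, out)
		}
	}
	counts[name] = 0 // either way, the volume is no longer in use
	return nil
}

func main() {
	// Example: the last container using foobar has just stopped.
	counts := map[string]int{"foobar": 1}
	if err := unmountLast("/var/lib/docker-lvm-plugin/foobar", counts, "foobar"); err != nil {
		fmt.Println(err)
	}
}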
But apparently even the mount check (sh -c "mount|grep /var/lib/docker-lvm-plugin/foobar") used to tell whether the device is still mounted or already unmounted is also failing during reboot. Probably some race condition with PID 1:
Jun 24 15:54:07 localhost.localdomain docker-lvm-plugin[30139]: Unmount: Error checking if volume /var/lib/docker-lvm-plugin/foobar is mounted
I guess the best solution for now would be to just use @echa's workaround: after a restart, the counts in lvmCountConfig.json need to be set to 0.
@shishir-a412ed Thanks for taking a look. Can the plug-in service not just set the counts to 0 on service startup? That would effectively implement the workaround in the plug-in itself.
@maartenl945
Can the plug-in service not just set the counts to 0 on service startup?
It can, but that won't fix the issue.
The counts indicate how many containers {c1, c2 and c3} this volume {foobar} is mounted to.
E.g. in the scenario above, foobar will have a count of 3. If the docker daemon is still running those containers and we just restart the plugin and reset the count to 0, that would be wrong.
@maartenl945 There was a small issue in the PR. I have fixed it now, and it's working for me. After the reboot, the counts are reset to 0.
https://github.com/projectatomic/docker-lvm-plugin/pull/53 Can you try it once, and let me know if it works for you too?
@shishir-a412ed Unfortunately I don’t have a ‘go’ development environment, nor easy access to our target at the moment since I’m not at work for the next couple of weeks. I’ll see what I can do but don’t wait for me since I’m sure other people will be helped by this solution too! Thks, Maarten
@maartenl945 No worries. We have merged the PR to master.
Shishir
Hey guys, I just experienced a server crash, and after the reboot my existing LVM volumes never got auto-mounted again. Looking at the code, this is to be expected.
The reason is that you trust the information in /var/lib/docker-lvm-plugin/lvmCountConfig.json even after an unclean shutdown/crash. Counts were still set to 1 in my case, since this was the last alive state (containers running, docker running, volumes mounted by the kernel and exposed to containers). After manually resetting all counters to 0 and restarting the docker + docker-lvm-plugin services, the auto-mount worked as expected.
System was a CentOS 7.2 + EPEL:
docker version
Client:
 Version:         1.10.3
 API version:     1.22
 Package version: docker-common-1.10.3-46.el7.centos.14.x86_64
 Go version:      go1.6.3
 Git commit:      cb079f6-unsupported
 Built:           Fri Sep 16 13:24:25 2016
 OS/Arch:         linux/amd64

Server:
 Version:         1.10.3
 API version:     1.22
 Package version: docker-common-1.10.3-46.el7.centos.14.x86_64
 Go version:      go1.6.3
 Git commit:      cb079f6-unsupported
 Built:           Fri Sep 16 13:24:25 2016
 OS/Arch:         linux/amd64
yum info docker-lvm-plugin
Installed Packages
Name        : docker-lvm-plugin
Arch        : x86_64
Version     : 1.10.3
Release     : 46.el7.centos.14
Size        : 8.6 M
Repo        : installed
From repo   : extras
Summary     : Docker volume driver for lvm volumes
URL         : https://github.com/docker/docker
License     : LGPLv3
Description : Docker Volume Driver for lvm volumes.