rancher / convoy

A Docker volume plugin, managing persistent container volumes.
Apache License 2.0

restart of convoy daemon causes error after restart #42

Open jmccormick2001 opened 8 years ago

jmccormick2001 commented 8 years ago

I'm new to using convoy so this might be user error but here is what I'm seeing:

After a new install, I can:

a) create a volume
b) run the daemon just fine
c) see the created volume
d) attach the volume to a container and work with the volume

ok, works as expected!

but when I ctrl-C the daemon and restart convoy daemon, I am seeing the following behaviour:

a) I can list the volume

b) when I try to run a container and map it to the previously created volume I get the following error:

    [root@localhost convoy]# docker run -it -v pgvol1:/pgdata --volume-driver convoy crunchydata/cpm-node:latest bash
    Error response from daemon: Error running CreateDevice dm_task_run failed

c) the daemon log shows this error:

    DEBU[0073] [devmapper] CreateDevice(poolName=/dev/mapper/convoy-pool, deviceId=9)
    DEBU[0073] libdevmapper(7): ioctl/libdm-iface.c:1750 (4) dm version OF 16384
    DEBU[0073] libdevmapper(7): ioctl/libdm-iface.c:1750 (4) dm message convoy-pool OF create_thin 9 16384
    ERRO[0073] libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: message ioctl on convoy-pool failed: Operation not permitted
    DEBU[0073] Response: { "Err": "Error running CreateDevice dm_task_run failed" } pkg=daemon
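The daemon output above is easier to triage once it is split back into one entry per line. The following is a minimal sketch, not part of Convoy, that assumes every entry starts with a four-letter level tag and a bracketed counter, as in DEBU[0073] and ERRO[0073]. Only DEBU and ERRO appear in this thread; the other tags in the pattern are an assumption about the logger, and the function names are made up for illustration.

```python
import re

# Zero-width match just before a level tag such as "DEBU[0073]" or "ERRO[0484]".
# Only DEBU and ERRO occur in the thread; INFO/WARN/FATA/PANI are assumed.
ENTRY_START = re.compile(r"(?=\b(?:DEBU|INFO|WARN|ERRO|FATA|PANI)\[\d+\])")


def split_entries(raw):
    """Split a run-together log paste into one string per entry."""
    # Insert a newline before every level tag, then split on newlines.
    marked = ENTRY_START.sub("\n", raw)
    return [line.strip() for line in marked.splitlines() if line.strip()]


def errors_only(raw):
    """Keep only the entries logged at error level."""
    return [entry for entry in split_entries(raw) if entry.startswith("ERRO[")]


if __name__ == "__main__":
    pasted = (
        "DEBU[0073] [devmapper] CreateDevice(poolName=/dev/mapper/convoy-pool, deviceId=9) "
        "ERRO[0073] libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: "
        "message ioctl on convoy-pool failed: Operation not permitted "
        'DEBU[0073] Response: { "Err": "Error running CreateDevice dm_task_run failed" } pkg=daemon'
    )
    for entry in errors_only(pasted):
        print(entry)
```

Run against the paste above, this leaves the single ERRO entry, which carries the actual device-mapper failure ("Operation not permitted") rather than the generic dm_task_run message returned to Docker.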

I'm running both the daemon and the docker 'run' as the root user. The platform is centos 7 using docker 1.8.3 with selinux in enforcing mode. This is convoy version 0.3.

any ideas?

yasker commented 8 years ago

Hi @jmccormick2001

Sounds like a bug to me. Could you show the full convoy log?

Also could you show the result of convoy list?

jmccormick2001 commented 8 years ago

The daemon log is here:

https://gist.github.com/jmccormick2001/33be6d2a77e4f882f416

The docker 'run' log is here:

https://gist.github.com/jmccormick2001/07c9d38bbe3d2f76a56c

I really like the product and the user interface in particular, great work, looking forward to integrating this into my postgresql/docker project as soon as possible:

https://github.com/CrunchyData/crunchy-postgresql-manager

yasker commented 8 years ago

Hi @jmccormick2001

It's pretty rare to see this issue...

Several things could cause this issue:

  1. The user running the Convoy daemon doesn't have permission to operate on /dev/mapper/convoy-pool. That is, the daemon isn't running as root or with root privileges to access the device. SELinux may have played a role here.
  2. There is already a device in the pool with the dev_id that Convoy is about to create. In the above case, it's dm_volume_devid=11.

In your convoy list output, I didn't see any volume with devid 11, though there is a more direct way to verify it. Could you run convoy-pdata_tools thin_dump /dev/vdb2? It should show all the information about the convoy-pool.
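The devid check can be done mechanically on the thin_dump output. The sketch below assumes thin_dump's default XML format, in which each thin device is a device element carrying a dev_id attribute; the sample document is made up for illustration and is not the output from this pool, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed thin_dump XML layout (NOT the real pool data).
SAMPLE_DUMP = """
<superblock uuid="" time="0" transaction="3" data_block_size="128" nr_data_blocks="0">
  <device dev_id="9" mapped_blocks="0" transaction="1" creation_time="0" snap_time="0">
  </device>
  <device dev_id="10" mapped_blocks="0" transaction="2" creation_time="0" snap_time="0">
  </device>
</superblock>
"""


def used_dev_ids(thin_dump_xml):
    """Return the sorted device IDs already present in the pool metadata."""
    root = ET.fromstring(thin_dump_xml)
    return sorted(int(dev.get("dev_id")) for dev in root.iter("device"))


def is_taken(thin_dump_xml, dev_id):
    """True if creating dev_id would collide with an existing thin device."""
    return dev_id in used_dev_ids(thin_dump_xml)


if __name__ == "__main__":
    print(used_dev_ids(SAMPLE_DUMP))
    print(is_taken(SAMPLE_DUMP, 11))
```

If the ID Convoy is about to create already shows up in the list, the second explanation (an existing device with that dev_id) becomes the likely one; if it does not, permissions are the more plausible cause.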

I also noticed you were able to create volume bfc4c2e2-ac45-4906-bdbc-7b56102b6dc4, which has devid 10. The first log you showed here is a failed attempt to create a volume with devid 9. So was this volume created after the failure shown in the first log?

jmccormick2001 commented 8 years ago

here is the output from the thin_dump command:

https://gist.github.com/jmccormick2001/5ca19a16d0a23b18bcff

I have set SELinux to Permissive, and am running the daemon and convoy commands as the root user.

I can see the volume after running 'convoy list', but I get this error when I try to remove the volume:

    DEBU[0484] Calling: DELETE, /volumes/, request: DELETE, /v1/volumes/ pkg=daemon
    DEBU[0484] event=delete object=volume pkg=daemon reason=prepare volume=4396d8d5-9ac2-4e3e-8c01-d38f0ff4481f
    DEBU[0484] Error open 4396d8d5-9ac2-4e3e-8c01-d38f0ff4481f: no such file or directory when discarding 4396d8d5-9ac2-4e3e-8c01-d38f0ff4481f, ignored pkg=devmapper
    DEBU[0484] libdevmapper(7): ioctl/libdm-iface.c:1750 (4) dm remove 4396d8d5-9ac2-4e3e-8c01-d38f0ff4481f OF 16384
    ERRO[0484] libdevmapper(3): ioctl/libdm-iface.c:1768 (-1) device-mapper: remove ioctl on 4396d8d5-9ac2-4e3e-8c01-d38f0ff4481f failed: No such device or address
    ERRO[0484] Handler for DELETE /volumes/ returned error: Error running RemoveDevice dm_task_run failed pkg=daemon

jmccormick2001 commented 8 years ago

It could be related to my environment: this is a KVM CentOS 7 VM with a virtual disk attached, and I ran the convoy partition utility on that virtual disk to set it up. But it is interesting that it worked the first time just as expected, letting me create the volume, work with it from another container, kill off that container, and start up another container that was able to use that same volume. It only quit working after I did a ctrl-C on the daemon and restarted it.

yasker commented 8 years ago

@jmccormick2001 Ctrl-C should be fine since that's what I always do.

I am trying to reproduce this issue.

Could you show ls /dev/mapper, as well as dmsetup table?

I found in the Convoy log that Docker requested a different volume from the one you specified in the path: 67a95be1bc2935a79378e99a6b0ced948d261003b76f9824cb292cc1f806f9be. Is it because crunchydata/cpm-node requires a volume for a certain path?

jmccormick2001 commented 8 years ago

ah, figured it out with your last comments! So, the Docker container I was testing with had 3 VOLUMEs specified, and I was only supplying one of them when I was testing. The initial test I did must have been with another container? Not sure. But I verified this morning that when I supply all the VOLUMEs it works.

I suspect this type of user error could happen often. Not sure if an error message is possible to alert the user that they are missing required VOLUMEs? Or this might be a Docker concern?
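Until such a warning exists, the mismatch can be spotted before running the container. This is a minimal sketch, assuming the image's declared volumes are read with docker inspect --format '{{json .Config.Volumes}}' IMAGE and pasted in as a JSON string. Only /pgdata comes from this thread; the other two paths and all function names are made up for illustration.

```python
import json


def declared_volumes(config_volumes_json):
    """Paths declared by the image, from the JSON of .Config.Volumes."""
    data = json.loads(config_volumes_json)
    # docker prints "null" when the image declares no volumes.
    return sorted(data) if data else []


def mount_targets(volume_flags):
    """Container-side path of each -v value, e.g. 'pgvol1:/pgdata' -> '/pgdata'."""
    targets = []
    for flag in volume_flags:
        parts = flag.split(":")
        # Handles 'name:/path', 'name:/path:ro', and a bare '/path'.
        targets.append(parts[1] if len(parts) > 1 else parts[0])
    return targets


def unmapped_volumes(config_volumes_json, volume_flags):
    """Declared volume paths that no -v flag covers."""
    supplied = set(mount_targets(volume_flags))
    return [path for path in declared_volumes(config_volumes_json) if path not in supplied]


if __name__ == "__main__":
    # Hypothetical image config: '/example-logs' and '/example-keys' are invented.
    image_volumes = '{"/pgdata": {}, "/example-logs": {}, "/example-keys": {}}'
    print(unmapped_volumes(image_volumes, ["pgvol1:/pgdata"]))
```

Any path this reports is one Docker will have to provision on its own, which is consistent with the extra volume request yasker spotted in the Convoy log.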

again, great project, I look forward to testing it more and hopefully adding it to my project as it's exactly what I've been looking for.

yasker commented 8 years ago

Thanks @jmccormick2001. It's not right to expect the user to know so much about the volume. Docker and Convoy should handle it correctly. I will investigate further what happened.