contiv-experimental / volplugin

**EXPERIMENTAL** Contiv Storage: Policy backed Clustered Storage (via Ceph or NFS) for Docker

container launch fails to create rbd volume #470

Open jolly2 opened 7 years ago

jolly2 commented 7 years ago

I am trying the contiv volume plugin for Ceph/RBD support in Docker. Ceph/RBD works perfectly fine independently of Docker. Once it is integrated with Docker, after creating the global and tenant configurations, launching a container with the contiv volume driver and the policy errors out with the message "rbd: map failed: (6) No such device or address". It looks like it is executing the rbd map command without waiting for, or checking whether, the volume got created. A detailed log from volplugin is provided below.

For some reason, it is not executing the rbd create volume command. Maybe it is timing out too quickly. I have set the Timeout option in the global configuration to 10; the docs say it is in minutes. But I do not see rbd create volume running in the background. Is there any other way to set the timeout driver option in minutes?

Using a git clone of the contiv volume plugin with Docker 1.12 and Ceph Jewel.

DEBU[13691] Dispatching Get with {"Name":"jaypolicy2/sk1"}
DEBU[13691] Dispatching Create with {"Name":"jaypolicy2/sk1","Opts":{}}
INFO[13691] Creating volume jaypolicy2/sk1
DEBU[13691] Publishing use: (error: ) &config.UseMount{Volume:"jaypolicy2/sk1", Hostname:"engine3", Reason:"Create"}
DEBU[13691] Publishing use: (error: ) &config.UseSnapshot{Volume:"jaypolicy2/sk1", Reason:"Create"}
DEBU[13691] Volume Create: config.Volume{PolicyName:"jaypolicy2", VolumeName:"sk1", Unlocked:false, DriverOptions:map[string]string{"pool":"rbd"}, MountSource:"", CreateOptions:config.CreateOptions{Size:"13MB", FileSystem:"ext4"}, RuntimeOptions:config.RuntimeOptions{UseSnapshots:false, Snapshot:config.SnapshotConfig{Frequency:"", Keep:0x0}, RateLimit:config.RateLimitConfig{WriteBPS:0x0, ReadBPS:0x0}}, Backends:(config.BackendDrivers)(0xc8203ae4e0)}
INFO[13691] Creating volume jaypolicy2/sk1 with size 13
INFO[13691] Formatting volume jaypolicy2/sk1 (filesystem "ext4") with size 13
ERRO[13691] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 26.130948ms) (exit status 6). Retrying.
ERRO[13691] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 25.660876ms) (exit status 6). Retrying.
ERRO[13691] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 27.291704ms) (exit status 6). Retrying.
ERRO[13691] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 25.504335ms) (exit status 6). Retrying.
ERRO[13691] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 25.496074ms) (exit status 6). Retrying.
ERRO[13692] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 26.687049ms) (exit status 6). Retrying.
ERRO[13692] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 32.665319ms) (exit status 6). Retrying.
ERRO[13692] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 24.648046ms) (exit status 6). Retrying.
ERRO[13692] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 26.007521ms) (exit status 6). Retrying.
ERRO[13692] Error mapping image: jaypolicy2.sk1 (Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 141.649472ms) (exit status 6). Retrying.
INFO[13692] Destroying volume jaypolicy2/sk1
DEBU[13692] Removing Use Lock: &config.UseMount{Volume:"jaypolicy2/sk1", Hostname:"engine3", Reason:"Create"}
DEBU[13692] Removing Use Lock: &config.UseSnapshot{Volume:"jaypolicy2/sk1", Reason:"Create"}
ERRO[13692] Returning HTTP error handling plugin negotiation:
Creating Volume
github.com/contiv/volplugin/errors.init [errors.go 98]
github.com/contiv/volplugin/config.init [volume.go 444]
github.com/contiv/volplugin/volplugin.init [volplugin.go 122]
main.init [cli.go 66]
runtime.main [proc.go 177]
runtime.goexit [asm_amd64.s 1998]
Formatting Volume
github.com/contiv/volplugin/errors.init [errors.go 96]
github.com/contiv/volplugin/config.init [volume.go 444]
github.com/contiv/volplugin/volplugin.init [volplugin.go 122]
main.init [cli.go 66]
runtime.main [proc.go 177]
runtime.goexit [asm_amd64.s 1998]
Could not map "jaypolicy2.sk1": Command: [rbd map jaypolicy2.sk1 --pool rbd], Exit status 6, Runtime 213.526654ms (exit status 6) (rbd: sysfs write failed rbd: map failed: (6) No such device or address )
github.com/contiv/volplugin/storage/backend/ceph.(Driver).mapImage [internals.go 44]
github.com/contiv/volplugin/storage/backend/ceph.(Driver).Format [ceph.go 136]
github.com/contiv/volplugin/storage/control.FormatVolume [volume.go 82]
github.com/contiv/volplugin/api.(API).createVolume.func1 [handlers.go 37]
github.com/contiv/volplugin/lock.(Driver).ExecuteWithMultiUseLock [lock.go 93]
github.com/contiv/volplugin/api.(API).Create [handlers.go 100]
github.com/contiv/volplugin/api.(API).Create-fm [docker.go 32]
github.com/contiv/volplugin/api.LogHandler.func1 [api.go 101]
net/http.HandlerFunc.ServeHTTP [server.go 1618]
github.com/contiv/volplugin/vendor/github.com/gorilla/mux.(Router).ServeHTTP [mux.go 98]
net/http.serverHandler.ServeHTTP [server.go 2081]
net/http.(*conn).serve [server.go 1472]
runtime.goexit [asm_amd64.s 1998]

erikh commented 7 years ago

This is failing because the /sys/bus devices don’t exist. How did you launch volplugin? This sounds like one of two things:

Check that /sys is mapped correctly, and that the rbd version in your container matches the one on your host kernel (there is not much we can do about this, especially on Red Hat).

If that doesn’t work for you I will review soon, but I just saw this and need to leave.

-Erik

From: jolly2 <notifications@github.com>
Reply-To: contiv/volplugin <reply@reply.github.com>
Date: Friday, November 11, 2016 at 4:55 PM
To: contiv/volplugin <volplugin@noreply.github.com>
Subject: [contiv/volplugin] container launch fails to create rbd volume (#470)

jolly2 commented 7 years ago

Thanks, Erik, for the response. However, I found the cause of the problem. When contiv creates a volume it does run rbd create, but the image is created with newer features such as exclusive-lock, object-map, fast-diff, and deep-flatten, which the kernel rbd client does not support yet. So I had to intercept the code to disable these features for each volume that contiv created, before it maps the image to a block device. I tried adding a function in the code to disable the rbd features, but that returns exit status 1 from within contiv.

erikh commented 7 years ago

interesting. Can you give me a more concrete example of what is going on and what you had to do to get it to the point you're at now? Pastes, anything would be a huge help.

erikh commented 7 years ago

sorry if that seems like a repeat question; what I mean is the latest things only, sorry.

jolly2 commented 7 years ago

In my 3-node Docker/Ceph POC environment, the kernel rbd client does not support the exclusive-lock, object-map, fast-diff, and deep-flatten features; it supports only the layering feature. I am using Ubuntu 16.04 LTS with kernel version 4.4.xxx. By default, Ceph rbd creates images with all of these features enabled. Because my kernel rbd does not support them, rbd map from contiv was failing. I tried to fix this by disabling the features on a per-image basis, but the contiv plugin does not seem to have an option for passing parameters to ceph/rbd. So I had to fix it by disabling all of these features for my whole POC environment in ceph.conf.
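The exact ceph.conf change is not shown in this thread. A common way to get this effect, given here as an assumption rather than the reporter's actual configuration, is to set the default feature bitmask for new images to layering only (bit value 1):

```ini
# Assumed setting, not quoted from the reporter's ceph.conf.
# 1 = layering only; newly created images get no exclusive-lock,
# object-map, fast-diff, or deep-flatten.
[global]
rbd default features = 1
```

This setting applies only to images created after the change. Existing images keep the features they were created with and would still need them disabled individually.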

erikh commented 7 years ago

yeah, hm. OK. We typically only test on centos, this is very valuable data.

I'll try to visit this over the weekend. Sorry for any problems.