Docker Swarm: service never launches container on worker nodes when using spectrum-scale volumes #174

Closed hseipp closed 6 years ago

hseipp commented 6 years ago

After starting Ubiquity 0.4.0 and Docker 17.03.2 on RHEL 7.4 with Spectrum Scale 4.2.3.5, `docker info` does not list ubiquity as a plugin, although the plugin is loaded. After I successfully create a volume using `docker volume create -d ubiquity --opt backend=spectrum-scale --name demo1`, `docker info` does show ubiquity (see output below).

But then, when I start a service that requires a Ubiquity volume with `docker service create --name helloworld --mode global --mount type=volume,volume-driver=ubiquity,source=demo1,destination=/mnt myregistry:5000/alpine /bin/sh -c 'sleep 4 && ip a > /mnt/hostname&& ping myhost >> /mnt/hostname'`, the service is only executed on the leader node. However, when I run `mmdsh -N all docker volume ls` after launching the service, the container instances for the service "automagically" start.

Note that this might be related to the missing Capabilities API. I am getting a lot of log entries like this one: `dockerd: time="2017-12-18T14:23:37.036066601+01:00" level=warning msg="Volume driver ubiquity returned an error while trying to query its capabilities, using default capabilties: VolumeDriver.Capabilities: 404 page not found\n"`
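To see how often the daemon fails to query the driver's capabilities, the warning quoted above can be counted in the dockerd log. The following is only a minimal sketch: the function name `count_capability_warnings` is made up for illustration, and it assumes the log lines arrive on stdin in the format shown in this report.

```shell
#!/bin/sh
# Count dockerd warnings about a failed capabilities query for one volume driver.
# Reads log lines from stdin; prints the number of matching lines (0 if none).
# The function name is illustrative and not part of Docker or Ubiquity.
count_capability_warnings() {
    driver="${1:-ubiquity}"
    # grep -c prints the count; "|| true" keeps a zero count from being an error.
    grep -c "Volume driver ${driver} returned an error while trying to query its capabilities" || true
}

# Example on a systemd host (assumes dockerd logs to the journal under unit "docker"):
#   journalctl -u docker --since today | count_capability_warnings ubiquity
```

A non-zero count only confirms that the plugin does not answer `VolumeDriver.Capabilities`; it does not by itself prove that this is why tasks stay off the worker nodes.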

`docker info` output:

```
Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 4
Server Version: 17.03.2-ce
Storage Driver: devicemapper
 Pool Name: docker-253:0-806998303-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: xfs
 Data file: /dev/loop4
 Metadata file: /dev/loop5
 Data Space Used: 556.8 MB
 Data Space Total: 107.4 GB
 Data Space Available: 106.8 GB
 Metadata Space Used: 1.204 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.146 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.140-RHEL7 (2017-05-03)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: ubiquity local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: rsuvy5c4yy417vmnol0pxpcyh
 Is Manager: true
 ClusterID: cbn8osht37rtl42amdbe6sr68
 Managers: 1
 Nodes: 6
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 10.10.2.154
 Manager Addresses:
  10.10.2.154:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 3.10.0-693.el7.x86_64
Operating System: Red Hat Enterprise Linux Server 7.4 (Maipo)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 70.45 GiB
Name: mxf54mz.dmc4mz.com
ID: U2RR:ER52:NQYS:5UQN:T2CV:YGAT:YXIN:EVMZ:VEEU:P4AU:RY7Z:2FD2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: devicemapper: usage of loopback devices is strongly discouraged for production use.
         Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
WARNING: bridge-nf-call-ip6tables is disabled
```

shay-berman commented 6 years ago

@aswarke @yadaven is this ticket still relevant?

hseipp commented 6 years ago

FYI: I just retested with Spectrum Scale 5.0.0.1 and everything else (Ubiquity, Docker) unchanged, and I am getting the same error as described above. Please let me know when you have working Ubiquity head code for Scale, and I will retest.

aswarke commented 6 years ago

@hseipp The Capabilities API, along with the new plugin v2 architecture, will be addressed in the next Ubiquity release for Scale. We will keep you updated on this. By default, Docker Swarm also lets manager nodes run tasks. If you want tasks to run only on worker nodes and not on the manager node, set the manager's availability to Drain. For more information, please refer to https://docs.docker.com/engine/swarm/swarm-tutorial/drain-node/
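Draining a node is done with `docker node update --availability drain`, run on a manager. A minimal sketch follows; the wrapper function names and the `DOCKER` override variable are illustrative additions, not part of Docker.

```shell
#!/bin/sh
# DOCKER may be overridden (for example DOCKER=echo) to print the
# commands instead of running them. This variable is an illustrative convention.
DOCKER="${DOCKER:-docker}"

# Stop scheduling new tasks on a node; the scheduler moves its tasks elsewhere.
drain_node() {
    $DOCKER node update --availability drain "$1"
}

# Return the node to normal scheduling.
activate_node() {
    $DOCKER node update --availability active "$1"
}

# Usage on a manager, with a node name or ID taken from `docker node ls`:
#   drain_node mxf54mz.dmc4mz.com
```

Note that draining only changes where tasks may be placed. It does not address tasks failing to start on the workers in the first place.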

hseipp commented 6 years ago

@aswarke Thank you for the feedback. I am looking forward to testing the next Ubiquity release, but please note that with Ubiquity 0.4.0 in the initial state, tasks are only run on the master node, i.e., the worker nodes are never used unless someone issues `mmdsh -N <all docker nodes> docker volume ls`.
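The workaround described above can be wrapped in a small helper. This is a sketch under assumptions: the function name `nudge_volume_plugin` and the `MMDSH`/`DOCKER` override variables are hypothetical, and the defaults (`all`, `helloworld`) are simply the values used earlier in this thread.

```shell
#!/bin/sh
# Both commands can be overridden for a dry run, for example
# MMDSH="echo mmdsh" DOCKER="echo docker". These variables are illustrative.
MMDSH="${MMDSH:-mmdsh}"
DOCKER="${DOCKER:-docker}"

# Run `docker volume ls` on every node in the given Spectrum Scale node set,
# then list the service's tasks to see on which nodes they were placed.
nudge_volume_plugin() {
    nodes="${1:-all}"
    service="${2:-helloworld}"
    $MMDSH -N "$nodes" docker volume ls
    $DOCKER service ps "$service"
}

# Usage on the swarm manager:
#   nudge_volume_plugin all helloworld
```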

oweiser commented 6 years ago

I tried plugin v2 with Scale 5.0.1, and it seems we still have this issue: nodes that are added to the swarm afterwards cannot execute the plugin, and the container simply does not start. Gaurang is helping me, and I will keep you posted.

oweiser commented 6 years ago

Update: after changing the version in the compose file to 3.3 (`version: "3.3"`) and a `docker restart ...`, port 9999 is distributed as expected, so plugin v2 also works on newly added worker nodes.
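Compose file format 3.3 is listed as requiring Docker Engine 17.06.0 or later, which is newer than the 17.03.2 used in the original report. A minimal sketch for checking an Engine version against that minimum is shown below; the function name is hypothetical, and the comparison relies on GNU `sort -V`.

```shell
#!/bin/sh
# Succeeds if the given Engine version is at least 17.06.0, the minimum
# listed for Compose file format 3.3. The function name is illustrative.
supports_compose_3_3() {
    version="${1%%-*}"   # strip suffixes such as "-ce"
    # The smaller of the two versions sorts first; it must be the minimum itself.
    lowest=$(printf '%s\n%s\n' "17.06.0" "$version" | sort -V | head -n 1)
    [ "$lowest" = "17.06.0" ]
}

# Usage against the local daemon:
#   supports_compose_3_3 "$(docker version --format '{{.Server.Version}}')" \
#       && echo "compose 3.3 ok" || echo "engine too old for compose 3.3"
```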

shay-berman commented 6 years ago

Swarm support is not relevant for now. If needed, we will open it in the future.