sthiriet opened this issue 3 years ago
@sthiriet, thanks for reporting this one.
A few questions below ...
* Which linux distro are you running at the host level?
* Which linux distro are you running at the system container level?
* Have you checked if the testcase on which you are basing your setup is working properly? If not already done, please do the following within your host / VM:
  * $ git clone sysbox
  * $ cd sysbox
  * $ make test-shell (to launch the sysbox-test privileged container)
  * $ bats -t tests/dind/swarm.bats (to execute the swarm testcase)
Also, to leave aside DNS-resolution issues, instead of pinging docker.com, try the IP address of each container.
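As a sketch (the container names `manager` and `worker` are placeholders for whatever you launched; the Go-template format string is the standard `docker inspect` way to pull a container's IP when it has a single attached network):

```shell
# Build the "docker inspect" invocation that extracts a container's IP address
# (single-network case), so the same lookup can be reused per container.
ip_lookup_cmd() {
  printf "docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' %s\n" "$1"
}

# With a running daemon you would then do, e.g.:
#   worker_ip=$(eval "$(ip_lookup_cmd worker)")
#   docker exec manager ping -c 3 "$worker_ip"
ip_lookup_cmd worker
```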
Hello @rodnymolina
> @sthiriet, thanks for reporting this one.
> A few questions below ...

> * Which linux distro are you running at the host level?

```
uname -a
Linux s-VirtualBox 5.8.0-44-generic #50~20.04.1-Ubuntu SMP Wed Feb 10 21:07:30 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
```

> * Which linux distro are you running at the system container level?

On the first test, a standard Docker Alpine image, but the same behaviour in tests with nestybox/alpine-docker-dbg.

> * Have you checked if the testcase on which you are basing your setup is working properly? If not already done, please do the following within your host / VM:
>   * $ git clone sysbox
>   * $ cd sysbox
>   * $ make test-shell (to launch the sysbox-test privileged container)
>   * $ bats -t tests/dind/swarm.bats (to execute the swarm testcase)
I've modified the tests as follows in order to reproduce the error:
```bash
#!/usr/bin/env bats

#set 3>/dev/tty
#BASH_XTRACEFD=3
#set -x

#
# Basic tests running docker inside a system container
#

load ../helpers/run
load ../helpers/docker
load ../helpers/net
load ../helpers/sysbox-health

function teardown() {
  sysbox_log_check
}

function basic_test {
  net=$1

  # Launch swarm manager sys container
  local mgr=$(docker_run --rm --name manager --net=$net ${CTR_IMG_REPO}/alpine-docker-dbg:latest tail -f /dev/null)

  # init swarm in manager, get join token
  docker exec -d $mgr sh -c "dockerd > /var/log/dockerd.log 2>&1"
  [ "$status" -eq 0 ]

  wait_for_inner_dockerd $mgr

  docker exec $mgr sh -c "docker swarm init"
  [ "$status" -eq 0 ]

  docker exec $mgr sh -c "docker swarm join-token -q manager"
  [ "$status" -eq 0 ]
  local mgr_token="$output"

  docker exec $mgr sh -c "ip a"
  [ "$status" -eq 0 ]
  local mgr_ip=$(parse_ip "$output" "eth0")

  local join_cmd="docker swarm join --token $mgr_token $mgr_ip:2377"

  # Launch worker node
  local worker=$(docker_run --rm --name worker --net=$net ${CTR_IMG_REPO}/alpine-docker-dbg:latest tail -f /dev/null)

  # Join the worker to the swarm
  docker exec -d $worker sh -c "dockerd > /var/log/dockerd.log 2>&1"
  [ "$status" -eq 0 ]

  wait_for_inner_dockerd $worker

  docker exec $worker sh -c "$join_cmd"
  [ "$status" -eq 0 ]

  # verify worker node joined
  docker exec $mgr sh -c "docker node ls"
  [ "$status" -eq 0 ]

  # The output of the prior command is something like this:
  #
  # ID                          HOSTNAME       STATUS  AVAILABILITY  MANAGER STATUS  ENGINE VERSION
  # by9ukwes9r9emn3pbozbh6dp6   7f62c95195dc   Ready   Active        Reachable       19.03.12
  # sfgwme7k5vol5ra3hf2jgwlfo * fc4c806f1598   Ready   Active        Leader          19.03.12

  for i in $(seq 1 2); do
    [[ "${lines[$i]}" =~ "Ready".+"Active" ]]
  done

  # deploy a service
  docker exec $mgr sh -c "docker service create --restart-max-attempts 5 --replicas 4 --name helloworld alpine ping localhost"
  [ "$status" -eq 0 ]

  # verify the service is up
  docker exec $mgr sh -c "docker service ls"
  [ "$status" -eq 0 ]
  [[ "${lines[1]}" =~ "helloworld".+"4/4" ]]

  # cleanup
  docker_stop $mgr
  docker_stop $worker
}

function service_com_test {
  net=$1

  # Launch swarm manager sys container
  local mgr=$(docker_run --rm --name manager --net=$net ${CTR_IMG_REPO}/alpine-docker-dbg:latest tail -f /dev/null)

  # init swarm in manager, get join token
  docker exec -d $mgr sh -c "dockerd > /var/log/dockerd.log 2>&1"
  [ "$status" -eq 0 ]

  wait_for_inner_dockerd $mgr

  docker exec $mgr sh -c "docker swarm init"
  [ "$status" -eq 0 ]

  docker exec $mgr sh -c "docker swarm join-token -q manager"
  [ "$status" -eq 0 ]
  local mgr_token="$output"

  docker exec $mgr sh -c "ip a"
  [ "$status" -eq 0 ]
  local mgr_ip=$(parse_ip "$output" "eth0")

  local join_cmd="docker swarm join --token $mgr_token $mgr_ip:2377"

  # Launch worker node
  local worker=$(docker_run --rm --name worker --net=$net ${CTR_IMG_REPO}/alpine-docker-dbg:latest tail -f /dev/null)

  # Join the worker to the swarm
  docker exec -d $worker sh -c "dockerd > /var/log/dockerd.log 2>&1"
  [ "$status" -eq 0 ]

  wait_for_inner_dockerd $worker

  docker exec $worker sh -c "$join_cmd"
  [ "$status" -eq 0 ]

  # verify worker node joined
  docker exec $mgr sh -c "docker node ls"
  [ "$status" -eq 0 ]

  # The output of the prior command is something like this:
  #
  # ID                          HOSTNAME       STATUS  AVAILABILITY  MANAGER STATUS  ENGINE VERSION
  # by9ukwes9r9emn3pbozbh6dp6   7f62c95195dc   Ready   Active        Reachable       19.03.12
  # sfgwme7k5vol5ra3hf2jgwlfo * fc4c806f1598   Ready   Active        Leader          19.03.12

  for i in $(seq 1 2); do
    [[ "${lines[$i]}" =~ "Ready".+"Active" ]]
  done

  # create an overlay network
  docker exec $mgr sh -c "docker network create -d overlay test-in-net"
  [ "$status" -eq 0 ]

  # deploy a target service to test network
  docker exec $mgr timeout 60 sh -c "docker service create --network test-in-net --restart-max-attempts 5 --replicas 1 --name target nginx:alpine"
  [ "$status" -eq 0 ]

  # deploy a source service to test network, using timeout as docker service create never returns on failure
  docker exec $mgr timeout 60 sh -c "docker service create --network test-in-net --restart-max-attempts 5 --replicas 1 --name source alpine sh -c 'set -e; while true; do wget -q -O - target; sleep 2; done'"
  [ "$status" -eq 0 ]

  # verify the services are up
  docker exec $mgr sh -c "docker service ls"
  [ "$status" -eq 0 ]
  [[ "${lines[1]}" =~ "source".+"1/1" ]]
  [[ "${lines[2]}" =~ "target".+"1/1" ]]

  # cleanup
  docker_stop $mgr
  docker_stop $worker
}

@test "swarm-in-docker basic" {
  basic_test bridge
}

@test "swarm-in-docker custom net" {
  docker network create test-net
  [ "$status" -eq 0 ]

  basic_test test-net

  docker network rm test-net
  [ "$status" -eq 0 ]
}

@test "swarm-in-docker basic service communication test" {
  service_com_test bridge
}

@test "swarm-in-docker custom net service communication test" {
  docker network create test-net
  [ "$status" -eq 0 ]

  service_com_test test-net

  docker network rm test-net
  [ "$status" -eq 0 ]
}
```
The results are:
```
root@sysbox-test:~/nestybox/sysbox# bats -t tests/dind/swarm.bats
1..4
ok 1 swarm-in-docker basic
ok 2 swarm-in-docker custom net
not ok 3 swarm-in-docker basic service communication test
# (from function `service_com_test' in file tests/dind/swarm.bats, line 142,
# in test file tests/dind/swarm.bats, line 172)
# `service_com_test bridge' failed
# docker run --runtime=sysbox-runc -d --rm --name manager --net=bridge ghcr.io/nestybox/alpine-docker-dbg:latest tail -f /dev/null (status=0):
# f89d162272803432230dd87f55fc1fc60f0f471e19702b1b6a1770f0cae84418
# docker ps --format {{.ID}} (status=0):
# f89d16227280
# docker exec -d f89d16227280 sh -c dockerd > /var/log/dockerd.log 2>&1 (status=0):
#
# docker exec f89d16227280 sh -c docker swarm init (status=0):
# Swarm initialized: current node (twoqefhhn5f663xbogw4hw9kc) is now a manager.
#
# To add a worker to this swarm, run the following command:
#
# docker swarm join --token SWMTKN-1-3h1vuoe3mikx2slnr9fz1wa45vfouy8azu1248rx0tlb3mmy41-0bobvuwih39upg18p0ak7btjv 172.21.0.2:2377
#
# To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
# docker exec f89d16227280 sh -c docker swarm join-token -q manager (status=0):
# SWMTKN-1-3h1vuoe3mikx2slnr9fz1wa45vfouy8azu1248rx0tlb3mmy41-8iyykpy3qofvl6964572jztzk
# docker exec f89d16227280 sh -c ip a (status=0):
# 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
#     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
#     inet 127.0.0.1/8 scope host lo
#        valid_lft forever preferred_lft forever
# 2: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
#     link/ether 02:42:9e:97:2d:94 brd ff:ff:ff:ff:ff:ff
#     inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
#        valid_lft forever preferred_lft forever
# 13: eth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
#     link/ether 02:42:ac:15:00:02 brd ff:ff:ff:ff:ff:ff
#     inet 172.21.0.2/16 brd 172.21.255.255 scope global eth0
#        valid_lft forever preferred_lft forever
# docker run --runtime=sysbox-runc -d --rm --name worker --net=bridge ghcr.io/nestybox/alpine-docker-dbg:latest tail -f /dev/null (status=0):
# be2e2aac46fafe86a428a70b6aa6235cded0c5098cd924db8665d8d013084785
# docker ps --format {{.ID}} (status=0):
# be2e2aac46fa
# f89d16227280
# docker exec -d be2e2aac46fa sh -c dockerd > /var/log/dockerd.log 2>&1 (status=0):
#
# docker exec be2e2aac46fa sh -c docker swarm join --token SWMTKN-1-3h1vuoe3mikx2slnr9fz1wa45vfouy8azu1248rx0tlb3mmy41-8iyykpy3qofvl6964572jztzk 172.21.0.2:2377 (status=0):
# This node joined a swarm as a manager.
# docker exec f89d16227280 sh -c docker node ls (status=0):
# ID                          HOSTNAME       STATUS  AVAILABILITY  MANAGER STATUS  ENGINE VERSION
# vtgvvjh8ksiagtqc4hn5wtlcp   be2e2aac46fa   Ready   Active        Reachable       19.03.12
# twoqefhhn5f663xbogw4hw9kc * f89d16227280   Ready   Active        Leader          19.03.12
# docker exec f89d16227280 sh -c docker network create -d overlay test-in-net (status=0):
# q00s6ru7h5efde9l73gssl3fh
# docker exec f89d16227280 timeout 60 sh -c docker service create --network test-in-net --restart-max-attempts 5 --replicas 1 --name target nginx:alpine (status=0):
# rwdil8nnkwodtlaojl3qzcs6l
# overall progress: 0 out of 1 tasks
# 1/1:
# overall progress: 0 out of 1 tasks
# overall progress: 0 out of 1 tasks
...
...
# overall progress: 1 out of 1 tasks
# verify: Waiting 5 seconds to verify that tasks are stable...
....
# verify: Waiting 1 seconds to verify that tasks are stable...
# verify: Service converged
# docker exec f89d16227280 timeout 60 sh -c docker service create --network test-in-net --restart-max-attempts 5 --replicas 1 --name source alpine sh -c 'set -e; while true; do wget -q -O - target; sleep 2; done' (status=143):
# ub3wztjj32irrpyl07537on7d
# overall progress: 0 out of 1 tasks
# 1/1:
# overall progress: 0 out of 1 tasks
# overall progress: 0 out of 1 tasks
# overall progress: 0 out of 1 tasks
...
# overall progress: 0 out of 1 tasks
# overall progress: 1 out of 1 tasks
# verify: Waiting 5 seconds to verify that tasks are stable...
...
# verify: Waiting 2 seconds to verify that tasks are stable...
# overall progress: 0 out of 1 tasks
# verify: Detected task failure
# overall progress: 0 out of 1 tasks
...
# verify: Waiting 2 seconds to verify that tasks are stable...
# verify: Waiting 2 seconds to verify that tasks are stable...
# overall progress: 0 out of 1 tasks
# verify: Detected task failure
# overall progress: 0 out of 1 tasks
...
# overall progress: 0 out of 1 tasks
not ok 4 swarm-in-docker custom net service communication test
# (from function `service_com_test' in file tests/dind/swarm.bats, line 95,
# in test file tests/dind/swarm.bats, line 180)
# `service_com_test test-net' failed
# docker network create test-net (status=0):
# acb2067d9b3a345491dcb735a81c3a68221784081ecda4daf412252333924dec
# docker run --runtime=sysbox-runc -d --rm --name manager --net=test-net ghcr.io/nestybox/alpine-docker-dbg:latest tail -f /dev/null (status=125):
# docker: Error response from daemon: Conflict. The container name "/manager" is already in use by container "f89d162272803432230dd87f55fc1fc60f0f471e19702b1b6a1770f0cae84418". You have to remove (or rename) that container to be able to reuse that name.
# See 'docker run --help'.
# docker ps --format {{.ID}} (status=0):
# be2e2aac46fa
# f89d16227280
# docker exec -d be2e2aac46fa sh -c dockerd > /var/log/dockerd.log 2>&1 (status=0):
#
# docker exec be2e2aac46fa sh -c docker swarm init (status=1):
# Error response from daemon: This node is already part of a swarm. Use "docker swarm leave" to leave this swarm and join another one.
```
@sthiriet, thanks for your detailed response and reproduction steps.
I suspect the problem is likely a consequence of Sysbox currently being unable to deal with IPVS instructions within a system container; these are usually required by Swarm to manage access to services exported through the "ingress" network.
What surprised me about your setup is that, at first glance, I don't see any service being exported (e.g. port-forwarding) that would require the use of IPVS. That's the reason I asked you to verify that traffic could flow outside the non-ingress network (i.e. through the overlay network you created, as well as through the regular docker_gwbridge iface). But I went ahead and answered those questions myself, thanks to your repro instructions.
I'd need to do some digging to fully connect the dots, as I'm not 100% sure that IPVS is the problem here. However, if you ever need to make use of the "ingress" network to offer access to your services to external parties, you will hit this Sysbox limitation anyway. We do have the IPVS feature on our roadmap though, so please stay tuned.
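A quick diagnostic sketch (standard Linux paths, nothing Sysbox-specific): once the `ip_vs` kernel module is loaded and visible to a network namespace, the kernel exposes `/proc/net/ip_vs`; if that file is absent inside the container, the IPVS path is not usable there.

```shell
# Check whether the kernel's IPVS interface is reachable from this namespace.
ipvs_available() {
  [ -e /proc/net/ip_vs ]
}

if ipvs_available; then
  echo "IPVS interface present"
else
  echo "IPVS interface absent"
fi
```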
One question: what's the scenario you have in mind, and what's its purpose? I'm asking to better understand the scope of what you're trying to accomplish, so that we can prioritize this feature (full Docker Swarm support within Sysbox containers) accordingly.
Indeed, I had not presented the context.
The first scenario is testing Ansible roles that deploy and configure Swarm clusters, their monitoring tools, and applications. Today we have a GitLab instance with shared runners (Docker executor with privileges). Those runners will be removed soon for security reasons.
The second scenario I wanted to test was using these Sysbox Docker runners to deploy a Swarm application and perform our CI tests there, instead of using a dedicated non-production Swarm cluster.
I will look forward to these updates :)
Hi @rodnymolina, given that this appears to be related to IPVS not working inside Sysbox containers, I am wondering if we should mark this as a duplicate of issue #189. What do you say?
Thanks @sthiriet, both use cases make perfect sense.
I fully understand that those privileged Docker executors represent a security risk, and this is precisely a natural use case for the Sysbox runtime.
Regarding the second use case, if I understood you correctly, you want to use Sysbox containers to deploy Swarm services. This means there's no need to run Swarm inside the Sysbox container, as the container would only run the apps that need to be tested. If that's the case, then everything should already work fine for you, since IPVS and all the Docker Swarm networking magic would be done outside the Sysbox container.
@ctalledo, yes, I will mark this one as a dup once/if I confirm that IPVS is the root cause, but I want to make sure that's the case before we do that.
Hi,
We may have the same problem here: we use Sysbox to mock up a production cluster, which involves running Swarm on some hosts. The Swarm DNS resolver works, but there is no way for Swarm service containers to connect to each other.
To be more precise: communication between containers works, but not via the service IP, which, AFAIK, acts as a virtual IP and is responsible for forwarding traffic to the containers.
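One way to narrow this down (a sketch; `myservice` is a placeholder): Docker's embedded DNS serves two names per service, the service name itself, which resolves to the virtual IP, and `tasks.<service>`, which resolves to the individual task IPs. If the latter is reachable but the former is not, the VIP forwarding path (IPVS) is the culprit.

```shell
# Helpers for the two DNS names Docker's embedded resolver serves per service:
# the VIP name (needs IPVS to forward) and the per-task name (direct task IPs).
vip_dns_name()  { printf '%s\n' "$1"; }
task_dns_name() { printf 'tasks.%s\n' "$1"; }

# From inside a container on the overlay network you would compare, e.g.:
#   nslookup "$(vip_dns_name myservice)"    # virtual IP
#   nslookup "$(task_dns_name myservice)"   # individual task/container IPs
task_dns_name myservice
```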
Hi @dmarteau, yes, this is almost certainly due to IPVS not working inside Sysbox containers. It's something we've been wanting to add for a while, but it's challenging. It mainly impacts Docker Swarm inside Sysbox; Kubernetes inside Sysbox works because it also supports iptables.
Atm, I have found a workaround: declaring `endpoint_mode` as `dnsrr` to bypass IPVS in Swarm services (see https://docs.docker.com/compose/compose-file/compose-file-v3/#deploy). This is OK as long as you don't need load balancing for your services.
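For services created from the CLI rather than a compose file, the equivalent is the `--endpoint-mode dnsrr` flag of `docker service create` (the network and service names below are just the ones from the test file above):

```shell
# Compose the "docker service create" invocation that opts a service out of
# VIP/IPVS mode and into DNS round-robin.
build_dnsrr_cmd() {
  net=$1; name=$2; image=$3
  printf 'docker service create --endpoint-mode dnsrr --network %s --name %s %s\n' \
    "$net" "$name" "$image"
}

build_dnsrr_cmd test-in-net target nginx:alpine
```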
> Atm, I have found a workaround: declaring `endpoint_mode` as `dnsrr` to bypass IPVS in Swarm services (see https://docs.docker.com/compose/compose-file/compose-file-v3/#deploy). This is OK as long as you don't need load balancing for your services.
Good hint, thanks!
LXC containers have the same problem.
Hi
Given the test suite, it's possible to launch Swarm services in a Docker Swarm inside a Sysbox system container, but services on the same networks can't communicate with each other:
I initialized a swarm cluster with the help of this test file:

Then I created a network and two services:

Then I connected to the `firstservice` replica and tried to ping `secondservice`:

In the manager's log: