Open jscottnz opened 4 years ago
`10.0.0.3` looks to be the address of the container on the internal container-container network; are you able to access the container using the IP address of the host (`10.0.0.50:8080`)?
Hi there. Sorry, that was a copy and paste error. I've updated the issue. 10.0.0.50 is the docker host. 10.0.0.3 is another non-docker host on the 10.0.0.x network, such as a load balancer or jumphost.
(jumphost) 10.0.0.3 -> (docker host) 10.0.0.50:8080 -> (container) nginx:80
I have tried to create a service with a published port on all "19.03" docker minor versions. The service created is reachable on docker 19.03.04. But from 19.03.05, the service is not reachable.
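To see which side of that boundary a given node is on, a minimal sketch follows. It assumes GNU `sort -V` is available and that the daemon reports the version without the leading zero (`19.03.5` rather than `19.03.05`). The helper names `version_ge` and `at_or_after_19_03_5` are hypothetical, and the threshold reflects only what was observed in this thread.

```shell
#!/bin/sh
# Sketch only: compares dotted version strings with GNU sort -V.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when the given version is 19.03.5 or later, i.e. the range
# where the published port was reported unreachable in this thread.
at_or_after_19_03_5() {
    version_ge "$1" "19.03.5"
}
```

On a node it could be called as `at_or_after_19_03_5 "$(docker version --format '{{.Server.Version}}')" && echo "in the reported range"`.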
@thaJeztah Any progress here? Destroying and rebuilding the swarm really isn't something I want to do, and the fact that this is a clear stopper and has not been assigned is making me feel like the internal expectation is that Mirantis will solve this... (not holding breath on that one)
I'm happy to jump into the code and see if I can find the issue if it's looking like it will get solved soon..
R
Found it... give me a few minutes to confirm the fix (it IS working, but not sure why the patch didn't fix the real issue)..
R
This appears to be a result of https://github.com/docker/for-linux/issues/810 , which was closed as "known"??? The workaround presented by @andrewhsu works, but leaves me scratching my head... as it should be part of the stack config when docker comes up (or shortly after)..
The original release note is as follows:
## 19.03.3 (2019-10-07)
### Known Issues
- `DOCKER-USER` iptables chain is missing [docker/for-linux#810](https://github.com/docker/for-linux/issues/810). Users cannot perform additional container network traffic filtering on top of this iptables chain. You are not affected by this issue if you are not customizing iptables chains on top of `DOCKER-USER`.
Workaround is to insert the iptables chain after docker daemon starts.
iptables -N DOCKER-USER
iptables -I FORWARD -j DOCKER-USER
iptables -A DOCKER-USER -j RETURN
If you run this as root on all your nodes, the issue is resolved, but it may expose you to other issues... The notes around it imply there may be a firewall bypass issue if you do this...
iptables -N DOCKER-USER
iptables -I FORWARD -j DOCKER-USER
iptables -A DOCKER-USER -j RETURN
Maybe it's my eyes, but it looks like the fix (https://github.com/moby/libnetwork/pull/2464) done by @arkodg and PR'd was closed and never accepted?
I briefly reviewed the code in https://github.com/moby/libnetwork/pull/2470/commits/8cdd5a34cf0d31c3d0b18442ff7cd745386da612#diff-e30be89bfd41a0c219178028b9971a32 which appears to be an attempt to integrate the functionality of @andrewhsu's PR, but it appears to me (and I'm no Go expert) that the check falls short: it looks to see if the DOCKER-USER entry is there, but does not ensure the other two entries are also there... The first without the others doesn't solve the problem.
If this IS the case, any test of this code must check for both cases: DOCKER-USER not present, and DOCKER-USER present but misconfigured.
Please... correct me if I'm wrong (and I probably am since Docker's guts aren't my thing)...
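To make the two cases concrete, here is a rough shell sketch (not the libnetwork code) that covers both "chain missing" and "chain present but rules missing". It uses only `iptables -n -L`, `-C`, `-N`, `-I` and `-A`; the function name `ensure_docker_user` and the `IPT` override variable are hypothetical, the latter only there so the logic can be dry-run with a stub.

```shell
#!/bin/sh
# Sketch only: make sure the DOCKER-USER chain and both of its rules exist.
# IPT is a hypothetical hook; by default the real iptables binary is used.
IPT="${IPT:-iptables}"

ensure_docker_user() {
    # Case 1: the chain itself is missing, so create it.
    if ! $IPT -n -L DOCKER-USER >/dev/null 2>&1; then
        $IPT -N DOCKER-USER || return 1
    fi
    # Case 2a: the chain exists but FORWARD never jumps to it.
    if ! $IPT -C FORWARD -j DOCKER-USER 2>/dev/null; then
        $IPT -I FORWARD -j DOCKER-USER || return 1
    fi
    # Case 2b: the chain exists but has no RETURN rule.
    if ! $IPT -C DOCKER-USER -j RETURN 2>/dev/null; then
        $IPT -A DOCKER-USER -j RETURN || return 1
    fi
}
```

Run as root after the docker daemon has started, calling `ensure_docker_user` should be safe to repeat, since each rule is only added when the matching `-C` check fails.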
R
The results above ended up being pretty spotty... sometimes it works, sometimes it doesn't, and I couldn't find a common reason why it did or didn't work. Ultimately, it looks pretty arbitrary.
As an alternative, I did revert back to 19.03.04 and everything works perfectly as far as I can see..
Didn't have to leave the swarm or re-create it. Just ran this and everything seems to be working fine..
sudo apt remove -y docker-ce docker-ce-cli
sudo apt install -y docker-ce=5:19.03.4~3-0~ubuntu-bionic docker-ce-cli=5:19.03.4~3-0~ubuntu-bionic
sudo apt autoremove -y
sudo reboot
Since it looks like @thaJeztah merged the change (not certain) that fixed the DOCKER-USER issue in 19.03.04, it may be worth reviewing subsequent builds to see if the changes pushed between 19.03.04 and 19.03.05-beta1 were removed or obviated for some reason.
Expected behavior
Swarm created should be accessible (`curl 10.0.0.50:8080`) from a machine on the same network.
Actual behavior
Swarm created is not accessible (`curl 10.0.0.50:8080`) from a machine on the same network.
Steps to reproduce the behavior
This problem involves two machines: a docker host (10.0.0.50) and any other machine on the 10.0.0.x network, i.e. a load balancer or jumphost.
1. On CentOS 7, fully updated and patched, on a VM on a cloud platform, follow the docker install guide for version 18.
2. Run nginx as a swarm service:
3. Test and note that nginx is accessible (`curl 10.0.0.50:8080`) from another host on the same network.
4. Upgrade docker to version 19.
5. Test and note that nginx is accessible (`curl 10.0.0.50:8080`) from another host on the same network.
6. Destroy the swarm:
7. Run nginx as a swarm service:
8. Test and note that nginx is NOT accessible (`curl 10.0.0.50:8080`) from another host on the same network.

This behaviour can also be reproduced with a fresh installation of version 19.
You can also uninstall docker-ce 19 and install 18. The swarm created in 19 is still not accessible. If you remove the swarm and create it (in version 18) it is accessible.
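The exact commands for the "Run nginx as a swarm service" step are not shown here. A minimal sketch of what that step and the curl test could look like follows, assuming the swarm is already initialised on the docker host. The service name `nginx`, the stock `nginx` image, and the helper names `create_service` and `probe` are assumptions; the 8080 to 80 mapping comes from the path described earlier in the thread. `DOCKER` and `CURL` are hypothetical override hooks for dry runs.

```shell
#!/bin/sh
# Sketch only: a guess at the reproduction commands, not the reporter's originals.
DOCKER="${DOCKER:-docker}"
CURL="${CURL:-curl}"

# On the docker host (10.0.0.50): publish host port 8080 to container port 80.
create_service() {
    $DOCKER service create --name nginx --publish published=8080,target=80 nginx
}

# On the other machine (e.g. the jumphost): report whether the port answers.
probe() {
    if $CURL --silent --max-time 5 --output /dev/null "http://$1:8080"; then
        echo "reachable"
    else
        echo "NOT reachable"
    fi
}
```

For example, `create_service` on the docker host followed by `probe 10.0.0.50` on the jumphost would print which of the two outcomes described above applies.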
Output of `docker version`:
Output of `docker info`:
Additional environment details (AWS, VirtualBox, physical, etc.)
Running in a data centre on VMs.