moby / libnetwork

networking for containers
Apache License 2.0

Libnetwork breaks networking by creating unnecessary endpoints when EXPOSE is used. #401

Closed tomdee closed 7 years ago

tomdee commented 9 years ago

If a user starts a container with docker run --publish-service solrbad2.net2.calico --name solrbad2 -ti makuk66/docker-solr:latest bash, they expect to get a single endpoint on the net2 network.

If the Dockerfile they are using contains an EXPOSE line, libnetwork also creates an additional endpoint on the bridge network. This bridge endpoint gets used for the default route, which breaks networking for the intended driver.

1) Why is this endpoint created? What purpose does it serve?
2) Why does the bridge network get installed as the default route?

For more details and a repro, see https://github.com/Metaswitch/calico-docker/issues/341

lxpollitt commented 9 years ago

cc @dave-tucker

mrjana commented 9 years ago

@tomdee @lxpollitt This is an intentional stop-gap measure to provide external connectivity to containers that want to expose their services to hosts outside the docker network. Ideally we want the driver to provide the external connectivity. The stop-gap only applies when the daemon knows that the container wants one or more ports to be exposed or mapped, either via the EXPOSE directive or via the -p option. This is why you are seeing additional implicit endpoints created when you do that.

The reason libnetwork chooses bridge for default gateway is that libnetwork resolves conflict using the following rules:

The UI for assigning priority hasn't been added to the docker experimental branch yet. So although the backend code for this is present, users have no way to assign a priority. Until that is available, the workaround is to give your network a name that sorts before "bridge" in lexicographic order.
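As an illustration of why that workaround helps, here is a minimal Go sketch of a default-gateway selection of this kind. It is not libnetwork's code: the Endpoint type and pickDefaultGateway function are hypothetical, and the rule "higher priority wins" is an assumption. Only the lexicographic tie-break on the network name is implied by the workaround described above.

```go
package main

import (
	"fmt"
	"sort"
)

// Endpoint is a hypothetical stand-in for one attachment of a container to a network.
type Endpoint struct {
	Network  string
	Priority int // assumption: larger value means more preferred
}

// pickDefaultGateway returns the endpoint that would supply the default route.
// Assumed rule: highest priority first; ties go to the network name that sorts first.
func pickDefaultGateway(eps []Endpoint) (Endpoint, bool) {
	if len(eps) == 0 {
		return Endpoint{}, false
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]Endpoint(nil), eps...)
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Priority != sorted[j].Priority {
			return sorted[i].Priority > sorted[j].Priority
		}
		return sorted[i].Network < sorted[j].Network
	})
	return sorted[0], true
}

func main() {
	// With equal priorities, "bridge" sorts before "net2" and wins.
	gw, _ := pickDefaultGateway([]Endpoint{{Network: "net2"}, {Network: "bridge"}})
	fmt.Println(gw.Network) // bridge

	// A network name that sorts before "bridge" wins instead.
	gw, _ = pickDefaultGateway([]Endpoint{{Network: "anet"}, {Network: "bridge"}})
	fmt.Println(gw.Network) // anet
}
```

With no way to set a priority, every endpoint ties, so the name comparison alone decides which network supplies the default route.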

dave-tucker commented 9 years ago

I can close https://github.com/docker/libnetwork/issues/382 in favor of this one until we find a better solution. Silently attaching to "bridge" and handling default routes via preference isn't going to be great imo.

dave-tucker commented 9 years ago

Note: My preference is to defer port mapping operations to the driver. We can also provide a library, used by some of the built-in drivers, for handling this via iptables/nftables/ufw/firewalld et al., so that others can re-use it if they want.

mrjana commented 9 years ago

@dave-tucker Using the driver to achieve external connectivity was the original and intended design. In fact, we have a design for how to do that for the overlay driver. It just hasn't been done yet, and we wanted to give experimental users a way to still have external connectivity while that work is in progress.

tomdee commented 9 years ago

This behavior shouldn't be triggered by the EXPOSE directive though. Per the docs at https://docs.docker.com/reference/builder/#expose no external ports are mapped unless the user passes -p/-P when doing the docker run.

So, can we get a fix where this behavior is only triggered if the user actually requests it with -p/-P?

dave-tucker commented 9 years ago

@tomdee hmmm, interesting. You are right, we should only map ports from an EXPOSE directive when -P is used.

mavenugo commented 9 years ago

@tomdee @dave-tucker Yes. I think EXPOSE and --expose should not add the container to the bridge network. That should happen only for the -P and -p options.

This is not a bug in libnetwork though. Rather, it must be addressed in docker/docker.
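To make the proposed behavior concrete, here is a minimal Go sketch of the decision. It is an illustration only, not code from docker/docker: the RunOptions type, its fields, and needsBridgeEndpoint are hypothetical names introduced for this example.

```go
package main

import "fmt"

// RunOptions is a hypothetical summary of the settings relevant to this issue.
type RunOptions struct {
	ExposedPorts []string          // from EXPOSE in the Dockerfile or --expose
	PublishAll   bool              // -P
	PortBindings map[string]string // -p, container port -> host port
}

// needsBridgeEndpoint reports whether an implicit bridge endpoint is warranted.
// ExposedPorts is deliberately ignored: exposing a port alone publishes nothing.
func needsBridgeEndpoint(o RunOptions) bool {
	return o.PublishAll || len(o.PortBindings) > 0
}

func main() {
	exposeOnly := RunOptions{ExposedPorts: []string{"8983/tcp"}}
	publishAll := RunOptions{ExposedPorts: []string{"8983/tcp"}, PublishAll: true}
	explicit := RunOptions{PortBindings: map[string]string{"8983/tcp": "8983"}}

	fmt.Println(needsBridgeEndpoint(exposeOnly)) // false
	fmt.Println(needsBridgeEndpoint(publishAll)) // true
	fmt.Println(needsBridgeEndpoint(explicit))   // true
}
```

Under this rule, an image with an EXPOSE line started without -p or -P would get only the endpoint the user asked for.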

lxpollitt commented 9 years ago

@mrjana Can you expand a bit on the planned endpoint priority feature? (I think some people might argue that IP routing already has a priority concept: longest prefix match; so why do we need something different here?) More broadly, I know that the whole "container connected to multiple networks" topic is a thorny issue, and this is just one part of it. I think it would be really helpful to have the intended design (for the container multiple networks stuff) documented somewhere so the community can engage with it and comment. Does such a document already exist that you can point me at?

mavenugo commented 9 years ago

@lxpollitt The endpoint priority feature is already in the backend. PTAL at https://github.com/docker/libnetwork/pull/212 for the changes that @mrjana added. We have to fix the UI to make use of the backend. This discussion is about the default gateway and not about routing based on longest prefix. Connecting a container to multiple networks via independent endpoints is the basis of CNM and is well documented in the original proposal, https://github.com/docker/docker/issues/9983. We can certainly add to the design document to explain more of the implementation details if required.

lxpollitt commented 9 years ago

Yes, I know that containers connected to multiple networks via independent endpoints is front and center in the CNM. I may be misremembering, but does the CNM mention anything about default gateways or priorities? More broadly I am unclear on how it is intended that traffic from the container will get routed via the correct endpoint in the general case when a container is connected to multiple networks. If this is already explained somewhere then it would be great if you could point me at it. (And apologies if I missed it; there have been a lot of PRs / issue discussions to track and I doubt I've managed to follow every one.) Any help gratefully received.

mavenugo commented 9 years ago

@lxpollitt I think we are discussing the topic of default gateway in the issue that was opened to address the EXPOSE problem. Shall we discuss this in a different issue or on IRC?

Just to make sure we wrap this discussion logically ... Needless to say, a default gateway is a pretty obvious requirement. But in the presence of multiple networks, how would we choose the default? Hence we added the priority logic, via https://github.com/docker/libnetwork/pull/212, to let the user select which network is the default. The default gateway kicks in only when there is no matching route or directly connected network. Also, thanks to @tomdee, who added #240 to insert static routes for more advanced routing.
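To illustrate the point that the default gateway is used only when nothing more specific matches, here is a minimal Go sketch of a longest-prefix-match lookup. It is a simplified model, not libnetwork or kernel code: the Route type, the route and lookup helpers, and the addresses and interface names are all hypothetical.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Route is a hypothetical routing table entry: a destination prefix and its interface.
type Route struct {
	Prefix netip.Prefix
	Iface  string
}

// route builds a Route from a CIDR string; it panics on malformed input.
func route(cidr, iface string) Route {
	return Route{Prefix: netip.MustParsePrefix(cidr), Iface: iface}
}

// lookup returns the interface of the most specific route that contains dst.
// The default route 0.0.0.0/0 has prefix length 0, so it wins only when
// no connected network or static route matches.
func lookup(routes []Route, dst string) (string, bool) {
	addr, err := netip.ParseAddr(dst)
	if err != nil {
		return "", false
	}
	best, iface := -1, ""
	for _, r := range routes {
		if r.Prefix.Contains(addr) && r.Prefix.Bits() > best {
			best, iface = r.Prefix.Bits(), r.Iface
		}
	}
	return iface, best >= 0
}

func main() {
	routes := []Route{
		route("0.0.0.0/0", "eth0"),     // default route via the first network
		route("172.17.0.0/16", "eth0"), // directly connected network
		route("172.21.0.0/16", "eth1"), // second directly connected network
	}
	for _, dst := range []string{"172.21.0.5", "8.8.8.8"} {
		iface, _ := lookup(routes, dst)
		fmt.Println(dst, "->", iface)
	}
	// 172.21.0.5 -> eth1
	// 8.8.8.8 -> eth0
}
```

In this model, the endpoint priority only decides which network owns the 0.0.0.0/0 entry; traffic to a directly connected network still leaves through that network's interface.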

lxpollitt commented 9 years ago

@mavenugo Yes, happy to discuss this in whichever forum you would like. My intention wasn't to bog down this individual PR, I was just flagging that I don't feel I have a good enough understanding of how the "container attached to multiple networks" is supposed to work (in terms of design) to comment on whether the planned endpoint priority feature is a step in the right direction or not.

hariharan16s commented 9 years ago

@mavenugo Hi Madhu, when I start the docker container, the default route is not getting created. Can you please let me know how I can create the default gateway? I am using --net=bridge to start the container.

I am a newbie to Go. I see the SetGateway func, which says it sets the default gateway (IPv4) when the container joins the endpoint. Is that what I need to use?

ajaybhatnagar commented 9 years ago

When creating a container, two interfaces get created with same IP.

root@expdocker-01:~# docker exec -it 7970652d8512 bash
root@7970652d8512:/es# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:01 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:1/64 scope link
       valid_lft forever preferred_lft forever
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:a0:11:7b:a7 brd ff:ff:ff:ff:ff:ff
    inet 172.21.0.1/16 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::42:a0ff:fe11:7ba7/64 scope link
       valid_lft forever preferred_lft forever

GordonTheTurtle commented 7 years ago

@tomdee It has been detected that this issue has not received any activity in over 6 months. Can you please let us know if it is still relevant:

Thank you! This issue will be automatically closed in 1 week unless it is commented on. For more information please refer to https://github.com/docker/libnetwork/issues/1926