moby / libnetwork

networking for containers

Creating networks via docker #150

Closed. squaremo closed this issue 9 years ago

squaremo commented 9 years ago

There's not much point in having the whole driver subsystem unless one can actually create a network. I don't see that as part of docker/docker#13060.

If it is a requirement not to add new commands to docker (e.g., docker network create), then perhaps it can be part of the syntax of the --net argument. For example,

docker run --net=weave:mynet

where weave refers to the driver, and mynet is a network created with that driver (if it doesn't already exist).
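As an illustration only, here is a minimal Go sketch of how such a --net value could be split into a driver name and a network name; the parseNet helper below is hypothetical and not part of docker or libnetwork.

package main

import (
	"fmt"
	"strings"
)

// parseNet splits a hypothetical --net value of the form "driver:network".
// A value with no driver prefix is treated as a network on the default driver.
func parseNet(value string) (driver, network string) {
	if parts := strings.SplitN(value, ":", 2); len(parts) == 2 {
		return parts[0], parts[1]
	}
	return "default", value
}

func main() {
	driver, network := parseNet("weave:mynet")
	fmt.Println(driver, network) // prints: weave mynet
}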

mrjana commented 9 years ago

@squaremo docker/docker#13060 is only the first of a set of PRs that will be made there, so the short answer is: it's coming. The idea is to implement the majority of the CLI handlers and remote API handlers in libnetwork itself and hook it up to docker core with a small function. Introducing a new UI or remote API in docker needs more discussion, which is why the goal of docker/docker#13060 has been limited to modularizing the networking code out of docker and providing a clean interface to it for docker, or anybody else, to use.

BTW, for the initial implementation of the CLI handlers, take a look at the libnetwork/client code.

mavenugo commented 9 years ago

Also, we are integrating libnetwork/client with libnetwork/api as we speak, and we will have a dnet tool that makes use of these. As @mrjana suggested, the docker CLI integration will follow soon after https://github.com/docker/docker/pull/13060 is merged.

squaremo commented 9 years ago

Another requirement: the ability to attach more than one network in the same invocation is quite important. Typically one interface will provide general internet access, and one will be the "cluster" network (e.g., weave).

dave-tucker commented 9 years ago

@squaremo so assuming weave1 is a network created using the weave driver, you'd want --net=bridge --net=weave1?

squaremo commented 9 years ago

so assuming weave1 is a network created using the weave driver, you'd want --net=bridge --net=weave1?

Yes exactly. One wrinkle is that typically we'd want the bridge driver to set the default gateway, but the weave driver to provide (or at least influence) the /etc/resolv.conf.
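To make that wrinkle concrete, here is a rough Go sketch of one way per-network results could be merged, with the first driver offering a gateway winning the default route and every driver allowed to contribute DNS servers; the EndpointConfig type and merge function are hypothetical, not libnetwork code.

package main

import "fmt"

// EndpointConfig is a hypothetical summary of what a driver returns for one
// endpoint: an optional default gateway and optional DNS servers.
type EndpointConfig struct {
	Driver     string
	Gateway    string   // empty if the driver does not offer one
	DNSServers []string // empty if the driver does not offer any
}

// merge picks the first gateway offered and concatenates DNS servers, so one
// driver (e.g. bridge) can win the default route while another (e.g. weave)
// supplies the /etc/resolv.conf entries.
func merge(endpoints []EndpointConfig) (gateway string, dns []string) {
	for _, ep := range endpoints {
		if gateway == "" && ep.Gateway != "" {
			gateway = ep.Gateway
		}
		dns = append(dns, ep.DNSServers...)
	}
	return gateway, dns
}

func main() {
	gw, dns := merge([]EndpointConfig{
		{Driver: "bridge", Gateway: "172.17.0.1"},
		{Driver: "weave", DNSServers: []string{"10.32.0.2"}},
	})
	fmt.Println(gw, dns) // prints: 172.17.0.1 [10.32.0.2]
}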

dave-tucker commented 9 years ago

Thanks @squaremo, @mavenugo this seems like a sensible requirement. Are multiple occurrences of the --net flag going to be supported in the integration PR mentioned above? If not, can we add them?

mavenugo commented 9 years ago

@dave-tucker @squaremo no. I will be pushing https://github.com/mavenugo/docker/commit/c0c7f372375b3bd112c6e0aaa28001f495273782 shortly. Based on the discussions I followed for volumes, the idea is to introduce --network-driver instead of overloading the --net string. We need to find a way to work within these constraints.

squaremo commented 9 years ago

@mavenugo Does --network-driver coexist with --net; i.e., could one use

docker run --net=bridge --network-driver=weave -ti ubuntu

and expect a bridge interface as well as a weave interface?

dave-tucker commented 9 years ago

@mavenugo right, I'm not talking about overloading --net here; I'm talking about supporting multiple occurrences of --net, assuming that a network has already been created using the docker network create CLI.

shettyg commented 9 years ago

If there are going to be multiple --net flags, is there an idea on how to pass network labels per --net?

mavenugo commented 9 years ago

@dave-tucker @squaremo @shettyg All are valid and reasonable questions. There are a few trade-offs to be made between simple, consistent UI & functionality. I will get back to you all later today.

dave-tucker commented 9 years ago

It hasn't been discussed here yet, but my expectation is that labels are namespaced per the docs

docker run -it --net=bridge --net=weave1 --net=vmware1 -l works.weave.foo=bar -l com.vmware.baz=quux debian:jessie

A driver should get passed all labels, but only respond to those in its namespace.
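A small Go sketch of that filtering rule, assuming labels arrive as a flat map and each driver owns a namespace prefix such as works.weave.; the helper below is hypothetical.

package main

import (
	"fmt"
	"strings"
)

// labelsForNamespace returns only the labels under a driver's namespace,
// e.g. "works.weave." for the weave driver. A driver would receive the full
// label map and ignore everything outside its prefix.
func labelsForNamespace(labels map[string]string, prefix string) map[string]string {
	mine := map[string]string{}
	for k, v := range labels {
		if strings.HasPrefix(k, prefix) {
			mine[k] = v
		}
	}
	return mine
}

func main() {
	labels := map[string]string{
		"works.weave.foo": "bar",
		"com.vmware.baz":  "quux",
	}
	fmt.Println(labelsForNamespace(labels, "works.weave.")) // map[works.weave.foo:bar]
}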

squaremo commented 9 years ago

It hasn't been discussed here yet, but my expectation is that labels are namespaced per the docs

docker run -it --net=bridge --net=weave1 --net=vmware1 -l works.weave.foo=bar -l com.vmware.baz=quux debian:jessie

A driver should get passed all labels, but only respond to those in its namespace.

Ah, but the label namespaces correspond to the driver name, rather than the network name. So if a container has two endpoints provided by the same driver, how will that driver know which labels apply to which endpoint?

dave-tucker commented 9 years ago

Prefix the label with the network name? E.g. works.weave.weave1.foo
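Continuing the sketch above under the same assumptions, a label key could carry the network name as an extra segment, so a driver handling several endpoints can route each label to the right one; splitNetworkLabel is again hypothetical.

package main

import (
	"fmt"
	"strings"
)

// splitNetworkLabel interprets a label key of the hypothetical form
// "<driver namespace>.<network>.<key>", e.g. "works.weave.weave1.foo",
// returning the network name and the remaining key.
func splitNetworkLabel(key, namespace string) (network, rest string, ok bool) {
	if !strings.HasPrefix(key, namespace+".") {
		return "", "", false
	}
	parts := strings.SplitN(strings.TrimPrefix(key, namespace+"."), ".", 2)
	if len(parts) != 2 {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	network, key, _ := splitNetworkLabel("works.weave.weave1.foo", "works.weave")
	fmt.Println(network, key) // prints: weave1 foo
}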

mrjana commented 9 years ago

@squaremo @dave-tucker @mavenugo Is it very important to support multiple networks in the docker run command? docker run can be used to join the initial network, and there are always going to be network service join commands to join additional endpoints after the initial join. I know this is racy to some extent if the application wants to communicate on the "cluster" network immediately, but we are probably risking introducing something hastily here. Please let me know what you all think.

Also, the labels in the docker run command are always going to be considered container labels and are given to the driver only on joins.

shettyg commented 9 years ago

There are always going to be network service join commands to join additional endpoints after the initial join. I know this is racy to some extent if the application wants to communicate in the "cluster" network immediately but then we are probably risking introducing something hastily here.

@mrjana, the raciness may be important IMO. Many badly written applications will simply fail when they can't reach their peers. So you are effectively expecting applications to be written such that any attempt to reach a peer has retry logic.

mrjana commented 9 years ago

@shettyg I am not saying solving the raciness is not important. But it is much more important to get the UI right; otherwise it is very difficult to revert.

bboreham commented 9 years ago

@shettyg I think the race condition manifests in ways that are worse than "simply fail". E.g. if your container has a service that listens on 0.0.0.0, it will listen on all interfaces that are active at the time you do the listen; it will not add interfaces that are added later.

squaremo commented 9 years ago

I know this is racy to some extent if the application wants to communicate in the "cluster" network immediately

It's already the case that you can add a weave interface to a container after the fact; avoiding the extra command (and the race it entails) is the primary motivation for developing a plugin.

Weave needs containers to have an interface on the bridge network as well, and for this not to be racy or require extra commands, since that interface is used to provide name resolution.

shettyg commented 9 years ago

Since endpoint creation has been separated out, would something like the following not work for anyone?

With the UUID returned by endpoint creation (similar syntax to --net=container:containerid):

Such an invocation would only call join() and skip createendpoint().

squaremo commented 9 years ago

@shettyg There is an important difference between an endpoint that is created during docker run and an endpoint created ad-hoc; the former is garbage collected when the container stops, but the latter will hang around until deleted explicitly. That latter behaviour requires much more diligence on the part of the user, especially since it won't necessarily be obvious when those endpoints can reasonably be deleted.

shettyg commented 9 years ago

@squaremo I see what you mean. I wonder whether we can do both. For example, one could do: docker run --net=bridge --net=driver1:mynet

In the above case, you would call createendpoint() and join()

Alternatively, docker run --net=bridge --endpoint=driver1:uuid

In the above case, you would only call join(). My suggestion likely won't fly, though, because it introduces a new docker CLI option, '--endpoint'.

(According to @mrjana, the labels provided to createendpoint() and join() are different, so that is another constraint this has to work with.)
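To summarise the two call patterns being weighed here, a rough Go sketch follows; the Network and Endpoint types and their methods are stand-ins that mirror the createendpoint()/join() split discussed in this thread, not the real libnetwork API.

package main

import "fmt"

// Endpoint and Network are hypothetical stand-ins for the libnetwork objects
// being discussed.
type Endpoint struct{ ID string }

type Network struct{ Name string }

func (n *Network) CreateEndpoint(name string) *Endpoint {
	return &Endpoint{ID: name + "-uuid"}
}

func (e *Endpoint) Join(containerID string) {
	fmt.Printf("joined endpoint %s to container %s\n", e.ID, containerID)
}

func main() {
	net := &Network{Name: "mynet"}

	// --net=driver1:mynet: create the endpoint, then join it; the endpoint's
	// lifetime is tied to the container and can be cleaned up when it stops.
	ep := net.CreateEndpoint("web")
	ep.Join("container-1")

	// --endpoint=driver1:uuid: the endpoint was created out of band, so only
	// join() runs here and the endpoint outlives the container.
	pre := &Endpoint{ID: "uuid-1234"}
	pre.Join("container-2")
}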

squaremo commented 9 years ago

I believe https://github.com/docker/docker/pull/13441 will address this issue, in part, but it seems caught up in bikeshedding :(

shettyg commented 9 years ago

It looks like Solomon's 'docker run --publish-service db.work-dev' is similar to my 'docker run --endpoint=driver1:uuid'

tomdee commented 9 years ago

@mavenugo @dave-tucker now that this is tracked under https://github.com/docker/docker/issues/14593, can this issue be closed?

dave-tucker commented 9 years ago

Thanks @tomdee, closing in favour of docker/docker#14593