weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

command to wait for connection establishment #1267

Open rade opened 9 years ago

rade commented 9 years ago

Weave connection establishment is asynchronous. This can be problematic in testing, and presumably also in production scripting. For example,

host1:~$ weave launch
host1:~$ eval $(weave env)
host1:~$ docker run --name=foo -dit gliderlabs/alpine /bin/sh
host2:~$ weave launch $HOST1
host2:~$ eval $(weave env)
host2:~$ docker run --rm gliderlabs/alpine ping -c 1 foo

isn't guaranteed to succeed because the connection between the two routers may not be fully established by the time the 'ping' runs.

I suggest we introduce a command, perhaps 'weave wait', that waits for all connections to become established.

The challenge is figuring out what exactly we mean by that...

Strawman: The local peer is not attempting to establish any connections for the first time, and all its connections are 'established'.
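
To make the strawman concrete, here is a minimal sketch of it as a shell predicate. It is only an illustration: the input format (one "<state> <peer>" line per connection or connection attempt) and the state names 'connecting', 'pending', 'retrying' and 'established' are hypothetical, invented for this example, and are not the output of any existing weave command.

```shell
#!/bin/sh
# connections_ready: reads hypothetical "<state> <peer>" lines on stdin
# and exits 0 only if the strawman condition holds.
# Assumed (invented) states:
#   connecting  - first-ever attempt to reach this peer, still in progress
#   pending     - a connection exists but is not yet established
#   retrying    - a repeat attempt after an earlier failure (tolerated)
#   established - connection fully established
connections_ready() {
    awk '
        BEGIN { rc = 0 }
        $1 == "connecting" { rc = 1 }  # a first-time attempt is outstanding
        $1 == "pending"    { rc = 1 }  # a connection is not yet established
        END { exit rc }
    '
}
```

Note that under this reading a peer with no connections at all counts as ready, and a peer that is only retrying a previously failed connection does too. Whether either of those is desirable is part of the open question.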

dpw commented 9 years ago

> This can be problematic in testing, and presumably also production scripting.

It's tangential, but I don't think the feature you are suggesting is what's needed in a production setting. There, it would be unusual for a client container to really want to wait for the network to become fully established (which, as you note, can be hard to define). Rather, it wants to wait until particular services are started and ready to receive requests/connections. In a DNS setting, the waiting part can be accomplished by polling. The "updating DNS when a service is ready" part might not be so straightforward.
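
The polling half could look something like the sketch below. The function name wait_for_name and the RESOLVE_CMD override are hypothetical, introduced only for this example; the default lookup uses "getent hosts", which is assumed to be available in the client container.

```shell
#!/bin/sh
# wait_for_name NAME MAX_TRIES
# Polls about once per second until NAME resolves, giving up after
# MAX_TRIES failed lookups. RESOLVE_CMD is a hypothetical hook for
# swapping in a different lookup command; it defaults to "getent hosts".
wait_for_name() {
    name=$1
    limit=$2
    tries=0
    until ${RESOLVE_CMD:-getent hosts} "$name" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$limit" ]; then
            echo "gave up waiting for $name" >&2
            return 1
        fi
        sleep 1
    done
    return 0
}
```

A client would then run something like `wait_for_name foo 30 && ping -c 1 foo`. This only shows that the name has been registered; it says nothing about whether the service behind it is ready, which is the harder half of the problem.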

In my experience, this kind of thing makes a big difference when trying to run (what would now be called) microservices-based systems in production. One alternative is to say "the components have to be started in a certain order"; but that becomes tricky when you want to restart only a subset of components (in order to fix an issue or deploy a bug fix). Another alternative is to say "all clients should tolerate failed requests, and retry them until the server responds". But then you have to distinguish between those transient expected errors and errors that mean that a client is having real problems talking to a server. A positive indication that the server is in fact ready is a much more robust approach.