it looks like https://github.com/weaveworks/weave/issues/1245 explains what's going on here. Which leaves me with a question about using weave for load balancing across containers on the same host. Say I have 3 containers for service1 on the same host. On that same host, I have a service2 container which makes http requests to http://service1.weave.local/data. Based on the info in the aforementioned issue, depending on the application (using curl vs netcat vs nodejs http module, etc.) there's a good chance that all calls would resolve to the same IP address. In which case it seems pointless to even have extra instances of the container running on the same host. Is that a correct assessment? Is there a recommended way to achieve consistent load balancing across containers on the same host? Maybe @2opremio would have some advice for me?
@bdentino Thanks for bringing this up! @errordeveloper will take care of this one.
EDIT: my analysis in this comment is wrong, please read the comments from @awh below for an explanation on why load balancing is biased.
@bdentino I will take the questions from your comment (@errordeveloper will take care of the guide-related part)
Say I have 3 containers for service1 on the same host. On that same host, I have a service2 container which makes http requests to http://service1.weave.local/data. Based on the info in aforementioned issue, depending on the application (using curl vs netcat vs nodejs http module, etc) there's a good chance that all calls would resolve to the same IP address. In which case it seems pointless to even have extra instances of the container running on the same host. Is that a correct assessment?
I have tested this scenario and your analysis does not seem to be correct. When using getaddrinfo() to resolve the server (which is the most common scenario in Unix systems nowadays), load balancing will span across all the server containers if they all live in the same host and they are different from the client container.
However, in the following two cases: (1) the server containers do not all live on the same host as the client, or (2) the client container is itself one of the server containers, getaddrinfo() will favor local addresses (server containers in our case). For (1) those will be the server containers placed on the same machine as the client; for (2) it will be the client container itself.
In (1) it makes perfect sense to favor local containers; case (2) is questionable, though.
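A quick way to see the order getaddrinfo() ends up returning is getent ahosts, which resolves through getaddrinfo(); here is a minimal check, assuming the foo hostname that is registered in the transcript below:

# Minimal check (assumes the weave proxy environment and the "foo" hostname
# registered below): getent ahosts resolves through getaddrinfo(), so the
# order of its STREAM entries is the order a glibc client will try.
docker run --rm -ti ubuntu getent ahosts foo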
I have tested the scenario you mentioned: 3 server containers and a client (different to the servers) living in the same host, and load balancing spans across all server containers.
Note that I am using the ubuntu:latest image, which currently includes nc.openbsd, which uses getaddrinfo().
I spawn 3 server containers on the same host, associated with the hostname foo, and make sure that everything works as expected:
vagrant@vagrant-ubuntu-vivid-64:~$ weave launch-router
c626e9130caf897993359481eca32f1283cd4d381846f6d881ad6cbf37e49558
vagrant@vagrant-ubuntu-vivid-64:~$ weave launch-proxy --hostname-match '([^-]+)-[0-9]+'
9188a6ed31f8dd5d69ea8a784815f9b558331c069e1a3988403bfc9260484be5
vagrant@vagrant-ubuntu-vivid-64:~$ eval $(weave env)
vagrant@vagrant-ubuntu-vivid-64:~$ docker run -d --name=foo-1 ubuntu sh -c 'while true; do echo "I am foo-1" | nc -l -p 4567 ; done'
17181a71b42c988ffbcb8408eb29d6b3bc48497ccb6a3817d1856b1ab8dcebc4
vagrant@vagrant-ubuntu-vivid-64:~$ docker run -d --name=foo-2 ubuntu sh -c 'while true; do echo "I am foo-2" | nc -l -p 4567 ; done'
b6af3e56a31e91ad6e451dcc86a8f04417ccc965a43c5a4d5e0035cdc3f7546d
vagrant@vagrant-ubuntu-vivid-64:~$ docker run -d --name=foo-3 ubuntu sh -c 'while true; do echo "I am foo-3" | nc -l -p 4567 ; done'
vagrant@vagrant-ubuntu-vivid-64:~$ weave status dns
foo 10.32.0.1 17181a71b42c a2:ef:27:ed:41:1d
foo 10.32.0.2 b6af3e56a31e a2:ef:27:ed:41:1d
foo 10.32.0.3 cccfc2af1b4b a2:ef:27:ed:41:1d
vagrant@vagrant-ubuntu-vivid-64:~$ weave dns-lookup foo
10.32.0.3
10.32.0.2
10.32.0.1
vagrant@vagrant-ubuntu-vivid-64:~$ docker run --rm -ti ubuntu nc 10.32.0.1 4567
I am foo-1
vagrant@vagrant-ubuntu-vivid-64:~$ docker run --rm -ti ubuntu nc 10.32.0.2 4567
I am foo-2
vagrant@vagrant-ubuntu-vivid-64:~$ docker run --rm -ti ubuntu nc 10.32.0.3 4567
I am foo-3
I launch a client container which connects to the foo hostname several times, reaching all 3 servers:
vagrant@vagrant-ubuntu-vivid-64:~$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-3
I am foo-1
I am foo-2
I am foo-3
I am foo-2
I am foo-1
I am foo-3
I am foo-3
I am foo-1
I am foo-2
Is there a recommended way to achieve consistent load balancing across containers on the same host?
It should just work as I demonstrated above :)
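If the services were actual HTTP servers, as in your original question, the equivalent experiment could be run with curl; here is a rough sketch on a fresh host, assuming hypothetical python3-based servers and installing curl into the stock ubuntu image (neither of which is part of the transcript above):

# Hypothetical HTTP variant of the experiment above, run on a fresh host with
# the same weave proxy and --hostname-match setup: three HTTP servers under
# the hostname foo, and a client that curls the name repeatedly. curl resolves
# through glibc's getaddrinfo() unless it was built against c-ares.
for i in 1 2 3; do
  docker run -d --name=foo-$i ubuntu bash -c "apt-get -qq update && apt-get -qq install -y python3 >/dev/null && mkdir /www && echo 'I am foo-$i' > /www/data && cd /www && python3 -m http.server 80"
done
docker run --rm -ti ubuntu bash -c 'apt-get -qq update && apt-get -qq install -y curl >/dev/null && for _ in $(seq 10); do curl -s http://foo/data; done'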
wow thanks for the detailed response! I think you're correct, and I'm not exactly sure why I was seeing the behavior I did yesterday, but I'm able to reproduce it by taking your example one step further:
Following your steps, everything works great on my host with 3 service containers:
vagrant@weave-gs-01:~$ weave launch-router
feaf0b0b69ba6973a744bf1a8f935c6a5bd5909ee394a25203e2d79bf9d5aa58
vagrant@weave-gs-01:~$ weave launch-proxy --hostname-match '([^-]+)-[0-9]+'
4ae27a3b47e5ea0608139f57d9bcb807bcc3c82406995232bde440804e80d37d
vagrant@weave-gs-01:~$ eval $(weave env)
vagrant@weave-gs-01:~$ docker run -d --name=foo-1 ubuntu sh -c 'while true; do echo "I am foo-1"
> ^C
vagrant@weave-gs-01:~$ docker run -d --name=foo-1 ubuntu sh -c 'while true; do echo "I am foo-1" | nc -l -p 4567 ; done'
15fe62e49fea3246ad6b6894febae845da57ee8baaebbb4c036fcfd73eada907
vagrant@weave-gs-01:~$ docker run -d --name=foo-2 ubuntu sh -c 'while true; do echo "I am foo-2" | nc -l -p 4567 ; done'
08382422e15fd145570fcd66ffa2df52b8a4c848f6c6666d900907f7a5262424
vagrant@weave-gs-01:~$ docker run -d --name=foo-3 ubuntu sh -c 'while true; do echo "I am foo-3" | nc -l -p 4567 ; done'
9a21bb9dc4e5025fd76da7c8ba0e225f426d7c88a347d75ac267b994d2442c33
vagrant@weave-gs-01:~$ weave status dns
foo 10.32.0.2 08382422e15f de:78:13:c0:c1:36
foo 10.32.0.1 15fe62e49fea de:78:13:c0:c1:36
foo 10.32.0.3 9a21bb9dc4e5 de:78:13:c0:c1:36
vagrant@weave-gs-01:~$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-1
I am foo-1
I am foo-3
I am foo-1
I am foo-1
I am foo-1
I am foo-3
I am foo-3
I am foo-3
I am foo-2
However, when I add one more foo service, all client calls are directed to the most recent, and the previous three are ignored.
vagrant@weave-gs-01:~$ docker run -d --name=foo-4 ubuntu sh -c 'while true; do echo "I am foo-4" | nc -l -p 4567 ; done'
c8f24db9be221653f4fc03a4440fda17bf0b3a14b6f3a824af7aeab2f222c732
vagrant@weave-gs-01:~$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
vagrant@weave-gs-01:~$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
Would you know how to explain this behavior? My understanding of dns and ip routing is admittedly limited.
I guess I should first ask if you're actually able to reproduce it :stuck_out_tongue_winking_eye:
@bdentino I can reproduce it (with weave/master for the record). Uhm, let me look into this and get back to you.
In the meantime, @rade / @tomwilkie Do you have any idea of why this is happening? AFAIU getaddrinfo() shouldn't care about the number of records if they are all co-located in the same host.
My first attempts to reproduce with four containers:
vagrant@vagrant-ubuntu-vivid-64:~/weave$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-2
I am foo-1
I am foo-3
vagrant@vagrant-ubuntu-vivid-64:~/weave$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-3
I am foo-3
I am foo-3
vagrant@vagrant-ubuntu-vivid-64:~/weave$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-2
I am foo-2
I am foo-2
I am foo-3
I am foo-4
vagrant@vagrant-ubuntu-vivid-64:~/weave$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc foo 4567; done'
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-4
I am foo-2
I am foo-2
I am foo-3
Will continue to investigate.
First, an explanation of why I am seeing containers other than foo-4; here's the test again with -v:
$ docker run --rm -ti ubuntu bash -c 'for _ in `seq 10`; do nc -v foo 4567; done'
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-4
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-4
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-4
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-4
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-4
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-4
nc: connect to foo port 4567 (tcp) failed: Connection refused
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-3
nc: connect to foo port 4567 (tcp) failed: Connection refused
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-1
nc: connect to foo port 4567 (tcp) failed: Connection refused
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-3
nc: connect to foo port 4567 (tcp) failed: Connection refused
Connection to foo 4567 port [tcp/*] succeeded!
I am foo-3
In other words, nc is trying to connect to foo-4 each time, as in your example runs, but on some occasions the netcat in that container hasn't been respawned by the while loop yet; in this case, netcat skips to the next address in the list. So the question remains: what is special about the fourth container? As a refresher, here are the allocated addresses:
$ weave status dns
foo 10.32.0.1 413d2f135742 5a:63:a2:3f:bf:00
foo 10.32.0.4 9445e536a42d 5a:63:a2:3f:bf:00
foo 10.32.0.2 9c50beb6fded 5a:63:a2:3f:bf:00
foo 10.32.0.3 d31f09fc75ff 5a:63:a2:3f:bf:00
And here is the address allocated to the next container we run, e.g. the one we're using as a client:
$ docker run --rm -ti ubuntu ip -4 addr show ethwe
279: ethwe: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1410 qdisc pfifo_fast state UP group default qlen 1000
inet 10.32.0.5/12 scope global ethwe
valid_lft forever preferred_lft forever
Now, if we refer to the RFC3484 section on destination address selection, we find the following rule:
Rule 9: Use longest matching prefix.
When DA and DB belong to the same address family (both are IPv6 or
both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
CommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if
CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
then prefer DB.
which says that for each possible destination address (10.32.0.1-10.32.0.4 in our case) we determine the source address that would be used to reach it (always 10.32.0.5 in our case), compute the common prefix length between source and destination, and prefer longer matches. Here are the addresses in binary:
10.32.0.1 00001010 00100000 00000000 00000001
10.32.0.2 00001010 00100000 00000000 00000010
10.32.0.3 00001010 00100000 00000000 00000011
10.32.0.4 00001010 00100000 00000000 00000100
10.32.0.5 00001010 00100000 00000000 00000101
From this table we can see that 10.32.0.5 and 10.32.0.4 have a common prefix length of 31 bits, whereas the common prefix of 10.32.0.5 and 10.32.0.1-10.32.0.3 is only 29 bits; consequently, 10.32.0.4 is always preferred. In the situation where we have only three named service containers, the client container is allocated 10.32.0.4, and as that has an equal common prefix length of 29 bits with each of 10.32.0.1-10.32.0.3, access to them is distributed randomly as expected.
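To make rule 9 concrete, here is a small bash sketch (not part of the original analysis) that computes CommonPrefixLen for the addresses above; it should print 29 bits for 10.32.0.1-10.32.0.3 and 31 bits for 10.32.0.4:

# Compute the common prefix length (in bits) between the client address
# 10.32.0.5 and each candidate destination, as RFC 3484 rule 9 uses it.
ip2int() { set -- $(echo "$1" | tr . ' '); echo $(( $1*16777216 + $2*65536 + $3*256 + $4 )); }
cpl() {
  local x=$(( $(ip2int "$1") ^ $(ip2int "$2") )) len=32
  while (( x > 0 )); do (( x >>= 1, len-- )); done
  echo "$len"
}
for dst in 10.32.0.1 10.32.0.2 10.32.0.3 10.32.0.4; do
  echo "CommonPrefixLen(10.32.0.5, $dst) = $(cpl 10.32.0.5 $dst) bits"
done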
In a nutshell: my analysis was wrong, and a deeper look at RFC3484 (which @awh did) confirms that even with all client and server containers on the same host, getaddrinfo() can cause a single server to be preferred over the others, making Weave's current client-side, DNS-based load balancing biased.
And ... there's not much we can do about it. @bdentino Sorry we don't have better news; I will close this as a duplicate of weaveworks/weave#1245, as you originally suspected.
I think we should update the guide with at least a note stating what will happen if you use a glibc resolver. And maybe try to find a curl with a different resolver so it will work as expected.
Weave Loadbalance[1] ... [1] does not balance load
http://daniel.haxx.se/blog/2012/01/03/getaddrinfo-with-round-robin-dns-and-happy-eyeballs/ says that curl compiled with c-ares Does The Right Thing.
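As a quick check, curl --version names c-ares on its first line when it was built against it; a plain glibc-resolver build won't mention it:

# Inspect the resolver backend of the local curl binary; a c-ares build lists
# something like "c-ares/1.x" alongside libcurl on the first line.
curl --version | head -1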
My reading of the Musl source suggests it does not apply RFC3484 sorting to IPv4 addresses.
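One way to check that empirically, assuming the weave proxy environment and the foo servers from the transcripts above, and that Alpine's busybox nc resolves through musl's getaddrinfo():

# Repeat the client loop from a musl-based alpine image; if musl really skips
# the RFC 3484 sort for IPv4, the replies should spread across foo-1..foo-4.
docker run --rm -ti alpine sh -c 'for _ in $(seq 10); do nc foo 4567; done'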
Another workaround would be to use a glibc which respects RFC 6724. See https://github.com/weaveworks/weave/issues/1245#issuecomment-158971602
A real-world solution (considering the C libraries used out there) would be to make weavewait tweak gai.conf in the spawned containers, to guarantee that load balancing is uniform for IPv4, but this is really messy.
Is there such a glibc?
I couldn't see how gai.conf could solve our problem, at least if we aren't continuously updating it. How would you do that?
Is there such a glibc?
I don't know
I couldn't see how gai.conf could solve our problem, at least if we aren't continuously updating it. How would you do that?
We would need to update it. That's why I said it's messy, just like Docker maintains /etc/hosts :S
Docker only updates /etc/hosts every time a container starts or stops; I couldn't see how to get anything out of gai.conf unless you continue to update it and tweak the precedence rules continuously.
Also note the warning that getting processes to re-read gai.conf "is generally a bad idea".
I'm having issues getting the weave-loadbalance guide to work. Following the latest version of the guides, I believe I've set everything up appropriately, but when I curl the hostname from the ubuntu container, all requests resolve to the same IP. Is this normal? It seems that all the containers are visible to weave: dig within the ubuntu container sees all hosts, and I can ping each IP address individually. Some relevant status info on each host:
TBH, I'm not sure if there's actually a problem here. It just seems suspicious that it resolves the same IP every time, so I'm hoping someone can shed some light on why this might be happening or whether I've misconfigured something.