Open dpw opened 9 years ago
When I run weave I don't see the peers come in on the Docker bridge address. Maybe you are running Docker in some nonstandard configuration?
what is the point of other peers trying to connect to those addresses, even with the normalized port number?
Consider an existing weave network of A and B. Now a new peer C comes along. Due to firewall restrictions, it can connect to A but not B. But B can connect to C. With the above strategy, B will learn about the inbound connection on A from C (through gossip from A) and establish a connection to C.
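The strategy described above can be sketched as a toy model. This is only an illustration of the idea, not weave's actual code: the function name `connection_targets`, the tuple layout of the gossip entries, and the 10.0.0.x addresses are all hypothetical. The only value taken from this thread is the standard port, 6783.

```python
# Toy model of gossip-driven connection discovery; not weave's actual code.
WEAVE_PORT = 6783  # weave's standard port, per the DNAT rules quoted in this thread


def connection_targets(me, gossip):
    """Return (peer, "ip:port") pairs that `me` should try to dial.

    `gossip` is a list of (reporter, remote_peer, remote_ip, remote_port)
    tuples, each describing an inbound connection seen by `reporter`.
    """
    targets = set()
    for reporter, remote_peer, remote_ip, _remote_port in gossip:
        if me in (reporter, remote_peer):
            continue  # `me` is already one end of this connection
        # The reported port is the ephemeral source port, so it is replaced
        # with the standard port before dialling.
        targets.add((remote_peer, f"{remote_ip}:{WEAVE_PORT}"))
    return targets


# Hypothetical addresses: A has accepted inbound connections from B and C.
gossip_from_a = [
    ("A", "B", "10.0.0.2", 40112),
    ("A", "C", "10.0.0.3", 51234),
]
b_targets = connection_targets("B", gossip_from_a)
```

Here B ends up with C's address on the standard port as its only target, which is the connection that C could not establish itself because of the firewall.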
When I run weave I don't see the peers come in on the Docker bridge address. Maybe you are running Docker in some nonstandard configuration?
Not as far as I know. Weave is straight from master, with one host being my laptop host OS (fedora), and the two others being VMs (one fedora, one ubuntu). All on docker 1.5, connecting over a linux bridge.
Correction: I have docker 1.4.1 on the host host; the VMs have 1.5. And the host host is the only one that shows the docker bridge IPs on incoming connections. I haven't established whether these things are connected.
OK, I now understand how the source addresses of incoming connections on my host are rewritten to the IP address of the docker bridge. It's due to the interaction of docker's iptables rules with the libvirt iptables rules that do NATting for the VMs.
I have my libvirt VMs configured to use a NATted bridge on 192.168.122.0/24, which involves these iptables rules:
# iptables -t nat -L -n
[...]
Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
[...]
MASQUERADE tcp -- 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
MASQUERADE udp -- 192.168.122.0/24 !192.168.122.0/24 masq ports: 1024-65535
[...]
Plus I have the following rules introduced by docker for the weave container:
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
DOCKER all -- 0.0.0.0/0 !127.0.0.0/8 ADDRTYPE match dst-type LOCAL
[...]
Chain DOCKER (2 references)
target prot opt source destination
DNAT tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:6783 to:172.17.0.35:6783
DNAT udp -- 0.0.0.0/0 0.0.0.0/0 udp dpt:6783 to:172.17.0.35:6783
When I do weave launch 192.168.122.1 in a VM, and tcpdump on the virbr0 bridge, I see the TCP connection initiated with a packet like:
11:25:33.916115 IP 192.168.122.108.59976 > 192.168.122.1.6783: Flags [S], seq 3506001454, win 29200, options [mss 1460,sackOK,TS val 439645 ecr 0,nop,wscale 7], length 0
The rules on the DOCKER chain then rewrite the destination address to 172.17.0.35.
Then the POSTROUTING rules run, and now match the packet, because its destination is no longer within 192.168.122.0/24. From iptables-extensions(8): "Masquerading is equivalent to specifying a [SNAT] mapping to the IP address of the interface the packet is going out", which in this case is docker0. So the source address is rewritten to 172.17.42.1. And so the packet on the docker0 bridge becomes:
11:39:06.082342 IP 172.17.42.1.45969 > 172.17.0.35.6783: Flags [S], seq 2825954753, win 29200, options [mss 1460,sackOK,TS val 642686 ecr 0,nop,wscale 7], length 0
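To make the two rewriting steps concrete, here is a minimal sketch that replays them on the SYN packet above. It is a simplified model, not a reimplementation of netfilter: it ignores the ADDRTYPE match on the DOCKER jump and the source port rewriting, and the `Packet` class and function names are made up for illustration. The addresses and the port come from the rules and tcpdump output in this comment.

```python
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    dport: int


LIBVIRT_NET = ipaddress.ip_network("192.168.122.0/24")
WEAVE_CONTAINER = "172.17.0.35"  # DNAT target from the DOCKER chain
DOCKER0_ADDR = "172.17.42.1"     # address of the outgoing interface, docker0


def docker_dnat(pkt):
    """DOCKER chain: tcp dpt:6783 to:172.17.0.35:6783."""
    if pkt.dport == 6783:
        return replace(pkt, dst=WEAVE_CONTAINER)
    return pkt


def libvirt_masquerade(pkt, out_iface_addr):
    """POSTROUTING: source in 192.168.122.0/24, destination outside it."""
    src_inside = ipaddress.ip_address(pkt.src) in LIBVIRT_NET
    dst_inside = ipaddress.ip_address(pkt.dst) in LIBVIRT_NET
    if src_inside and not dst_inside:
        return replace(pkt, src=out_iface_addr)
    return pkt


syn = Packet(src="192.168.122.108", dst="192.168.122.1", dport=6783)
# Without the DNAT step the masquerade rule does not match.
untouched = libvirt_masquerade(syn, DOCKER0_ADDR)
# With DNAT applied first, the destination leaves the libvirt subnet,
# so the masquerade rule matches and rewrites the source.
rewritten = libvirt_masquerade(docker_dnat(syn), DOCKER0_ADDR)
```

In this model `untouched` is identical to `syn`, while `rewritten` has source 172.17.42.1 and destination 172.17.0.35, matching the packet seen on docker0.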
@dpw what, if anything, is left to do here?
At the very least, it would be good to have a full account of weave's connection behaviour and its rationale somewhere, because I am struggling to derive it from the code and github issues.
I second this sentiment - I'll have a stab at producing such an account.
So, the reason I had left this open was as a reminder to decide whether the crazy address rewriting that caused my confusion is a libvirt issue or a docker issue, and to report it, possibly with a fix. But that's a fairly low priority right now. So @awh can take it, otherwise I would have suggested iceboxing it.
448 says:
Consider the case where weave is launched on three hosts, A, B, and C, with B and C told to connect to A. For instance, with A having IP address 192.168.122.1:
Once B and C have successfully connected to A, the topology section of the weave status output on any of the hosts will look something like:
In that case, weave status was run on B, and the peers appear in the order B, A, C.
We see that B and C are connected to 192.168.122.1, as expected. And A shows two incoming connections from 172.17.42.1, which is the IP address of the docker bridge. So the IP addresses of B and C are being disguised by docker's connection proxying. Fair enough.
Now, due to #451, B and C will try to connect to 172.17.42.1, but on weave's standard port number, rather than the port numbers reported by A. So they try to connect to themselves, and doing docker logs weave on B or C reveals the following messages, repeated every few minutes:
weave 2015/03/30 16:50:04.518189 ->[172.17.42.1:36367] connection shutting down due to error during handshake: Cannot connect to ourself
While this is not a terrible outcome, it does clutter the logs, and it is unclear to me what the rationale is.
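The self-connection can be reproduced in a small model. This is a sketch of the behaviour as I understand it, not weave's code: the helper names `normalize`, `dial`, and `handshake` are hypothetical, and the model assumes that 172.17.42.1 is the docker bridge address on every host, so dialling it always reaches the local weave container. The port 36367 is only an illustrative ephemeral port.

```python
# Toy model of the self-connection described above; not weave's actual code.
WEAVE_PORT = 6783
DOCKER_BRIDGE_IP = "172.17.42.1"  # assumed to be the docker0 address on every host


def normalize(addr):
    """Replace the port in an "ip:port" string with weave's standard port."""
    host, _, _port = addr.rpartition(":")
    return f"{host}:{WEAVE_PORT}"


def dial(from_peer, addr, peers_by_ip):
    """Return the name of the peer reached when `from_peer` dials `addr`."""
    host = addr.rpartition(":")[0]
    if host == DOCKER_BRIDGE_IP:
        # The bridge address is local to each host, and docker's DNAT rule
        # forwards port 6783 to the local weave container.
        return from_peer
    return peers_by_ip.get(host)


def handshake(from_peer, addr, peers_by_ip):
    """Describe the outcome of `from_peer` dialling `addr`."""
    reached = dial(from_peer, addr, peers_by_ip)
    if reached is None:
        return "unreachable"
    if reached == from_peer:
        return "Cannot connect to ourself"
    return f"connected to {reached}"


peers_by_ip = {"192.168.122.1": "A"}
# Remote address that A reports for an inbound connection (port illustrative).
reported_by_a = "172.17.42.1:36367"
result = handshake("B", normalize(reported_by_a), peers_by_ip)
```

In this model B's attempt ends with the same "Cannot connect to ourself" outcome as in the log, whereas dialling 192.168.122.1:6783 reaches A as intended.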
It seems to me that ideally, A would find out the true IP addresses of B and C, and they would connect to each other. But I suppose this would mean running the router container with --net=host? So we rule that possibility out.
But if we know that the weave router cannot trust the remote addresses reported for incoming connections, then what is the point of other peers trying to connect to those addresses, even with the normalized port number? I must be missing something.
At the very least, it would be good to have a full account of weave's connection behaviour and its rationale somewhere, because I am struggling to derive it from the code and github issues.