weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0

2 hosts behind NAT, weave doesn't seem to be using an intermediary to connect them #1744

Closed faddat closed 8 years ago

faddat commented 8 years ago

https://gist.github.com/faddat/34528c0a843656b67495

Please let me know if you need more info. Basically, I'm in Vietnam on a NAT'd PC that cannot connect to another NAT'd PC in France; the 142* addresses are in Houston and are not NAT'd.

dpw commented 8 years ago

Hi:

faddat commented 8 years ago

Added the logs to the gist: https://gist.github.com/faddat/34528c0a843656b67495

VN: HHT* Jacob-PC Ubuntu (the unpingable machine was ubuntu)

FR: Androgeek Toshiba

When I was saying cannot connect, I meant couldn't ping one another. The machine in FR did get an IP via weave expose successfully, but then was unreachable to nodes other than the US*'s.

awh commented 8 years ago

When I was saying cannot connect, I meant couldn't ping one another.

Can you give us more detail on the exact steps you have taken on the various hosts, both in terms of establishing the weave network and then how you're conducting your tests?

The machine in FR did get an IP via weave expose successfully

Would I be right then in thinking that your tests involve doing weave expose on a pair of machines (e.g. ubuntu and androgeek) and then pinging from one of them the remotely exposed weave IP of the other?

but then was unreachable to nodes other than the US*'s.

So you can ping androgeek's exposed weave IP from us1-4? This is confusing, because from the weave status peers output you posted it looks to us like the connections between androgeek and the US servers are not fully established (specifically, it looks like the UDP heartbeats from the US servers to androgeek are not getting through, which is why those connections are listed as pending). Is it possible that the NAT firewall in France is blocking inbound UDP packets?

faddat commented 8 years ago

@awh to be certain of my answers, I will recreate this setup tomorrow. If you've got any other questions, add them here and I'll try and give you the most detailed answer possible.

faddat commented 8 years ago

Attempting now. Sorry for tardiness. Testing this is rather a logistical nightmare...

bboreham commented 8 years ago

@faddat did you get any further with this?

faddat commented 8 years ago

[faddat@antergos Downloads]$ sudo weave status

      Version: 1.3.1

      Service: router
     Protocol: weave 1..2
         Name: e2:dc:f1:ab:14:99(antergos)
   Encryption: disabled
PeerDiscovery: enabled
      Targets: 1
  Connections: 3 (2 established, 1 connecting)
        Peers: 4 (with 8 established, 2 pending connections)

      Service: ipam
    Consensus: achieved
        Range: 172.16.0.0-172.16.255.255
DefaultSubnet: 172.16.0.0/16

      Service: dns
       Domain: weave.local.
          TTL: 1
      Entries: 4

      Service: proxy
      Address: unix:///var/run/weave/weave.sock
[faddat@antergos Downloads]$ Service: dns
bash: Service:: command not found
[faddat@antergos Downloads]$ weave status
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
[faddat@antergos Downloads]$ sudo weave status
[sudo] password for faddat:

      Version: 1.3.1

      Service: router
     Protocol: weave 1..2
         Name: e2:dc:f1:ab:14:99(antergos)
   Encryption: disabled
PeerDiscovery: enabled
      Targets: 1
  Connections: 3 (2 established, 1 connecting)
        Peers: 4 (with 6 established, 2 pending connections)

      Service: ipam
    Consensus: achieved
        Range: 172.16.0.0-172.16.255.255
DefaultSubnet: 172.16.0.0/16

      Service: dns
       Domain: weave.local.
          TTL: 1
      Entries: 2

      Service: proxy
      Address: unix:///var/run/weave/weave.sock

[faddat@antergos Downloads]$ weave report
Cannot connect to the Docker daemon. Is the docker daemon running on this host?
[faddat@antergos Downloads]$ sudo weave report
{
  "Version": "1.3.1",
  "Router": {
    "Protocol": "weave",
    "ProtocolMinVersion": 1,
    "ProtocolMaxVersion": 2,
    "Encryption": false,
    "PeerDiscovery": true,
    "Name": "e2:dc:f1:ab:14:99",
    "NickName": "antergos",
    "Port": 6783,
    "Peers": [
      {
        "Name": "e2:dc:f1:ab:14:99", "NickName": "antergos", "UID": 7264406873755801132, "ShortID": 1418, "Version": 2025,
        "Connections": [
          { "Name": "02:ec:03:f7:ae:45", "NickName": "us1", "Address": "142.0.199.20:6783", "Outbound": true, "Established": true },
          { "Name": "1e:3a:ca:d2:7f:3c", "NickName": "eniac-v2", "Address": "61.3.124.126:6783", "Outbound": true, "Established": true }
        ]
      },
      {
        "Name": "02:ec:03:f7:ae:45", "NickName": "us1", "UID": 14709582172703342217, "ShortID": 4083, "Version": 70524,
        "Connections": [
          { "Name": "1e:3a:ca:d2:7f:3c", "NickName": "eniac-v2", "Address": "61.3.124.126:6783", "Outbound": true, "Established": true },
          { "Name": "e2:dc:f1:ab:14:99", "NickName": "antergos", "Address": "117.6.161.201:62113", "Outbound": false, "Established": true }
        ]
      },
      {
        "Name": "1e:3a:ca:d2:7f:3c", "NickName": "eniac-v2", "UID": 11841231527084324499, "ShortID": 950, "Version": 1378,
        "Connections": [
          { "Name": "02:ec:03:f7:ae:45", "NickName": "us1", "Address": "142.0.199.20:48338", "Outbound": false, "Established": true },
          { "Name": "e2:dc:f1:ab:14:99", "NickName": "antergos", "Address": "117.6.161.201:52577", "Outbound": false, "Established": true }
        ]
      }
    ],
    "UnicastRoutes": [
      { "Dest": "e2:dc:f1:ab:14:99", "Via": "00:00:00:00:00:00" },
      { "Dest": "1e:3a:ca:d2:7f:3c", "Via": "1e:3a:ca:d2:7f:3c" },
      { "Dest": "02:ec:03:f7:ae:45", "Via": "02:ec:03:f7:ae:45" }
    ],
    "BroadcastRoutes": [
      { "Source": "02:ec:03:f7:ae:45", "Via": null },
      { "Source": "1e:3a:ca:d2:7f:3c", "Via": null },
      { "Source": "e2:dc:f1:ab:14:99", "Via": [ "02:ec:03:f7:ae:45", "1e:3a:ca:d2:7f:3c" ] }
    ],
    "Connections": [
      { "Address": "142.0.199.20:6783", "Outbound": true, "State": "established", "Info": "sleeve 02:ec:03:f7:ae:45(us1)" },
      { "Address": "61.3.124.126:6783", "Outbound": true, "State": "established", "Info": "sleeve 1e:3a:ca:d2:7f:3c(eniac-v2)" },
      { "Address": "142.0.199.44:6783", "Outbound": true, "State": "connecting", "Info": "" }
    ],
    "Targets": [ "142.0.199.20" ],
    "OverlayDiagnostics": {
      "fastdp": {
        "Vports": [
          { "ID": 0, "Name": "weave", "TypeName": "internal" },
          { "ID": 1, "Name": "vxlan-6784", "TypeName": "vxlan" }
        ],
        "Flows": []
      },
      "sleeve": null
    },
    "Interface": "weave (via ODP)",
    "CaptureStats": null,
    "MACs": [
      { "Mac": "1e:3a:ca:d2:7f:3c", "Name": "1e:3a:ca:d2:7f:3c", "NickName": "eniac-v2", "LastSeen": "2015-12-14T19:36:18.236566089Z" },
      { "Mac": "e2:dc:f1:ab:14:99", "Name": "e2:dc:f1:ab:14:99", "NickName": "antergos", "LastSeen": "2015-12-14T19:35:30.617408637Z" }
    ]
  },
  "IPAM": {
    "Paxos": null,
    "Range": "172.16.0.0-172.16.255.255",
    "DefaultSubnet": "172.16.0.0/16",
    "Entries": [
      { "Token": "172.16.0.0", "Peer": "02:ec:03:f7:ae:45", "Version": 2 },
      { "Token": "172.16.64.0", "Peer": "6a:3b:35:a0:48:70", "Version": 1 },
      { "Token": "172.16.128.0", "Peer": "02:ec:03:f7:ae:45", "Version": 5 },
      { "Token": "172.16.160.0", "Peer": "1e:3a:ca:d2:7f:3c", "Version": 1 },
      { "Token": "172.16.192.0", "Peer": "7a:2c:f7:0a:a2:9f", "Version": 1 },
      { "Token": "172.16.255.255", "Peer": "02:ec:03:f7:ae:45", "Version": 0 }
    ],
    "PendingClaims": null,
    "PendingAllocates": null
  },
  "DNS": {
    "Domain": "weave.local.",
    "Address": "172.17.0.1:53",
    "TTL": 1,
    "Entries": [
      { "Hostname": "scope.weave.local.", "Origin": "1e:3a:ca:d2:7f:3c", "ContainerID": "7d1930f43ccd07125e5ef1a28cc12725bbc82227db512f771e1b9fe441c3e201", "Address": "172.16.160.0", "Version": 0, "Tombstone": 0 },
      { "Hostname": "scope.weave.local.", "Origin": "1e:3a:ca:d2:7f:3c", "ContainerID": "7d1930f43ccd07125e5ef1a28cc12725bbc82227db512f771e1b9fe441c3e201", "Address": "192.168.1.2", "Version": 0, "Tombstone": 0 }
    ]
  }
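A report this size is hard to read by eye, so here is a minimal Python sketch of one way to pull out the two things that look relevant to this issue: connections that are not in the "established" state, and peers that own IPAM ranges but do not appear in the router's peer list. The function name `summarize_report` and the trimmed `SAMPLE` are hypothetical; the field names (`Router.Connections[].State`, `Router.Peers[].Name`, `IPAM.Entries[].Peer`) are taken from the report pasted above. Reading "IPAM owner missing from the topology" as "a peer that joined earlier and is currently unreachable" is only an inference, not something the thread confirms.

```python
def summarize_report(report: dict) -> dict:
    """Summarize a parsed `weave report` document (hypothetical helper)."""
    router = report.get("Router") or {}

    # Connections from this router that have not reached "established".
    connections = router.get("Connections") or []
    not_established = [
        c["Address"] for c in connections if c.get("State") != "established"
    ]

    # Peers this router currently knows about in its topology.
    known_peers = {p["Name"] for p in router.get("Peers") or []}

    # Peers that own part of the IPAM address ring.
    ipam_entries = (report.get("IPAM") or {}).get("Entries") or []
    ipam_owners = {e["Peer"] for e in ipam_entries}

    return {
        "not_established": not_established,
        "ipam_owners_not_in_topology": sorted(ipam_owners - known_peers),
    }


# Trimmed copy of the values in the report above (only the fields used here).
SAMPLE = {
    "Router": {
        "Peers": [
            {"Name": "e2:dc:f1:ab:14:99"},
            {"Name": "02:ec:03:f7:ae:45"},
            {"Name": "1e:3a:ca:d2:7f:3c"},
        ],
        "Connections": [
            {"Address": "142.0.199.20:6783", "State": "established"},
            {"Address": "61.3.124.126:6783", "State": "established"},
            {"Address": "142.0.199.44:6783", "State": "connecting"},
        ],
    },
    "IPAM": {
        "Entries": [
            {"Peer": "02:ec:03:f7:ae:45"},
            {"Peer": "6a:3b:35:a0:48:70"},
            {"Peer": "1e:3a:ca:d2:7f:3c"},
            {"Peer": "7a:2c:f7:0a:a2:9f"},
        ]
    },
}

summary = summarize_report(SAMPLE)
print(summary)
```

To run it against a real host, the output of `sudo weave report` could be saved to a file and loaded with `json.load`. The report as pasted in this thread appears to be cut off before its final closing brace, so it would need that restored before a JSON parser accepts it.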

The saga continues. I would summarize my post differently now: Weave can get janky when doing deployments outside those commonly seen in a datacenter.

faddat commented 8 years ago

gist next time.... want logs?

awh commented 8 years ago

gist next time.... want logs?

Could we start first with some answers to the questions I asked earlier? It's practically impossible to debug without any idea of what you're actually doing :smile: Specifically:

Can you give us more detail on the exact steps you have taken on the various hosts, both in terms of establishing the weave network and then how you're conducting your tests?

And also from this comment:

So you can ping androgeek's exposed weave IP from us1-4? This is confusing, because from the weave status peers output you posted it looks to us like the connections between androgeek and the US servers are not fully established (specifically, it looks like the UDP heartbeats from the US servers to androgeek are not getting through, which is why those connections are listed as pending). Is it possible that the NAT firewall in France is blocking inbound UDP packets?

Weave isn't completely magic - if your firewall is blocking UDP it won't work.
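One rough way to check the "is UDP being blocked?" question independently of weave is a plain UDP echo test between two hosts. The Python sketch below is only that: a generic reachability probe, not weave's heartbeat protocol. The function names `udp_echo_server` and `udp_probe` are made up for illustration, and the port is whatever spare port you pick. The report earlier in the thread shows the router on port 6783 and a `vxlan-6784` vport, which the running weave router will presumably already hold, so a probe on a different port says nothing definite about how the NAT treats those specific ports.

```python
import socket


def udp_echo_server(sock: socket.socket, max_packets: int = 1) -> None:
    """Echo back up to max_packets datagrams received on an already-bound socket."""
    for _ in range(max_packets):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def udp_probe(host: str, port: int, timeout: float = 2.0,
              payload: bytes = b"probe") -> bool:
    """Send one datagram and report whether the same payload comes back in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(payload, (host, port))
        try:
            data, _ = s.recvfrom(2048)
        except OSError:
            # Covers timeouts and ICMP-derived errors such as "connection refused".
            return False
        return data == payload
```

The idea would be to bind a UDP socket on the NAT'd machine, run `udp_echo_server` on it, and call `udp_probe` from one of the public servers against the NAT's public address. A `False` result is consistent with inbound UDP being dropped, but it can also just mean no port mapping exists for that port, so treat it as a hint rather than proof.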

faddat commented 8 years ago

Sure thing-- okay so what we were doing with this was messing around. I'm hoping that I can use weave to set up virtual networks enabling an OpenStack / MAAS environment for bootstrapping metal.

On that day, I was using a weave network between my machine and some coworkers' machines to enable (well, we'd hoped) remote PXE booting. So like this: an RPi runs Ubuntu MAAS at my office in Hanoi, and people wherever boot virtual / physical machines through its DHCP and a network bridged to theirs. What we're doing on a company scale is bringing clouds to users, no matter where the users might be on the planet. In our long-term vision, weave enables us to create a diffuse, global compute grid with >10,000 ARM nodes scattered algorithmically by population density..... so in terms of our network topology, it's likely to get more strange. :)

.....and I thought that it was complete magic ;). The French fellow said it was not, but you never can truly tell (though with weave, I suppose you could.... ;D)