slackhq / nebula

A scalable overlay networking tool with a focus on performance, simplicity and security
MIT License
14.52k stars 975 forks

Nebula with Digital Ocean floating IPs - replies appear to come from the wrong IP address (perhaps due to multiple IPs on the same NIC?) #546

Open JonTheNiceGuy opened 3 years ago

JonTheNiceGuy commented 3 years ago

I have an issue with Digital Ocean when using a Floating IP address: packets bound for the internally presented address of the floating IP (an RFC1918 address in the range 10.16.0.0/16) are responded to using the public (non-RFC1918) "real" address of the VPS.

----- Full detail -----

As mentioned on slack, I have the following Nebula environment:

  1. Lighthouse on a Digital Ocean droplet with a floating IP address
  2. Remote node running Linux

Lighthouse ("VPS") configuration file:

pki:
  ca: /etc/nebula/ca.crt
  cert: /etc/nebula/vps.crt
  key: /etc/nebula/vps.key
static_host_map: {}
lighthouse:
  am_lighthouse: true
  serve_dns: true
  dns:
    host: 203.0.113.2
    port: 5533
listen:
  host: 0.0.0.0
  port: 4242
punchy:
  punch: true
  respond: true
tun:
  disabled: false
  dev: nebula1
  drop_local_broadcast: true
  drop_multicast: true
logging:
  level: info
  format: text
firewall:
  outbound:
  - {'port': 'any', 'proto': 'any', 'host': 'any'}
  inbound:
  - {'port': 'any', 'proto': 'any', 'host': 'any'}

Remote node ("debianqnap") configuration file here:

pki:
  ca: /etc/nebula/ca.crt
  cert: /etc/nebula/debianqnap.crt
  key: /etc/nebula/debianqnap.key
static_host_map:
  "203.0.113.2":
    # - "vps.example.org:4242" # DOES NOT WORK - FLOATING IP A RECORD
    - "direct.example.org:4242" # DOES WORK - "REAL" IP A/AAAA RECORD
lighthouse:
  am_lighthouse: false
  interval: 60
  hosts:
  - "203.0.113.2"
punchy:
  punch: true
  respond: true
tun:
  disabled: false
  dev: nebula1
  drop_local_broadcast: true
  drop_multicast: true
logging:
  level: info
  format: text
firewall:
  outbound:
  - {'port': 'any', 'proto': 'any', 'host': 'any'}
  inbound:
  - {'port': 'any', 'proto': 'any', 'host': 'any'}

Note: For sanitization purposes, assume "REAL_IP_AS_0.0.0.0_FORMAT" and "FLOATING_IP_AS_0.0.0.0_FORMAT" are both non-RFC1918 addresses (e.g. 123.123.123.123), while "FLOATING_INTERNAL_IP_AS_0.0.0.0_FORMAT" is an RFC1918 address (e.g. 10.1.1.1). "REMOTE_IP_AS_0.0.0.0_FORMAT" is the public IP address outside the NAT on the remote node's home DSL line. The certificate fingerprint has also been masked by replacing the bulk of it with "1"s.

When the remote node tries to connect to the lighthouse with the floating IP address in the configuration file, journalctl -xefu nebula on debianqnap shows:

Oct 10 23:58:24 debianqnap nebula[23515]: time="2021-10-10T23:58:24+01:00" level=info msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=1102038582 udpAddrs="[<FLOATING_IP_AS_0.0.0.0_FORMAT>:4242]" vpnIp=203.0.113.2
Oct 10 23:58:25 debianqnap nebula[23515]: time="2021-10-10T23:58:25+01:00" level=info msg="Handshake timed out" durationNs=9773343002 handshake="map[stage:1 style:ix_psk0]" initiatorIndex=1102038582 remoteIndex=0 udpAddrs="[<FLOATING_IP_AS_0.0.0.0_FORMAT>:4242]" vpnIp=203.0.113.2

However, if I change this to the "real" IP (avoiding the floating IP), I get this:

Oct 11 00:02:47 debianqnap nebula[28186]: time="2021-10-11T00:02:47+01:00" level=info msg="Handshake message sent" handshake="map[stage:1 style:ix_psk0]" initiatorIndex=2344072597 udpAddrs="[<REAL_IP_AS_0.0.0.0_FORMAT>:4242]" vpnIp=203.0.113.2
Oct 11 00:02:47 debianqnap nebula[28186]: time="2021-10-11T00:02:47+01:00" level=info msg="Handshake message received" certName=vps.example.org durationNs=141259340 fingerprint=111111111111111111111111111111111111111111111111111111111118cd01 handshake="map[stage:2 style:ix_psk0]" initiatorIndex=2344072597 remoteIndex=2344072597 responderIndex=1430280385 sentCachedPackets=1 udpAddr="<REAL_IP_AS_0.0.0.0_FORMAT>:4242" vpnIp=203.0.113.2

With the floating IP as the target, and running tcpdump, I see the following:

root@vps:/etc/nebula# tcpdump -n -i eth0 port 4242
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:08:06.883596 IP <REMOTE_IP_AS_0.0.0.0_FORMAT>.55758 > <FLOATING_INTERNAL_IP_AS_0.0.0.0_FORMAT>.4242: UDP, length 262
23:08:06.914322 IP <REAL_IP_AS_0.0.0.0_FORMAT>.4242 > <REMOTE_IP_AS_0.0.0.0_FORMAT>.55758: UDP, length 307

With the real IP as the target, running tcpdump:

root@vps:/etc/nebula# tcpdump -n -i eth0 port 4242
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
23:12:08.760614 IP <REMOTE_IP_AS_0.0.0.0_FORMAT>.39632 > <REAL_IP_AS_0.0.0.0_FORMAT>.4242: UDP, length 262
23:12:08.775363 IP <REAL_IP_AS_0.0.0.0_FORMAT>.4242 > <REMOTE_IP_AS_0.0.0.0_FORMAT>.39632: UDP, length 307
23:12:08.854769 IP <REMOTE_IP_AS_0.0.0.0_FORMAT>.39632 > <REAL_IP_AS_0.0.0.0_FORMAT>.4242: UDP, length 54
23:12:17.710317 IP <REMOTE_IP_AS_0.0.0.0_FORMAT>.39632 > <REAL_IP_AS_0.0.0.0_FORMAT>.4242: UDP, length 32
23:12:17.710694 IP <REAL_IP_AS_0.0.0.0_FORMAT>.4242 > <REMOTE_IP_AS_0.0.0.0_FORMAT>.39632: UDP, length 32
23:12:17.971892 IP <REAL_IP_AS_0.0.0.0_FORMAT>.4242 > <REMOTE_IP_AS_0.0.0.0_FORMAT>.39632: UDP, length 1

Here's the result of running ip -4 addr on the VPS:

root@vps:/etc/nebula# ip -4 addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    inet <REAL_IP_AS_0.0.0.0_FORMAT>/20 brd x.y.z.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet <FLOATING_INTERNAL_IP_AS_0.0.0.0_FORMAT>/16 brd n.m.255.255 scope global eth0
       valid_lft forever preferred_lft forever

When I asked Digital Ocean for advice, they replied: "Our floating IP addresses are set up in a way that the droplet operates without actually recognizing that the floating IP address exists. It sits in front of the droplet and is connected via an anchor IP that is pre-configured on each droplet. The most common reason that users run into errors when connecting via a floating IP is that the application is set up to listen on the floating IP rather than on the anchor IP address."

JonTheNiceGuy commented 3 years ago

After having a chance to look at this, I can see where the issue lies now!

The way the DO floating IP is implemented means that packets originated on the host exit with a source IP chosen from the route to the next hop (e.g. if you have an IP of 192.0.2.1 and an IP of 203.0.113.1, and your default route prefers 203.0.113.1 as its source, then outbound packets will have a source of 203.0.113.1).
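The kernel's choice can be observed directly with ip route get, which reports the src address it would stamp on an outbound packet. A quick check, assuming a Linux host with iproute2 and using the sanitized placeholders from this issue:

```shell
# Ask the kernel which source address it would use to reach the remote node.
# The "src" field in the output is the address replies will carry.
ip route get <REMOTE_IP_AS_0.0.0.0_FORMAT>

# Compare with forcing the floating IP's internal address as the source:
ip route get <REMOTE_IP_AS_0.0.0.0_FORMAT> from <FLOATING_INTERNAL_IP_AS_0.0.0.0_FORMAT>
```

If the first command's src field shows the "real" IP, that matches the asymmetry seen in the tcpdump output above.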

I suspect that because Nebula uses UDP, the kernel treats each reply as a fresh outbound packet rather than part of an existing connection, and thus selects the non-floating IP as the source.

As I don't write Go, I don't know how Nebula packets are represented in the code, but it feels like the response is effectively saying "reply out of that interface" rather than "reply using that IP". A possible fix (if that is the case) would be to check what the target IP of the incoming packet was, and craft the response using that IP as the source.

Of course, I might be entirely wrong, but DO's work-around is to change the preferred source of the host's default route from the device-specific IP to the floating IP's "anchor IP" (essentially a NAT-interface IP). This does indeed solve the issue in the short term (if you want to use a floating IP rather than a dedicated IP)... but YMMV.
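For anyone trying this, the route change could be sketched roughly as follows. This is an assumption-laden sketch, not DO's exact procedure; <GATEWAY_IP> and <ANCHOR_IP> are placeholders you'd substitute for your droplet:

```shell
# Sketch only: make the kernel prefer the anchor IP as the source for
# outbound traffic by setting it as the default route's preferred source.
ip route change default via <GATEWAY_IP> dev eth0 src <ANCHOR_IP>
```

Be aware that changing the default route's source can affect every other outbound service on the host, not just Nebula.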

One final thought for those who really don't want to mess with things like this: you could use policy-based routing for anything targeting the floating IP and the port for your Nebula service... but ugh, I can't even imagine troubleshooting that at 2 AM when Bob from accounting can't access his management portal.
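For completeness, the policy-based-routing idea could look something like this hypothetical sketch (the table number and <GATEWAY_IP> are made up for illustration):

```shell
# Replies sourced from the floating IP's internal address get their own
# routing table, leaving the main default route untouched.
ip rule add from <FLOATING_INTERNAL_IP_AS_0.0.0.0_FORMAT> lookup 100
ip route add default via <GATEWAY_IP> dev eth0 table 100
```

This confines the source-address fix to traffic that is already using the floating IP's internal address, at the cost of one more moving part to remember.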

ton31337 commented 3 years ago

Is it possible to explicitly bind to a specific or series of specific IP addresses to test this out?

listen:
  host: 1.1.1.1

nbrownus commented 3 years ago

Nebula isn't selecting the source IP to send from; the kernel is. Listening on the specific IP, or adjusting the routes in use for Nebula, sounds like the appropriate fix.

Floating IPs have a few drawbacks as well, this being one. Another is that you can apparently attach them to multiple machines, which will also break Nebula traffic.

JonTheNiceGuy commented 3 years ago

I ended up reverting the routing change I made, because outbound email stopped working at that point! But, for the short-term it worked fine. I'm now addressing each node directly, by using the non-floating IP. Is it worth having a "Troubleshooting FAQ" somewhere, and adding this as an item?

nbrownus commented 3 years ago

Is it worth having a "Troubleshooting FAQ" somewhere, and adding this as an item?

Absolutely! The question really comes down to where we want to host said FAQ. We will be discussing this internally next week.

Troyhy commented 1 year ago

This floating IP routing bit me as well. Since this lighthouse is a dedicated machine, I added a netplan config to route all outbound traffic through the floating IP. After this, Nebula worked like a charm.

# /etc/netplan/99-custom-route.yaml
network:
  version: 2
  ethernets:
    eth0:
      routes:
        - to: 0.0.0.0/0
          via: <the actual default route>
          from: <floating-ip-here>
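If you go this route, a quick way to apply and verify the change (assuming an Ubuntu host with netplan; 8.8.8.8 is just an arbitrary external address to test against):

```shell
# Apply the netplan config, then confirm the kernel now prefers the
# floating IP as the source for outbound traffic.
netplan apply
ip route get 8.8.8.8    # look for "src <floating-ip-here>" in the output
```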

rfletcher commented 5 months ago

Reconfiguring routing fixed me up, too.

I couldn't use the netplan tip, but it led me to the official DigitalOcean docs for routing outbound traffic over a droplet's reserved IP. (Note that "floating IPs" have been renamed to "reserved IPs".) Thanks for the lead, @Troyhy!

https://docs.digitalocean.com/products/networking/reserved-ips/how-to/outbound-traffic/