projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

Option to detect Calico node ip address from external source #5344

Open mattirantakomi opened 2 years ago

mattirantakomi commented 2 years ago

It is common in many clouds and virtual environments that a virtual machine's public IP address is not assigned directly to any of its interfaces. The virtual machine only has a private IP address, and some kind of one-to-one NAT/firewall and private network sits between the public IP address and the virtual machine's private IP address. As a result, the node IP address is incorrectly detected as the private network IP address. That private IP address is then used as the Wireguard endpoint address. This is a problem because Wireguard needs a publicly reachable IP address to receive connections.

I have set up a few k0s nodes running Ubuntu 20.04.3 LTS on Proxmox (Debian Bullseye) QEMU virtual machines. Only one virtual machine has a public IP address, which is NAT'ed to the virtual machine's private IP address; only the private IP address is assigned to its network interface. This machine could listen on UDP port 51820 for new Wireguard connections if its public IP address were detected correctly and used as the Wireguard endpoint address. At the moment this does not work, because calico-node detects the private IP address as the node IP address, which is then used as the Wireguard endpoint address.

The other virtual machines are behind NAT'ed 4G connections and cannot accept incoming Wireguard connections on a public IP address. If the virtual machine with the public IP address could listen on UDP port 51820 for new Wireguard connections, the virtual machines behind NAT'ed 4G would be able to connect to it, and Wireguard connections between nodes would work as they should.

Expected Behavior

Calico detects the node's real public IP address from an external source when it is configured to do so.

Current Behavior

The private network IP address is detected as the node IP address.

Possible Solution

An external method for IP_AUTODETECTION_METHOD that takes an external URL, which is then queried for the node's public IP address (for example, curl -s https://ip.jes.fi/):

IP_AUTODETECTION_METHOD=external=https://ip.jes.fi/,first-found

The method should also test that the detected public IP address can receive new incoming Wireguard connections on UDP port 51820. If that test fails, it should fall back to a secondary IP detection method, e.g. first-found.
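The proposed behaviour can be sketched roughly as follows. This is only an illustration in Python, not Calico code: the comma-separated "external=URL,first-found" chain is the syntax suggested in this issue rather than something Calico currently accepts, and the names fetch_external_ip, detect_node_ip, fallbacks and reachable are hypothetical. The reachability test is left as a pluggable callable, since a real check on UDP port 51820 would need help from a peer outside the NAT.

```python
import ipaddress
import urllib.request
from typing import Callable, Dict, Optional


def fetch_external_ip(url: str, timeout: float = 5.0) -> Optional[str]:
    """Ask an external service for this node's public address.

    Returns None on any network error or if the reply is not a valid IP.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            text = resp.read(64).decode("ascii", errors="replace").strip()
        return str(ipaddress.ip_address(text))
    except (OSError, ValueError):
        return None


def detect_node_ip(
    spec: str,
    fallbacks: Dict[str, Callable[[], Optional[str]]],
    fetch: Callable[[str], Optional[str]] = fetch_external_ip,
    reachable: Callable[[str], bool] = lambda ip: True,
) -> Optional[str]:
    """Try each method in the (hypothetical) comma-separated spec in order.

    "external=<url>" queries the URL and accepts the answer only if the
    reachability check passes; any other name is looked up in `fallbacks`.
    """
    for method in spec.split(","):
        method = method.strip()
        if method.startswith("external="):
            ip = fetch(method[len("external="):])
            if ip is not None and reachable(ip):
                return ip
        elif method in fallbacks:
            ip = fallbacks[method]()
            if ip is not None:
                return ip
    return None
```

With a spec of "external=https://ip.jes.fi/,first-found", the external answer wins when the lookup and the reachability check both succeed; otherwise the result of the first-found fallback supplied by the caller is used.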

Steps to Reproduce (for bugs)

Context

Wireguard is not able to establish connections between nodes because no node has its public IP address configured as the Calico node IP address / Wireguard endpoint.

Your Environment

caseydavenport commented 2 years ago

Generally, cluster nodes communicate with each other over their private IPs and not their public ones - can you expand on the scenario you're implementing that has them communicating via public IPs?

At a minimum, this sounds like it would be resolved by this enhancement request: https://github.com/projectcalico/calico/issues/5154 - yes?

It's also worth stating that Calico today already allows explicit configuration of each node's address using the Calico Node API or an annotation on the Kubernetes node. That might work as a stopgap.

mattirantakomi commented 2 years ago

As you say, cluster nodes communicate with each other over their private IPs. The problem is that when nodes are not in the same LAN/VPC (they run in different locations/networks), they cannot communicate over private IPs until Wireguard tunnels between the nodes are established. If the nodes' public IP addresses are not correctly detected, the nodes cannot establish Wireguard tunnels with each other, because they try to connect to other Wireguard peers (other nodes) using the wrongly detected node IP addresses (private IPs in this case), which naturally are not routed over the internet.
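To illustrate why a wrongly detected address breaks this, here is a small Python sketch that checks whether an address could be reached from the internet at all. The helper name usable_as_remote_endpoint and the sample addresses are made up for the example; the check relies only on the standard ipaddress module, whose is_global property is false for private ranges such as 192.168.0.0/16 and for the 100.64.0.0/10 range commonly used behind carrier-grade NAT.

```python
import ipaddress


def usable_as_remote_endpoint(addr: str) -> bool:
    """Return True if addr is globally routable, so a peer on another
    network could at least attempt to reach it over the internet."""
    return ipaddress.ip_address(addr).is_global


# Sample addresses: a typical private LAN address, a carrier-grade NAT
# address, and a well-known public address.
for addr in ("192.168.1.10", "100.64.0.12", "8.8.8.8"):
    print(addr, usable_as_remote_endpoint(addr))
```

An address detected from the VM's own interface fails this check in the setup described above, which is why peers in other networks cannot use it as a Wireguard endpoint.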

As I wrote in my opening post, everything works fine if at least one node has its public IP address detected by Calico, so that the other nodes can connect to it through the internet and establish the Wireguard tunnels that make connectivity between nodes over private IPs possible.

I'm not using BGP, but it sounds like https://github.com/projectcalico/calico/issues/5154 might resolve my issue, as it uses ExternalIP as the node IP, which would then be detected correctly by Calico.

I know that it is also possible to specify the Calico node IP address manually by setting the IP environment variable, but that option cannot be used in this case because calico-node runs as a DaemonSet, so the IP env variable cannot be set individually for each node.