kube-hetzner / terraform-hcloud-kube-hetzner


[Bug]: OpenVPN is not connecting to services behind it #979

Closed. CroutonDigital closed this issue 1 year ago.

CroutonDigital commented 1 year ago

Description

I deployed an OpenVPN pod to give the development team access to k8s services. The OpenVPN client can connect to a pod using the pod IP and port, but when it uses the service IP and service port, the connection times out.

I think the issue is the same as:

https://serverfault.com/questions/924773/openvpn-is-not-connecting-to-services-behind-it-iptables

Please help!
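For context, a setup like this usually relies on the OpenVPN server pushing routes for both the pod CIDR and the service CIDR to clients. A minimal sketch of the relevant server.conf lines (hypothetical, since the actual server config is not included here; CIDRs taken from the kube.tf below and the k3s default service CIDR):

server 10.8.0.0 255.255.255.0
# pod CIDR (cluster_ipv4_cidr from kube.tf)
push "route 10.42.0.0 255.255.0.0"
# service CIDR (k3s default 10.43.0.0/16)
push "route 10.43.0.0 255.255.0.0"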

Kube.tf file

module "kube-hetzner" {
  providers = {
    hcloud = hcloud
  }
  hcloud_token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token
  source = "kube-hetzner/kube-hetzner/hcloud"
   ssh_port = 2222
  ssh_public_key = file("${path.module}/ssh/k8s-hetzner.pub")
  ssh_private_key = file("${path.module}/ssh/k8s-hetzner")
  network_region = "eu-central"
  network_ipv4_cidr = "10.0.0.0/8"
  cluster_ipv4_cidr = "10.42.0.0/16"
  control_plane_nodepools = [
    {
      name        = "control-plane-fsn1",
      server_type = "cpx11",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 1

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "control-plane-nbg1",
      server_type = "cpx11",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 1

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
#    {
#      name        = "control-plane-hel1",
#      server_type = "cpx11",
#      location    = "hel1",
#      labels      = [],
#      taints      = [],
#      count       = 1
#
#      # Enable automatic backups via Hetzner (default: false)
#      # backups = true
#    }
  ]

  agent_nodepools = [
    {
      name        = "agent-small",
      server_type = "cpx11",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 2

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "agent-large",
      server_type = "cpx21",
      location    = "nbg1",
      labels      = [],
      taints      = [],
      count       = 0

      # Enable automatic backups via Hetzner (default: false)
      # backups = true
    },
    {
      name        = "storage",
      server_type = "cpx21",
      location    = "fsn1",
      # Fully optional, just a demo.
      labels      = [
        "node.kubernetes.io/server-usage=storage"
      ],
      taints      = [],
      count       = 1

    },
    {
      name        = "egress",
      server_type = "cpx11",
      location    = "fsn1",
      labels = [
        "node.kubernetes.io/role=egress"
      ],
      taints = [
        "node.kubernetes.io/role=egress:NoSchedule"
      ],
      floating_ip = true
      count = 0
    },
    {
      name        = "agent-arm-small",
      server_type = "cax11",
      location    = "fsn1",
      labels      = [],
      taints      = [],
      count       = 0
    }
  ]
  load_balancer_type     = "lb11"
  load_balancer_location = "fsn1"
  ingress_controller = "traefik"

  traefik_additional_options = ["--log.level=DEBUG"]
  initial_k3s_channel = "stable"
  cluster_name = "k3s"

  restrict_outbound_traffic = true

   extra_firewall_rules = [
     {
       description = "ELK logs"
       direction       = "out"
       protocol        = "tcp"
       port            = "9200"
       source_ips      = [] # Won't be used for this rule
       destination_ips = ["0.0.0.0/0", "::/0"]
     },
     {
       description = "To Allow ArgoCD access to resources via SSH"
       direction       = "out"
       protocol        = "tcp"
       port            = "22"
       source_ips      = [] # Won't be used for this rule
       destination_ips = ["0.0.0.0/0", "::/0"]
     }
   ]

   cni_plugin = "cilium"
   enable_cert_manager = false
   dns_servers = ["1.1.1.1", "8.8.8.8", "9.9.9.9"]

  cilium_values = <<EOT
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
egressGateway:
  enabled: true
MTU: 1450
EOT

 traefik_values = <<EOT
deployment:
  replicas: 1
globalArguments: []
service:
  enabled: true
  type: LoadBalancer
  annotations:
    "load-balancer.hetzner.cloud/name": "k3s"
    "load-balancer.hetzner.cloud/use-private-ip": "true"
    "load-balancer.hetzner.cloud/disable-private-ingress": "true"
    "load-balancer.hetzner.cloud/location": "nbg1"
    "load-balancer.hetzner.cloud/type": "lb11"
    "load-balancer.hetzner.cloud/uses-proxyprotocol": "true"

ports:
  web:
    redirectTo: websecure

    proxyProtocol:
      trustedIPs:
        - 127.0.0.1/32
        - 10.0.0.0/8
    forwardedHeaders:
      trustedIPs:
        - 127.0.0.1/32
        - 10.0.0.0/8
  websecure:
    proxyProtocol:
      trustedIPs:
        - 127.0.0.1/32
        - 10.0.0.0/8
    forwardedHeaders:
      trustedIPs:
        - 127.0.0.1/32
        - 10.0.0.0/8

tlsOptions: {}
tlsStore: {}

certResolvers:
  letsencrypt:
    email: maxim.dogonov@g.estchange.io
    tlsChallenge: true
    httpChallenge:
      entryPoint: "web"
    #     # It has to match the path with a persistent volume
    storage: /data/acme.json

  EOT
} 

provider "hcloud" {
  token = var.hcloud_token != "" ? var.hcloud_token : local.hcloud_token
}

terraform {
  required_version = ">= 1.3.3"
  required_providers {
    hcloud = {
      source  = "hetznercloud/hcloud"
      version = ">= 1.39.0"
    }
  }
}

Screenshots

No response

Platform

Linux

CroutonDigital commented 1 year ago

When I send a packet to the service with curl http://10.43.26.206:9090:

08:09:43.557956 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57615 > 10.43.26.206.9090: Flags [SEW], cksum 0xc2fe (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418076138 ecr 0,sackOK,eol], length 0
08:09:44.560509 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xbfd5 (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418077139 ecr 0,sackOK,eol], length 0
08:09:45.561348 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xbbec (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418078140 ecr 0,sackOK,eol], length 0
08:09:46.561143 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb803 (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418079141 ecr 0,sackOK,eol], length 0
08:09:47.563019 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb41a (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418080142 ecr 0,sackOK,eol], length 0
08:09:48.562645 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57615 > 10.43.26.206.9090: Flags [S], cksum 0xb031 (correct), seq 1239973897, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 418081143 ecr 0,sackOK,eol], length 0

When I send a packet to the pod IP, everything is OK: curl http://10.42.7.207:9090

 08:11:27.416767 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 64)
    10.42.7.42.57666 > 10.42.7.207.9090: Flags [SEW], cksum 0x91e9 (correct), seq 4180628869, win 65535, options [mss 1356,nop,wscale 6,nop,nop,TS val 2962259299 ecr 0,sackOK,eol], length 0
08:11:27.416865 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    10.42.7.207.9090 > 10.42.7.42.57666: Flags [S.E], cksum 0x237b (incorrect -> 0x7ddc), seq 4125798800, ack 4180628870, win 64704, options [mss 1360,sackOK,TS val 271347987 ecr 2962259299,nop,wscale 7], length 0
08:11:27.472540 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.42.7.42.57666 > 10.42.7.207.9090: Flags [.], cksum 0xa104 (correct), ack 1, win 2058, options [nop,nop,TS val 2962259354 ecr 271347987], length 0
08:11:27.567033 IP (tos 0x2,ECT(0), ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 132)
    10.42.7.42.57666 > 10.42.7.207.9090: Flags [P.], cksum 0xe117 (correct), seq 1:81, ack 1, win 2058, options [nop,nop,TS val 2962259354 ecr 271347987], length 80
08:11:27.567096 IP (tos 0x0, ttl 63, id 8131, offset 0, flags [DF], proto TCP (6), length 52)
    10.42.7.207.9090 > 10.42.7.42.57666: Flags [.], cksum 0x2373 (incorrect -> 0xa62e), ack 81, win 505, options [nop,nop,TS val 271348138 ecr 2962259354], length 0
08:11:27.570339 IP (tos 0x2,ECT(0), ttl 63, id 8132, offset 0, flags [DF], proto TCP (6), length 362)
    10.42.7.207.9090 > 10.42.7.42.57666: Flags [P.], cksum 0x24a9 (incorrect -> 0x4738), seq 1:311, ack 81, win 505, options [nop,nop,TS val 271348141 ecr 2962259354], length 310
08:11:27.624034 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.42.7.42.57666 > 10.42.7.207.9090: Flags [.], cksum 0x9e50 (correct), ack 311, win 2053, options [nop,nop,TS val 2962259507 ecr 271348141], length 0
08:11:27.719946 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.42.7.42.57666 > 10.42.7.207.9090: Flags [F.], cksum 0x9e4f (correct), seq 81, ack 311, win 2053, options [nop,nop,TS val 2962259507 ecr 271348141], length 0
08:11:27.720080 IP (tos 0x0, ttl 63, id 8133, offset 0, flags [DF], proto TCP (6), length 52)
    10.42.7.207.9090 > 10.42.7.42.57666: Flags [F.], cksum 0x2373 (incorrect -> 0xa3c4), seq 311, ack 82, win 505, options [nop,nop,TS val 271348291 ecr 2962259507], length 0
08:11:27.776932 IP (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.42.7.42.57666 > 10.42.7.207.9090: Flags [.], cksum 0x9d20 (correct), ack 312, win 2053, options [nop,nop,TS val 2962259659 ecr 271348291], length 0
CroutonDigital commented 1 year ago

When I connect from inside the OpenVPN pod with curl http://10.43.26.206:9090, everything is OK.

CroutonDigital commented 1 year ago
(Two screenshots attached: "Screenshot 2023-09-13 at 18 56 14" and "Screenshot 2023-09-13 at 18 55 47".)
CroutonDigital commented 1 year ago

The first screenshot shows a request to the pod, the second a request to the k8s service.

M4t7e commented 1 year ago

Hey @CroutonDigital, I was not able to get the full picture from the information you provided, and maybe I went down the wrong path trying to follow your description. Can you try to be as precise as possible about the scenarios, especially about from where you are trying to reach what?

Here is what I understood so far, but please read it carefully and correct/add missing information:

When I connect from inside the OpenVPN pod with curl http://10.43.26.206:9090, everything is OK.

OVPN Pod > Service (Web-App) > Web-App Pod: OK

The OpenVPN client can connect to a pod using the pod IP and port [...]

OVPN Client > OVPN Pod > Web-App Pod: OK

[...] but when it uses the service IP and service port, the connection times out

OVPN Client > OVPN Pod > Service (Web-App) > Web-App Pod: Fail

Please answer the following points:

CroutonDigital commented 1 year ago

Route path:

Not working: OVPN CLIENT (OS X, VPN IP in the 10.8.0.0/24 range) > LOAD BALANCER TCP 1194 > k8s > OVPN POD (10.42.7.42) > GRPC BACKEND SERVICE (10.43.26.206:9090) > GRPC BACKEND POD (10.42.7.207:9090)

Working: OVPN CLIENT (OS X, VPN IP in the 10.8.0.0/24 range) > LOAD BALANCER TCP 1194 > k8s > OVPN POD (10.42.7.42) > GRPC BACKEND POD (10.42.7.207:9090)

OVPN Client: 10.8.0.0/24 range (masqueraded to the OVPN pod IP 10.42.7.42 via an iptables SNAT rule)
OVPN Pod: 10.42.7.42
Node (OVPN): 10.1.0.101
Web-App Pod: 10.42.7.207
Service (Web-App): 10.43.26.206

vpn.pcap.zip (trace taken on the OVPN pod)

The entrypoint.sh inside the OVPN pod:

#!/bin/bash

set -e

mkdir -p /dev/net
mknod /dev/net/tun c 10 200
chmod 600 /dev/net/tun

# SNAT traffic from VPN clients (10.8.0.0/24) to the pod's eth0 address,
# so replies from cluster workloads come back to this pod
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

/bin/sleep 2
openvpn --config /opt/openvpn/server.conf

This OVPN pod is used on our current GKE cluster and works fine there.

CroutonDigital commented 1 year ago

Service definition:

resource "kubernetes_service" "binance_test_futures_client" {
  metadata {
    name = "binance-test-futures-client"
    namespace = local.namespace_name
  }
  spec {
    port {
      port        = 9090
      target_port = 9090
      protocol    = "TCP"
    }
    selector = {
      app = kubernetes_daemonset.binance_test_futures_client.metadata[0].labels.app
    }
  }
}
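A quick sanity check (not shown here) is whether the Service actually has endpoints, i.e. whether the selector matches the DaemonSet pods; <namespace> is a placeholder for local.namespace_name:

kubectl -n <namespace> get endpoints binance-test-futures-client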
M4t7e commented 1 year ago

Thx @CroutonDigital!

I can't see that the service LB is even trying to act on the masqueraded packet destined to the cluster IP. Can you please try it again with the following Cilium configuration?

ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
socketLB:
  hostNamespaceOnly: true
egressGateway:
  enabled: true
MTU: 1450

This will skip the socket LB for services when inside a pod namespace, in favor of the service LB at the pod interface (tc load balancer).
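If it helps, one way to verify that the option actually reached the agent configuration (the config key name is assumed from how the Helm chart maps socketLB.hostNamespaceOnly):

kubectl -n kube-system get configmap cilium-config -o yaml | grep -i hostns
# expected, roughly: bpf-lb-sock-hostns-only: "true"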

CroutonDigital commented 1 year ago

I changed the Cilium values in kube.tf:

  cilium_values = <<EOT
ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
socketLB:
  hostNamespaceOnly: true
egressGateway:
  enabled: true
MTU: 1450
EOT

Same issue. I also have an additional VM attached to the network, and it has the same problem communicating with services.

I took a dump from the OVPN pod: dump.pcap.zip

OVPN Client: utun3 inet 10.8.0.6 --> 10.8.0.5
OVPN Pod: eth0 10.42.3.107 | tun0 10.8.0.1
Node (OVPN): 10.1.0.102
Web-App Pod: 10.42.2.158
Service (Web-App): 10.43.26.206

M4t7e commented 1 year ago

Hey @CroutonDigital, that's strange. Are you sure the new configuration was applied successfully? You can enforce it with the following command: kubectl -n kube-system rollout restart daemonset/cilium
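To wait until the restart has rolled out on every node (plain kubectl, nothing module-specific):

kubectl -n kube-system rollout status daemonset/cilium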

Here is a new version that explicitly allows external access to Cluster IPs (bpf.lbExternalClusterIP: true). IMHO this should not be necessary, as the source IP should be SNATed to the Pod IP, but just in case Cilium is smarter than I thought.

ipam:
  mode: kubernetes
k8s:
  requireIPv4PodCIDR: true
kubeProxyReplacement: true
routingMode: native
ipv4NativeRoutingCIDR: "10.0.0.0/8"
endpointRoutes:
  enabled: true
loadBalancer:
  acceleration: native
bpf:
  masquerade: true
  lbExternalClusterIP: true
socketLB:
  hostNamespaceOnly: true
egressGateway:
  enabled: true
MTU: 1450

I also have an additional VM attached to the network, and it has the same problem communicating with services.

This will not work out of the box, because the underlying network does not know where to route the service CIDR traffic. Cluster IPs do not belong to a single node; they are only known virtually by the service load balancers inside the nodes.
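If the VM really needs to reach ClusterIPs, it would at minimum need a route for the service CIDR pointing at one of the nodes, plus external ClusterIP access enabled on the cluster (that is what bpf.lbExternalClusterIP is about). A rough, unverified sketch, using the node IP from your last message and the default k3s service CIDR:

# on the external VM
ip route add 10.43.0.0/16 via 10.1.0.102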

Here are some more points you can try/verify:

CroutonDigital commented 1 year ago

@M4t7e it seems to have worked after kubectl -n kube-system rollout restart daemonset/cilium )))))

I will test it today and write feedback for you!

CroutonDigital commented 1 year ago

Yes, everything works fine.

Thank you!