projectcalico / calico

Cloud native networking and network security
https://docs.tigera.io/calico/latest/about/
Apache License 2.0

ebpf: host networked client cannot access external IP via a service without CTLB #8545

Closed tomastigera closed 1 month ago

tomastigera commented 8 months ago
> The issue is considered fixed since 3.26 and the fix is part of the default setting in 3.27

@tomastigera I tested version 3.27 with the following configuration to see whether the commit resolves the Istio compatibility issue.

The FelixConfiguration options are set to bpfConnectTimeLoadBalancing=Disabled and bpfHostNetworkedNATWithoutCTLB=Enabled.

However, I encountered a connection timeout when the client pod was using host networking (e.g., a calico-controller node) and the destination was a service with an external IP (e.g., an API server installed outside the pod network).

This is the endpoint that has the problem.

apiVersion: v1
kind: Endpoints
metadata:
  name: kubernetes
  namespace: default
subsets:
- addresses:
  - ip: 10.10.66.11
  - ip: 10.10.66.12
  - ip: 10.10.66.13
  ports:
  - name: https
    port: 6443
    protocol: TCP

The other cases (host-network -> svc, pod -> svc of pod) are OK.

Originally posted by @zoftdev in https://github.com/projectcalico/calico/issues/4509#issuecomment-1953439604

tomastigera commented 8 months ago

However, I encountered a connection timeout when the client pod was using host networking (e.g., a calico-controller node) and the destination was a service with an external IP (e.g., an API server installed outside the pod network).

You mean you were not able to connect to the external ep via a service with Istio? Or you were just not able to connect to an external ep via the service w/o Istio?

The other cases (host-network -> svc, pod -> svc of pod) are OK.

You mean that with Istio (or without?) you can make connection from host-net->svc to a pod endpoint? And you can as well connect with Istio from a pod to another pod via a service?

Afaict Istio is not really meant for host networked clients, is it? And I cannot quite see a difference (from the point of view of the dataplane) between connecting via a service to a pod ep or an external ep. In either case, we translate the dest IP and let Linux route it.
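To illustrate the point that the translation step does not care whether the backend is a pod or an external endpoint, here is a minimal sketch in Python. It is not Calico's implementation (the real dataplane does this in eBPF and tracks connections); the `SERVICES` table, the cluster IP `10.96.0.1:443`, and the `translate` function are hypothetical names chosen for illustration, while the backend addresses are the ones from the Endpoints object quoted earlier.

```python
import random
from typing import Dict, List, Tuple

Addr = Tuple[str, int]  # (IP, port)

# Hypothetical service table: one assumed cluster IP mapped to the three
# external endpoints listed earlier in this thread.
SERVICES: Dict[Addr, List[Addr]] = {
    ("10.96.0.1", 443): [
        ("10.10.66.11", 6443),
        ("10.10.66.12", 6443),
        ("10.10.66.13", 6443),
    ],
}


def translate(dst: Addr, rng: random.Random) -> Addr:
    """Return the post-NAT destination for a packet addressed to dst.

    The lookup is keyed only on the service address. Nothing here depends
    on whether the chosen backend is a pod IP or an external IP; routing
    the translated packet is left to the host.
    """
    backends = SERVICES.get(dst)
    if not backends:
        return dst  # not a service address: leave the destination untouched
    return rng.choice(backends)
```

Under these assumptions, `translate(("10.96.0.1", 443), random.Random())` returns one of the three `10.10.66.x:6443` addresses, and any non-service destination is returned unchanged.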

Would you be able to share an iptables dump (to see what rules Istio injected, if any) and the routing table from the host?

sfudeus commented 8 months ago

I might have a similar issue, trying the same configuration from #4509 with bpfConnectTimeLoadBalancing=Disabled and bpfHostNetworkedNATWithoutCTLB=Enabled.

Since then, some hosts and hostNetwork pods can no longer reach the clusterIP of the apiserver (100.72.0.1). This is unrelated to Istio (we do have Istio, but it is enabled neither for the hostNetworked pod nor for the apiserver). Is there any data I can share? Detail: this does not affect all hosts. It affects exactly those that host a pod serving as an endpoint for the clusterIP, i.e. my master nodes with apiservers on them.

edit: using 3.27.2

edit2: setting bpfConnectTimeLoadBalancing to TCP solves this problem functionally. before trying the new config, we were using the feature gate approach BPFConnectTimeLoadBalancingWorkaround=udp

tomastigera commented 8 months ago

edit2: setting bpfConnectTimeLoadBalancing to TCP solves this problem functionally. before trying the new config, we were using the feature gate approach BPFConnectTimeLoadBalancingWorkaround=udp

These two things are equivalent. bpfConnectTimeLoadBalancing=Disabled turns off the connect time LB for TCP as well. So there seems to be an issue which would probably still manifest itself with UDP.

Is the kube-apiserver host-networked or not?

sfudeus commented 8 months ago

Is the kube-apiserver host-networked or not?

Our apiserver pods are host-networked static pods.

tomastigera commented 8 months ago

@sfudeus that is a real issue and I will track it separately as it looks different to the original issue of this ticket.

bh-tt commented 6 months ago

@tomastigera I seem to be dealing with this error as well (neither hostnetwork pods nor nodes can reach the apiserver on the Service (10.96.0.1:443), but have no trouble reaching it on the actual master node IPs:6443).
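A small probe can make the symptom above reproducible from a host-networked shell: it attempts a TCP handshake to the Service address and to each backend address and reports which ones complete. This is a sketch under assumptions, not a Calico tool; `can_connect` and `compare_paths` are hypothetical helper names, and the backend IP in the usage comment is a placeholder that must be replaced with a real master node IP.

```python
import socket
from typing import List, Tuple

Addr = Tuple[str, int]  # (IP, port)


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


def compare_paths(service: Addr, backends: List[Addr],
                  timeout: float = 3.0) -> List[Tuple[str, bool]]:
    """Probe the service address first, then each backend directly."""
    results = [(f"service {service[0]}:{service[1]}",
                can_connect(service[0], service[1], timeout))]
    for host, port in backends:
        results.append((f"backend {host}:{port}",
                        can_connect(host, port, timeout)))
    return results


# Example (run on an affected node; the backend IP is a placeholder):
# for label, ok in compare_paths(("10.96.0.1", 443), [("<master-node-ip>", 6443)]):
#     print(label, "ok" if ok else "FAILED")
```

If the service line fails while every backend line succeeds, the problem is in the service translation path rather than in plain routing to the apiserver hosts.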

Setup:

The hostnetwork program with problems is the istio CNI plugin, which simply goes to 10.96.0.1:443 in its kubeconfig and has no way to permanently set it to anything else like the HA apiserver address. This results in new pods hanging forever as the istio CNI cannot obtain the information it needs to configure the pod network.

If I set bpfConnectTimeLoadBalancing back to TCP and bpfHostNetworkedNATWithoutCTLB to Disabled, it works again. However, I need the new settings enabled because I'm running into Istio sidecar issues: it appears that the client-side sidecar does not recognize that traffic is going to another pod with an Istio sidecar, so it doesn't send a client certificate. The docs suggest these two parameters can fix this by letting Istio, rather than eBPF, do the Service load balancing.

FelixConfig:

spec:
  bpfConnectTimeLoadBalancing: Disabled
  bpfExternalServiceMode: DSR
  bpfHostNetworkedNATWithoutCTLB: Enabled
  bpfLogLevel: ""
  failsafeInboundHostPorts:
  - net: ""
    port: 22
    protocol: tcp
  - net: ""
    port: 6443
    protocol: tcp
  floatingIPs: Disabled
  healthPort: 9099
  logSeverityFile: Warning
  logSeverityScreen: Warning
  logSeveritySys: Warning
  prometheusMetricsEnabled: true
  reportingInterval: 0s
  vxlanVNI: 4096
  wireguardEnabled: true