Closed. ycydk closed this issue 1 year ago.
Hi @ycydk and thank you for reporting the issue. Could you please attach a sysdump from the environment where you noticed this?
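In case it helps, a sysdump can typically be collected with the cilium CLI, assuming cilium-cli is installed and kubectl points at the affected cluster (the node name below is a placeholder):

cilium sysdump
# optionally limit log collection to the affected node
cilium sysdump --node-list worker-1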
Hi @pippolo84, thank you for your reply. I got the sysdump file from my environment. I also found that if I use the following config, it works fine.
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: "true"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-lb-sock: "true"
bpf-lb-sock-hostns-only: "true"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: "0"
cluster-name: default
custom-cni-conf: "false"
debug: "false"
disable-cnp-status-updates: "true"
disable-endpoint-crd: "false"
enable-auto-protect-node-port-range: "true"
enable-bandwidth-manager: "true"
enable-bgp-control-plane: "false"
enable-bpf-clock-probe: "true"
enable-bpf-masquerade: "true"
enable-bpf-tproxy: "true"
enable-endpoint-health-checking: "true"
enable-endpoint-routes: "false"
enable-envoy-config: "true"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-host-firewall: "false"
enable-host-legacy-routing: "false"
enable-host-port: "false"
enable-hubble: "true"
enable-ingress-controller: "true"
enable-ingress-secrets-sync: "true"
enable-ipv4: "true"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-masquerade: "true"
enable-k8s-terminating-endpoint: "true"
enable-l2-neigh-discovery: "true"
enable-l7-proxy: "true"
enable-local-node-route: "true"
enable-local-redirect-policy: "false"
enable-node-port: "false"
enable-policy: default
enable-remote-node-identity: "true"
enable-svc-source-range-check: "true"
enable-vtep: "false"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
enforce-ingress-https: "true"
hubble-disable-tls: "false"
hubble-listen-address: :4244
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
ingress-lb-annotation-prefixes: service.beta.kubernetes.io service.kubernetes.io cloud.google.com
ingress-secrets-namespace: cilium-secrets
install-iptables-rules: "true"
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
ipv4-native-routing-cidr: 172.20.0.0/16
kube-proxy-replacement: strict
kube-proxy-replacement-healthz-bind-address: ""
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
node-port-bind-protection: "true"
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
preallocate-bpf-maps: "false"
procfs: /host/proc
remove-cilium-node-taints: "true"
set-cilium-is-up-condition: "true"
sidecar-istio-proxy-image: cilium/istio_proxy
synchronize-k8s-nodes: "true"
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: "true"
tofqdns-endpoint-max-ip-per-hostname: "50"
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: "10000"
tofqdns-min-ttl: "3600"
tofqdns-proxy-response-max-delay: 100ms
tunnel: disabled
unmanaged-pod-watcher-interval: "15"
vtep-cidr: ""
vtep-endpoint: ""
vtep-mac: ""
vtep-mask: ""
Hi @ycydk, thank you for the additional info.
I'll leave here the differences between the two configs you reported:
As Fabio highlighted (no pun intended), there are several differences between the two setups. Could you check whether enabling just enable-bpf-masquerade makes a difference?
It would also be useful to have a packet trace of the failing request under the faulty config, with cilium monitor or hubble observe. To collect a full trace, you probably want to disable monitor aggregation first (monitor-aggregation=none).
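A possible sequence (a sketch; the agent pod name and the service IP are placeholders):

# disable monitor aggregation, then restart the agents to apply it
kubectl -n kube-system patch configmap cilium-config --type merge -p '{"data":{"monitor-aggregation":"none"}}'
kubectl -n kube-system rollout restart ds/cilium

# watch datapath events from the agent on the affected node
kubectl -n kube-system exec -ti cilium-xxxxx -- cilium monitor
# or only the drop events
kubectl -n kube-system exec -ti cilium-xxxxx -- cilium monitor --type drop

# or follow flows toward the service with Hubble
hubble observe --to-ip 172.20.0.10 --follow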
Hi, I did some tests. With just enable-bpf-masquerade enabled, it doesn't work. With both enable-bpf-masquerade and enable-ipv4-masquerade disabled, the service mesh works. If I delete the eBPF program sec("to-netdev"), it works on this node.
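For context, the to-netdev program is attached at tc egress on the node's native device. Assuming eth0 is that device, it can be listed and, for debugging only, detached like this:

# show the BPF programs attached at egress of the native device
tc filter show dev eth0 egress
# debugging only: removes Cilium's egress program (masquerading etc.) from the device
tc filter del dev eth0 egress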
So it fails with BPF masquerading and works without it.
Is the service you are trying to reach outside the cluster? Should packets to this service be masqueraded? Are they (you can use tcpdump on the native device to confirm that)?
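For example, something like this on the node (eth0 and the pod IP are placeholders):

# if masquerading applies, the pod source IP should be rewritten to the node IP on the wire
tcpdump -ni eth0 host 172.20.1.23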
The service is in the cluster (a ClusterIP service). Packets to it should not be masqueraded. But if masquerading is disabled, packets trying to reach destinations outside the cluster are dropped. I dumped the traffic on the native device and found that packets to the svc are not masqueraded.
I'm confused. Is it failing with or without BPF masquerading?
It's failing when trying to access a service mesh svc in the cluster from a pod (not hostnetwork) with the first configuration I mentioned (with enable-ipv4-masquerade=true). With enable-bpf-masquerade=true it's failing too. But it works with the second configuration.
failing config:
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: "true"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: "0"
cluster-name: default
custom-cni-conf: "false"
debug: "false"
disable-cnp-status-updates: "true"
disable-endpoint-crd: "false"
enable-auto-protect-node-port-range: "true"
enable-bandwidth-manager: "true"
enable-bgp-control-plane: "false"
enable-bpf-clock-probe: "true"
enable-bpf-masquerade: "true"
enable-bpf-tproxy: "true"
enable-endpoint-health-checking: "true"
enable-endpoint-routes: "false"
enable-envoy-config: "true"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-host-legacy-routing: "false"
enable-hubble: "true"
enable-ingress-controller: "true"
enable-ingress-secrets-sync: "true"
enable-ipv4: "true"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-masquerade: "true"
enable-k8s-terminating-endpoint: "true"
enable-l2-neigh-discovery: "true"
enable-l7-proxy: "true"
enable-local-node-route: "true"
enable-local-redirect-policy: "false"
enable-policy: default
enable-remote-node-identity: "true"
enable-svc-source-range-check: "true"
enable-vtep: "false"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
enforce-ingress-https: "true"
hubble-disable-tls: "false"
hubble-listen-address: :4244
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
ingress-lb-annotation-prefixes: service.beta.kubernetes.io service.kubernetes.io cloud.google.com
ingress-secrets-namespace: cilium-secrets
install-iptables-rules: "true"
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
ipv4-native-routing-cidr: 172.20.0.0/16
kube-proxy-replacement: strict
kube-proxy-replacement-healthz-bind-address: ""
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
node-port-bind-protection: "true"
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
preallocate-bpf-maps: "false"
procfs: /host/proc
remove-cilium-node-taints: "true"
set-cilium-is-up-condition: "true"
sidecar-istio-proxy-image: cilium/istio_proxy
synchronize-k8s-nodes: "true"
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: "true"
tofqdns-endpoint-max-ip-per-hostname: "50"
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: "10000"
tofqdns-min-ttl: "3600"
tofqdns-proxy-response-max-delay: 100ms
tunnel: disabled
unmanaged-pod-watcher-interval: "15"
vtep-cidr: ""
vtep-endpoint: ""
vtep-mac: ""
vtep-mask: ""
working config:
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: "true"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-lb-sock: "true"
bpf-lb-sock-hostns-only: "true"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: "0"
cluster-name: default
custom-cni-conf: "false"
debug: "false"
disable-cnp-status-updates: "true"
disable-endpoint-crd: "false"
enable-auto-protect-node-port-range: "true"
enable-bandwidth-manager: "true"
enable-bgp-control-plane: "false"
enable-bpf-clock-probe: "true"
enable-bpf-masquerade: "true"
enable-bpf-tproxy: "true"
enable-endpoint-health-checking: "true"
enable-endpoint-routes: "false"
enable-envoy-config: "true"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-host-firewall: "false"
enable-host-legacy-routing: "false"
enable-host-port: "false"
enable-hubble: "true"
enable-ingress-controller: "true"
enable-ingress-secrets-sync: "true"
enable-ipv4: "true"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-masquerade: "true"
enable-k8s-terminating-endpoint: "true"
enable-l2-neigh-discovery: "true"
enable-l7-proxy: "true"
enable-local-node-route: "true"
enable-local-redirect-policy: "false"
enable-node-port: "false"
enable-policy: default
enable-remote-node-identity: "true"
enable-svc-source-range-check: "true"
enable-vtep: "false"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
enforce-ingress-https: "true"
hubble-disable-tls: "false"
hubble-listen-address: :4244
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
ingress-lb-annotation-prefixes: service.beta.kubernetes.io service.kubernetes.io cloud.google.com
ingress-secrets-namespace: cilium-secrets
install-iptables-rules: "true"
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
ipv4-native-routing-cidr: 172.20.0.0/16
kube-proxy-replacement: strict
kube-proxy-replacement-healthz-bind-address: ""
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
node-port-bind-protection: "true"
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
preallocate-bpf-maps: "false"
procfs: /host/proc
remove-cilium-node-taints: "true"
set-cilium-is-up-condition: "true"
sidecar-istio-proxy-image: cilium/istio_proxy
synchronize-k8s-nodes: "true"
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: "true"
tofqdns-endpoint-max-ip-per-hostname: "50"
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: "10000"
tofqdns-min-ttl: "3600"
tofqdns-proxy-response-max-delay: 100ms
tunnel: disabled
unmanaged-pod-watcher-interval: "15"
vtep-cidr: ""
vtep-endpoint: ""
vtep-mac: ""
vtep-mask: ""
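For completeness, a minimal way to reproduce the failing case could be (service, namespace, and pod names are placeholders):

# from a regular (non-hostNetwork) pod; per the description above, this fails with the first config
kubectl exec -ti client-pod -- curl -sv --max-time 5 http://mesh-svc.default.svc.cluster.local/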
Diff between those two configurations:
5a6,7
> bpf-lb-sock: "true"
> bpf-lb-sock-hostns-only: "true"
27a30
> enable-host-firewall: "false"
28a32
> enable-host-port: "false"
40a45
> enable-node-port: "false"
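(The diff above is standard diff(1) output; it can be reproduced by saving each config to a file, e.g.:)

diff failing-config.yaml working-config.yaml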
You've set enable-host-firewall to its default value, false, so that doesn't make any difference. I believe bpf-lb-sock, enable-host-port, and enable-node-port will be forced to true anyway because you have kube-proxy-replacement: strict (you can confirm it with the agent logs).
So the only remaining difference is bpf-lb-sock-hostns-only: true. Could you try to change only that flag to confirm it is the culprit?
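One way to double-check which features the agent actually forced on (the pod name is a placeholder):

# the KubeProxyReplacement details list Socket LB, HostPort, and NodePort state
kubectl -n kube-system exec -ti cilium-xxxxx -- cilium status --verbose | grep -A 10 'KubeProxyReplacement'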
It works with bpf-lb-sock-hostns-only: false.
config:
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: "true"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-lb-sock: "true"
bpf-lb-sock-hostns-only: "false"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: "0"
cluster-name: default
custom-cni-conf: "false"
debug: "false"
disable-cnp-status-updates: "true"
disable-endpoint-crd: "false"
enable-auto-protect-node-port-range: "true"
enable-bandwidth-manager: "true"
enable-bgp-control-plane: "false"
enable-bpf-clock-probe: "true"
enable-bpf-masquerade: "true"
enable-bpf-tproxy: "true"
enable-endpoint-health-checking: "true"
enable-endpoint-routes: "false"
enable-envoy-config: "true"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-host-firewall: "false"
enable-host-legacy-routing: "false"
enable-host-port: "false"
enable-hubble: "true"
enable-ingress-controller: "true"
enable-ingress-secrets-sync: "true"
enable-ipv4: "true"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-masquerade: "true"
enable-k8s-terminating-endpoint: "true"
enable-l2-neigh-discovery: "true"
enable-l7-proxy: "true"
enable-local-node-route: "true"
enable-local-redirect-policy: "false"
enable-node-port: "false"
enable-policy: default
enable-remote-node-identity: "true"
enable-svc-source-range-check: "true"
enable-vtep: "false"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
enforce-ingress-https: "true"
hubble-disable-tls: "false"
hubble-listen-address: :4244
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
ingress-lb-annotation-prefixes: service.beta.kubernetes.io service.kubernetes.io cloud.google.com
ingress-secrets-namespace: cilium-secrets
install-iptables-rules: "true"
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
ipv4-native-routing-cidr: 172.20.0.0/16
kube-proxy-replacement: strict
kube-proxy-replacement-healthz-bind-address: ""
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
node-port-bind-protection: "true"
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
preallocate-bpf-maps: "false"
procfs: /host/proc
remove-cilium-node-taints: "true"
set-cilium-is-up-condition: "true"
sidecar-istio-proxy-image: cilium/istio_proxy
synchronize-k8s-nodes: "true"
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: "true"
tofqdns-endpoint-max-ip-per-hostname: "50"
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: "10000"
tofqdns-min-ttl: "3600"
tofqdns-proxy-response-max-delay: 100ms
tunnel: disabled
unmanaged-pod-watcher-interval: "15"
vtep-cidr: ""
vtep-endpoint: ""
vtep-mac: ""
vtep-mask: ""
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
This issue has not seen any activity since it was marked stale. Closing.
Is there an existing issue for this?
What happened?
Cilium config:
What happened:
I create an Ingress as follows:
HTTP request from hostns (dest pod on a remote node)
HTTP request from a pod (dest pod on a remote node)
tcpdump:
Dumping the packets from Envoy to the remote pod, I found that the destination MAC address in the packet is wrong: it should be the MAC address of the remote node, but it seems to be that of the source pod's lxc device.
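The link-layer addresses can be inspected with tcpdump's -e flag and compared with the devices involved (interface names and the IP are placeholders):

# -e prints the ethernet header, i.e. the src/dst MAC of each packet
tcpdump -eni eth0 host 172.20.2.34
# compare the dst MAC with the remote node's NIC and with the local pod's lxc device
ip link show lxc1234abcd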
Cilium Version
Client: 1.12.4 6eaecaf 2022-11-16T05:45:01+00:00 go version go1.18.8 linux/amd64
Daemon: 1.12.4 6eaecaf 2022-11-16T05:45:01+00:00 go version go1.18.8 linux/amd64
Kernel Version
5.10.0-60.56.0.84.oe2203.x86_64
Kubernetes Version
v1.24.3
Sysdump
No response
Relevant log output
No response
Anything else?
No response