rooque opened this issue 3 months ago
Hi, I spent some time with some of the cilium maintainers to understand this issue better. Here's what has been found so far.
The fqdn cache keeps two kinds of records: `lookup` records, which are created when a name is resolved to an IP by an external DNS server and come with a TTL; and `connection` records, which track active connections from pods in the datapath and do not have a TTL set.
In your fqdn cache it seems like there are no `lookup` records because the TTL has expired; however, you still have `connection` records because the application is still communicating externally. We expect that your application is keeping a long-lived connection and not making a subsequent DNS lookup. We couldn't see any DNS lookups from the pods in the Hubble flows from the sysdump, and there are no SYN flows to those identities either.
Below is a quick capture from my lab showing the `lookup` records in the fqdn cache.
```
k exec -n kube-system cilium-7hdqd -it -- cilium-dbg fqdn cache list
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
Endpoint   Source       FQDN                                                                      TTL   ExpirationTime             IPs
746        connection   jobs-app-kafka-brokers.tenant-jobs.svc.cluster.local.                     0     0001-01-01T00:00:00.000Z   10.244.2.48
1279       lookup       api.github.com.                                                           2     2024-04-04T15:48:23.486Z   140.82.121.5
1279       connection   loader.tenant-jobs.svc.cluster.local.                                     0     0001-01-01T00:00:00.000Z   10.109.69.105
1279       connection   api.github.com.                                                           0     0001-01-01T00:00:00.000Z   140.82.121.6
602        connection   elasticsearch-master.tenant-jobs.svc.cluster.local.                       0     0001-01-01T00:00:00.000Z   10.102.40.97
350        connection   jobs-app-zookeeper-client.tenant-jobs.svc.cluster.local.                  0     0001-01-01T00:00:00.000Z   10.104.222.238
350        connection   jobs-app-kafka-0.jobs-app-kafka-brokers.tenant-jobs.svc.cluster.local.    0     0001-01-01T00:00:00.000Z   10.244.2.48
```
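For illustration only, here's a minimal Go sketch of how such cache entries could be modelled. The `fqdnCacheEntry` type, its fields, and the `isExpired` helper are hypothetical and are not Cilium's actual implementation; the point is just that `lookup` rows carry a real TTL/expiration, while `connection` rows have a zero expiration, which is why they print `0` and `0001-01-01T00:00:00.000Z` above.

```go
package main

import (
	"fmt"
	"time"
)

// fqdnCacheEntry is a hypothetical model of one row in the output above.
// "lookup" entries carry the TTL/expiration returned by the DNS server;
// "connection" entries only mirror an active datapath connection and have
// no TTL, so their expiration stays at the zero time.
type fqdnCacheEntry struct {
	Endpoint   int
	Source     string // "lookup" or "connection"
	FQDN       string
	TTL        int
	Expiration time.Time
	IP         string
}

// isExpired reports whether an entry's TTL has run out.
// Connection entries have a zero Expiration and are never "fresh".
func (e fqdnCacheEntry) isExpired(now time.Time) bool {
	return e.Expiration.IsZero() || now.After(e.Expiration)
}

func main() {
	now := time.Now()
	entries := []fqdnCacheEntry{
		{1279, "lookup", "api.github.com.", 2, now.Add(2 * time.Second), "140.82.121.5"},
		{1279, "connection", "api.github.com.", 0, time.Time{}, "140.82.121.6"},
	}
	for _, e := range entries {
		fmt.Printf("%-10s %-20s expired=%v\n", e.Source, e.FQDN, e.isExpired(now))
	}
}
```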
We looked into the Hubble code that pulls the IP/FQDN from the cache. At the moment it seems like the behaviour is working as expected, or rather, as coded. However, we think there is an opportunity to improve this behaviour.
Using the `lookup` field looks like the right thing to do, because with the `connection` field there is no guarantee that the FQDN and IP remain the same throughout a long-lived connection; the DNS record could be updated to a new IP address during that time.
However, because of that, the situation you have logged arises. We could fall back to using the `connection` item if there is no `lookup` item, and flag this in the `hubble observe` output in some way, so that you know it's a best-effort FQDN printout.
That would look something like this (example with an `*`):

```
Apr 2 15:56:45.180: services/poc-bff-66f46654c-7565m:45148 (ID:103419) -> redis.sandbox.whitelabel.com.br*:6379 (ID:16777220) to-stack FORWARDED (TCP Flags: ACK)
```
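For context, here is a minimal sketch (not the actual Hubble code) of what that fallback could look like, assuming a simplified, hypothetical `cacheEntry` model and `resolveFQDN` helper: prefer a non-expired `lookup` entry, otherwise fall back to a `connection` entry and append `*` so the name is visibly best-effort, and only print `world` when there is no cache entry at all.

```go
package main

import (
	"fmt"
	"time"
)

// cacheEntry is a hypothetical, simplified fqdn cache record.
type cacheEntry struct {
	Source     string // "lookup" or "connection"
	FQDN       string
	IP         string
	Expiration time.Time // zero for connection entries
}

// resolveFQDN returns the name to print for an IP. A non-expired lookup
// entry wins; otherwise we fall back to a connection entry and append
// "*" so the output is visibly best-effort.
func resolveFQDN(ip string, cache []cacheEntry, now time.Time) string {
	var fallback string
	for _, e := range cache {
		if e.IP != ip {
			continue
		}
		if e.Source == "lookup" && now.Before(e.Expiration) {
			return e.FQDN // fresh lookup: authoritative answer
		}
		if e.Source == "connection" && fallback == "" {
			fallback = e.FQDN + "*" // best-effort: the record may have moved since
		}
	}
	if fallback != "" {
		return fallback
	}
	return "world" // no cache entry at all: today's behaviour
}

func main() {
	now := time.Now()
	cache := []cacheEntry{
		{Source: "connection", FQDN: "api.github.com.", IP: "140.82.121.6"},
	}
	// The lookup has expired, but the long-lived connection is still tracked,
	// so we print "api.github.com.*" instead of "world".
	fmt.Println(resolveFQDN("140.82.121.6", cache, now))
}
```

In the `hubble observe` example above, that is what renders the destination as `redis.sandbox.whitelabel.com.br*:6379` rather than an IP labelled `world`.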
In this scenario, where the TTL has expired for the `lookup` item in the cache, what would you like to happen?
Hello @saintdle!
What I expect is to see those FQDNs in Hubble and not "world", even for long-lived connections. If the lookup has expired but there is still a connection, it should show the FQDN and not "world".
Does that make sense?
P.S. Sorry for the delay.
Yes, sure. So in that case, when the FQDN is shown but it comes from a connection record (because the TTL of the lookup has expired), it would be best to mark it as such.
Hey @rooque, just wondering if you have any issues with egress `toFQDNs` policies because of that. I see the same behavior in Hubble, and when I tried to create a CNP using `matchName` I started getting dropped packets, even though the hostname was allowed by the policy. Not saying it's related, but it sorta makes sense that Cilium would drop `reserved:world` packets.
@rooque I stumbled on this in the docs, maybe it's useful for this use case in the meantime: https://docs.cilium.io/en/latest/contributing/development/debugging/#unintended-dns-policy-drops
I'm having some strange behavior with FQDN destinations in Hubble. I have 2 FQDNs that 2 pods connect to:
At first, the FQDNs appear correctly both in the service map and in the list. After a few moments, Hubble starts treating the FQDNs' IPs as "world" (see images).
Images
The problem is not just in the UI; it's the same in the CLI.
NetworkPolicy
The FQDNs always show in these lists, even after Hubble starts showing them as world.
Also: every time I restart the pods, it shows the FQDNs for a very short time... Then it starts showing them as world again.
Cilium Config
Client: 1.15.2 7cf57829 2024-03-13T15:34:43+02:00 go version go1.21.8 linux/amd64
Daemon: 1.15.2 7cf57829 2024-03-13T15:34:43+02:00 go version go1.21.8 linux/amd64

```yaml
agent-not-ready-taint-key: node.cilium.io/agent-not-ready
arping-refresh-period: 30s
auto-direct-node-routes: 'false'
bpf-lb-acceleration: disabled
bpf-lb-external-clusterip: 'false'
bpf-lb-map-max: '65536'
bpf-lb-sock: 'false'
bpf-map-dynamic-size-ratio: '0.0025'
bpf-policy-map-max: '16384'
bpf-root: /sys/fs/bpf
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: '1'
cluster-name: gke-1
cni-exclusive: 'true'
cni-log-file: /var/run/cilium/cilium-cni.log
controller-group-metrics: write-cni-file sync-host-ips sync-lb-maps-with-k8s-services
custom-cni-conf: 'false'
debug: 'false'
debug-verbose: ''
dnsproxy-enable-transparent-mode: 'true'
egress-gateway-reconciliation-trigger-interval: 1s
enable-auto-protect-node-port-range: 'true'
enable-bgp-control-plane: 'false'
enable-bpf-clock-probe: 'false'
enable-endpoint-health-checking: 'true'
enable-endpoint-routes: 'true'
enable-envoy-config: 'true'
enable-external-ips: 'false'
enable-health-check-loadbalancer-ip: 'true'
enable-health-check-nodeport: 'true'
enable-health-checking: 'true'
enable-host-port: 'false'
enable-hubble: 'true'
enable-hubble-open-metrics: 'true'
enable-ipv4: 'true'
enable-ipv4-big-tcp: 'false'
enable-ipv4-masquerade: 'true'
enable-ipv6: 'false'
enable-ipv6-big-tcp: 'false'
enable-ipv6-masquerade: 'true'
enable-k8s-networkpolicy: 'true'
enable-k8s-terminating-endpoint: 'true'
enable-l2-neigh-discovery: 'true'
enable-l7-proxy: 'true'
enable-local-redirect-policy: 'false'
enable-masquerade-to-route-source: 'false'
enable-metrics: 'true'
enable-node-port: 'false'
enable-policy: default
enable-remote-node-identity: 'true'
enable-sctp: 'false'
enable-svc-source-range-check: 'true'
enable-vtep: 'false'
enable-well-known-identities: 'false'
enable-wireguard: 'true'
enable-xt-socket-fallback: 'true'
external-envoy-proxy: 'true'
hubble-disable-tls: 'false'
hubble-export-file-max-backups: '5'
hubble-export-file-max-size-mb: '10'
hubble-listen-address: ':4244'
hubble-metrics: >-
  dns drop tcp flow port-distribution icmp
  httpV2:exemplars=true;labelsContext=source_ip,source_namespace,source_workload,destination_ip,destination_namespace,destination_workload,traffic_direction
hubble-metrics-server: ':9965'
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
identity-gc-interval: 15m0s
identity-heartbeat-timeout: 30m0s
install-no-conntrack-iptables-rules: 'false'
ipam: kubernetes
ipam-cilium-node-update-rate: 15s
ipv4-native-routing-cidr: 10.0.0.0/18
k8s-client-burst: '20'
k8s-client-qps: '10'
kube-proxy-replacement: 'false'
kube-proxy-replacement-healthz-bind-address: ''
loadbalancer-l7: envoy
loadbalancer-l7-algorithm: round_robin
loadbalancer-l7-ports: ''
max-connected-clusters: '255'
mesh-auth-enabled: 'true'
mesh-auth-gc-interval: 5m0s
mesh-auth-queue-size: '1024'
mesh-auth-rotated-identities-queue-size: '1024'
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
node-port-bind-protection: 'true'
nodes-gc-interval: 5m0s
operator-api-serve-addr: 127.0.0.1:9234
operator-prometheus-serve-addr: ':9963'
policy-cidr-match-mode: ''
preallocate-bpf-maps: 'false'
procfs: /host/proc
prometheus-serve-addr: ':9962'
proxy-connect-timeout: '2'
proxy-max-connection-duration-seconds: '0'
proxy-max-requests-per-connection: '0'
remove-cilium-node-taints: 'true'
routing-mode: native
service-no-backend-response: reject
set-cilium-is-up-condition: 'true'
set-cilium-node-taints: 'true'
sidecar-istio-proxy-image: cilium/istio_proxy
skip-cnp-status-startup-clean: 'false'
synchronize-k8s-nodes: 'true'
tofqdns-dns-reject-response-code: refused
tofqdns-enable-dns-compression: 'true'
tofqdns-endpoint-max-ip-per-hostname: '50'
tofqdns-idle-connection-grace-period: 0s
tofqdns-max-deferred-connection-deletes: '10000'
tofqdns-proxy-response-max-delay: 100ms
unmanaged-pod-watcher-interval: '15'
vtep-cidr: ''
vtep-endpoint: ''
vtep-mac: ''
vtep-mask: ''
wireguard-persistent-keepalive: 0s
write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
```
SysDump
[cilium-sysdump-20240403-174046.zip](https://github.com/cilium/hubble/files/14858310/cilium-sysdump-20240403-174046.zip)