youwalther65 opened 5 months ago
Good idea, and it would be useful for any topology-aware mechanism, including the recent Kubernetes 1.30 Service Traffic Distribution feature.
According to the Cilium 1.16 release blog, section "Filtering Hubble flows by node labels", filtering by the node label topology.kubernetes.io/zone
is now possible.
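For reference, a query of that kind should look roughly like this (going by the release notes; I am assuming the filter flag is called --node-label):
hubble observe --node-label topology.kubernetes.io/zone=eu-central-1b -o json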
I can see the node labels in the CLI JSON output, for example:
{
"flow": {
"time": "2024-08-20T13:44:39.452593052Z",
"uuid": "87b881cf-6afd-4b81-9c0e-a095ba251769",
"verdict": "FORWARDED",
"ethernet": {
"source": "b6:f8:6f:cf:05:30",
"destination": "86:32:eb:cf:a9:f7"
},
"IP": {
"source": "10.128.0.158",
"destination": "10.128.2.85",
"ipVersion": "IPv4"
},
"l4": {
"UDP": {
"source_port": 41470,
"destination_port": 53
}
},
"source": {
"identity": 9456,
"cluster_name": "kind-cilium-migration",
"namespace": "default",
"labels": [
"k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default",
"k8s:io.cilium.k8s.policy.cluster=kind-cilium-migration",
"k8s:io.cilium.k8s.policy.serviceaccount=default",
"k8s:io.kubernetes.pod.namespace=default",
"k8s:run=tmp-shell"
],
"pod_name": "tmp-shell"
},
"destination": {
"ID": 4071,
"identity": 21731,
"namespace": "kube-system",
"labels": [
"k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system",
"k8s:io.cilium.k8s.policy.cluster=kind-cilium-migration",
"k8s:io.cilium.k8s.policy.serviceaccount=coredns",
"k8s:io.kubernetes.pod.namespace=kube-system",
"k8s:k8s-app=kube-dns"
],
"pod_name": "coredns-7db6d8ff4d-hvmhb",
"workloads": [
{
"name": "coredns",
"kind": "Deployment"
}
]
},
"Type": "L3_L4",
"node_name": "kind-cilium-migration/cilium-migration-worker2",
"node_labels": [
"beta.kubernetes.io/arch=arm64",
"beta.kubernetes.io/os=linux",
"io.cilium.migration/cilium-default=true",
"kubernetes.io/arch=arm64",
"kubernetes.io/hostname=cilium-migration-worker2",
"kubernetes.io/os=linux",
"topology.kubernetes.io/zone=eu-central-1b"
],
"event_type": {
"type": 4
},
"traffic_direction": "EGRESS",
"trace_observation_point": "TO_ENDPOINT",
"trace_reason": "NEW",
"is_reply": false,
"interface": {
"index": 33,
"name": "lxc0bf9b51842a1"
},
"Summary": "UDP"
},
"node_name": "kind-cilium-migration/cilium-migration-worker2",
"time": "2024-08-20T13:44:39.452593052Z"
}
What does flow.node_labels refer to, the source or the destination node?
coredns-7db6d8ff4d-hvmhb runs on cilium-migration-worker2 with label topology.kubernetes.io/zone=eu-central-1b
tmp-shell runs on cilium-migration-worker with label topology.kubernetes.io/zone=eu-central-1a
So I would assume that the node_labels shown, since they include topology.kubernetes.io/zone=eu-central-1b, are those of the destination node.
However, when looking at the reply, the label is the same, even though the destination is now in eu-central-1a:
{
"flow": {
"time": "2024-08-20T13:44:39.455688552Z",
"uuid": "bcd4672c-0b6f-4a0a-9b37-e6fe481bfeb2",
"verdict": "FORWARDED",
"ethernet": {
"source": "86:32:eb:cf:a9:f7",
"destination": "b6:f8:6f:cf:05:30"
},
"IP": {
"source": "10.128.2.85",
"destination": "10.128.0.158",
"ipVersion": "IPv4"
},
"l4": {
"UDP": {
"source_port": 53,
"destination_port": 41470
}
},
"source": {
"ID": 4071,
"identity": 21731,
"namespace": "kube-system",
"labels": [
"k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=kube-system",
"k8s:io.cilium.k8s.policy.cluster=kind-cilium-migration",
"k8s:io.cilium.k8s.policy.serviceaccount=coredns",
"k8s:io.kubernetes.pod.namespace=kube-system",
"k8s:k8s-app=kube-dns"
],
"pod_name": "coredns-7db6d8ff4d-hvmhb",
"workloads": [
{
"name": "coredns",
"kind": "Deployment"
}
]
},
"destination": {
"identity": 9456,
"cluster_name": "kind-cilium-migration",
"namespace": "default",
"labels": [
"k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default",
"k8s:io.cilium.k8s.policy.cluster=kind-cilium-migration",
"k8s:io.cilium.k8s.policy.serviceaccount=default",
"k8s:io.kubernetes.pod.namespace=default",
"k8s:run=tmp-shell"
],
"pod_name": "tmp-shell"
},
"Type": "L3_L4",
"node_name": "kind-cilium-migration/cilium-migration-worker2",
"node_labels": [
"beta.kubernetes.io/arch=arm64",
"beta.kubernetes.io/os=linux",
"io.cilium.migration/cilium-default=true",
"kubernetes.io/arch=arm64",
"kubernetes.io/hostname=cilium-migration-worker2",
"kubernetes.io/os=linux",
"topology.kubernetes.io/zone=eu-central-1b"
],
"reply": true,
"event_type": {
"type": 4,
"sub_type": 4
},
"traffic_direction": "INGRESS",
"trace_observation_point": "TO_OVERLAY",
"trace_reason": "REPLY",
"is_reply": true,
"interface": {
"index": 15,
"name": "cilium_vxlan"
},
"Summary": "UDP"
},
"node_name": "kind-cilium-migration/cilium-migration-worker2",
"time": "2024-08-20T13:44:39.455688552Z"
}
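To compare the two flows more easily, I pulled out just the reply flag, the reporting node and the zone label with jq; a quick sketch against the JSON shape shown above:
hubble observe -o json | jq -r '.flow | [(.is_reply|tostring), .node_name, (.node_labels[] | select(startswith("topology.kubernetes.io/zone")))] | @tsv'
Both the request and the reply come out with cilium-migration-worker2 / eu-central-1b.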
Traffic to and from the public internet also carries the same node_labels in both directions:
{
"flow": {
"time": "2024-08-20T13:42:18.455358584Z",
"uuid": "d5dab447-5c3f-4cb5-ac6e-3ee28b20461e",
"verdict": "FORWARDED",
"ethernet": {
"source": "56:0c:5b:59:0a:1f",
"destination": "ce:b5:4e:66:6c:6b"
},
"IP": {
"source": "10.128.0.158",
"destination": "142.250.181.195",
"ipVersion": "IPv4"
},
"l4": {
"TCP": {
"source_port": 56304,
"destination_port": 80,
"flags": {
"SYN": true
}
}
},
"source": {
"ID": 1703,
"identity": 9456,
"namespace": "default",
"labels": [
"k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default",
"k8s:io.cilium.k8s.policy.cluster=kind-cilium-migration",
"k8s:io.cilium.k8s.policy.serviceaccount=default",
"k8s:io.kubernetes.pod.namespace=default",
"k8s:run=tmp-shell"
],
"pod_name": "tmp-shell"
},
"destination": {
"identity": 2,
"labels": [
"reserved:world"
]
},
"Type": "L3_L4",
"node_name": "kind-cilium-migration/cilium-migration-worker",
"node_labels": [
"beta.kubernetes.io/arch=arm64",
"beta.kubernetes.io/os=linux",
"io.cilium.migration/cilium-default=true",
"kubernetes.io/arch=arm64",
"kubernetes.io/hostname=cilium-migration-worker",
"kubernetes.io/os=linux",
"topology.kubernetes.io/zone=eu-central-1a"
],
"event_type": {
"type": 4,
"sub_type": 3
},
"traffic_direction": "EGRESS",
"trace_observation_point": "TO_STACK",
"trace_reason": "NEW",
"is_reply": false,
"Summary": "TCP Flags: SYN"
},
"node_name": "kind-cilium-migration/cilium-migration-worker",
"time": "2024-08-20T13:42:18.455358584Z"
}
{
"flow": {
"time": "2024-08-20T13:42:18.456015417Z",
"uuid": "0736d96c-ac3e-4c64-89f7-cee617a3d840",
"verdict": "FORWARDED",
"ethernet": {
"source": "ce:b5:4e:66:6c:6b",
"destination": "56:0c:5b:59:0a:1f"
},
"IP": {
"source": "142.250.181.195",
"destination": "10.128.0.158",
"ipVersion": "IPv4"
},
"l4": {
"TCP": {
"source_port": 80,
"destination_port": 56304,
"flags": {
"SYN": true,
"ACK": true
}
}
},
"source": {
"identity": 2,
"labels": [
"reserved:world"
]
},
"destination": {
"ID": 1703,
"identity": 9456,
"namespace": "default",
"labels": [
"k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default",
"k8s:io.cilium.k8s.policy.cluster=kind-cilium-migration",
"k8s:io.cilium.k8s.policy.serviceaccount=default",
"k8s:io.kubernetes.pod.namespace=default",
"k8s:run=tmp-shell"
],
"pod_name": "tmp-shell"
},
"Type": "L3_L4",
"node_name": "kind-cilium-migration/cilium-migration-worker",
"node_labels": [
"beta.kubernetes.io/arch=arm64",
"beta.kubernetes.io/os=linux",
"io.cilium.migration/cilium-default=true",
"kubernetes.io/arch=arm64",
"kubernetes.io/hostname=cilium-migration-worker",
"kubernetes.io/os=linux",
"topology.kubernetes.io/zone=eu-central-1a"
],
"reply": true,
"event_type": {
"type": 4
},
"traffic_direction": "EGRESS",
"trace_observation_point": "TO_ENDPOINT",
"trace_reason": "REPLY",
"is_reply": true,
"interface": {
"index": 39,
"name": "lxcd05d3f2b91a5"
},
"Summary": "TCP Flags: SYN, ACK"
},
"node_name": "kind-cilium-migration/cilium-migration-worker",
"time": "2024-08-20T13:42:18.456015417Z"
}
Any help interpreting this? Ideally I want to be able to export these logs to analyse traffic going from nodes in AZ A to nodes in AZ B, or traffic coming from and going to the public internet.
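The closest I have come so far is bucketing exported flows by the zone found in node_labels plus the traffic direction, assuming node_labels is the label set of the node that reported the flow; a rough sketch:
hubble observe --last 1000 -o json | jq -r '.flow | [(.node_labels[]? | select(startswith("topology.kubernetes.io/zone"))), .traffic_direction] | @tsv' | sort | uniq -c
That only tells me which zone observed a flow and in which direction, not whether the flow actually crossed an AZ boundary, which is exactly why the meaning of node_labels matters here.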
Alright, #34133 provides some guidance on the interpretation, and the feature seems like just what I need, so I will follow this issue closely. Thanks for the great work! :)
Cilium Feature Proposal
Is your proposed feature related to a problem?
For this example I will use AWS and Amazon EKS naming conventions; similar concepts might apply to other cloud vendors as well. For resiliency, creating EKS worker nodes in multiple availability zones (AZs) is a best practice, so traffic from pod to pod will often cross AZ boundaries, and cross-AZ traffic incurs cost. Especially with Cilium overlay networking, visibility into cross-AZ traffic is not possible with AWS features like VPC Flow Logs, and even the Hubble CLI/UI is currently missing this capability.
Describe the feature you'd like
EKS worker nodes come with the well-known Kubernetes label topology.kubernetes.io/zone by default, for example topology.kubernetes.io/zone: eu-west-1b.
Embedding this information into flows as labels would make it possible to use Hubble queries with --from-label and --to-label and thus see flows crossing AZs. Using this, one could identify the applications involved. I am still not sure whether it is possible to count packets/bytes somehow to get a view of total traffic over time, which would help identify the top talkers.
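To illustrate, a purely hypothetical query that this proposal would enable (it does not work today, since node zone labels are not part of the endpoint labels these filters match):
hubble observe --from-label topology.kubernetes.io/zone=eu-west-1a --to-label topology.kubernetes.io/zone=eu-west-1b
This would show only the flows leaving AZ eu-west-1a for AZ eu-west-1b.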