cilium / cilium

eBPF-based Networking, Security, and Observability
https://cilium.io
Apache License 2.0

Document macros used in the datapath #15379

Open pchaigno opened 3 years ago

pchaigno commented 3 years ago

We use a lot of different macros in the datapath to define configuration values and to enable or disable features. It might be a good idea to document all of them in, e.g., a bpf/README.md. Hopefully, a quick script can help us keep the list up to date. The option names are often not enough to clarify the semantics or intent (e.g., IPV4_LOOPBACK or SECCTX_FROM_IPCACHE). Macros defined in the code (i.e., constants) should already have comments and may not need to be part of the list.

Doing so would likely uncover a couple of inconsistencies and stale macros.
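As a rough illustration of the "quick script" idea, the sketch below lists macros that the preprocessor tests (`#ifdef`, `#ifndef`, `#if defined(...)`) but that are never `#define`d in the scanned sources, which is one plausible way to find options injected from outside the code. It is only an assumption about how such a script might look, not something that exists in the Cilium tree; the function names `external_macros` and `scan_tree` are hypothetical. It also misses macros that are only used as values (e.g., map names or sizes), so it would complement a hand-maintained list rather than replace it.

```python
import re
from pathlib import Path

# Preprocessor patterns, matched line by line.
_IFDEF_RE = re.compile(r"^\s*#\s*(?:ifdef|ifndef)\s+([A-Za-z_]\w*)")
_IF_RE = re.compile(r"^\s*#\s*(?:if|elif)\b(.*)")
_DEFINED_RE = re.compile(r"\bdefined\s*\(?\s*([A-Za-z_]\w*)")
_DEFINE_RE = re.compile(r"^\s*#\s*define\s+([A-Za-z_]\w*)")


def external_macros(sources):
    """Return sorted macros tested by the preprocessor but never #define'd.

    `sources` is an iterable of C source texts. Header guards drop out
    naturally because they are both tested and defined.
    """
    tested, defined = set(), set()
    for text in sources:
        for line in text.splitlines():
            m = _IFDEF_RE.match(line)
            if m:
                tested.add(m.group(1))
                continue
            m = _IF_RE.match(line)
            if m:
                # Collect every defined(X) / defined X on an #if or #elif line.
                tested.update(_DEFINED_RE.findall(m.group(1)))
                continue
            m = _DEFINE_RE.match(line)
            if m:
                defined.add(m.group(1))
    return sorted(tested - defined)


def scan_tree(root):
    """Apply external_macros() to every .c and .h file below `root`."""
    paths = sorted(p for p in Path(root).rglob("*") if p.suffix in {".c", ".h"})
    return external_macros(p.read_text(errors="replace") for p in paths)
```

Running something like `scan_tree("bpf")` from the repository root and diffing the result against the documented list could flag undocumented or stale options.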

aditighag commented 3 years ago

Ah, thanks for filing this issue. :)

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

pchaigno commented 3 years ago

WIP list of macros used in the datapath:

### Macros

| Macro | Values | Description |
|---|---|---|
| ACTION_UNKNOWN_ICMP6_NS | int | BPF action to apply for unknown ICMPv6 NS messages. |
| BPF_ENV_MNT | string | Name of the environment variable holding the bpffs location. |
| CALLS_MAP | string | Name of the program array for internal tail calls. |
| CAPTURE4_RULES | string | Name of the hashmap with IPv4 filters for packet captures. |
| CAPTURE4_SIZE | u32 | Size of the hashmap with IPv4 filters for packet captures. |
| CAPTURE6_RULES | string | Name of the hashmap with IPv6 filters for packet captures. |
| CAPTURE6_SIZE | u32 | Size of the hashmap with IPv6 filters for packet captures. |
| CIDR4_FILTER | undef, 1 | Compile in IPv4 prefilter feature in bpf_xdp. |
| CIDR4_HMAP_ELEMS | u32 | Size of the IPv4 and IPv6 hashmaps holding CIDRs for the prefilter feature. |
| CIDR4_HMAP_NAME | string | Name of the hashmap holding IPv4 CIDRs for the prefilter feature. |
| CIDR4_LMAP_ELEMS | u32 | Size of the IPv4 and IPv6 LPM tries holding CIDRs for the prefilter feature. |
| CIDR4_LMAP_NAME | string | Name of the LPM trie holding IPv4 CIDRs for the prefilter feature. |
| CIDR4_LPM_PREFILTER | undef, 1 | Use the IPv4 LPM trie for the prefilter feature. |
| CIDR6_FILTER | undef, 1 | Compile in IPv6 prefilter feature in bpf_xdp. |
| CIDR6_HMAP_NAME | string | Name of the hashmap holding IPv6 CIDRs for the prefilter feature. |
| CIDR6_LMAP_NAME | string | Name of the LPM trie holding IPv6 CIDRs for the prefilter feature. |
| CIDR6_LPM_PREFILTER | undef, 1 | Use the IPv6 LPM trie for the prefilter feature. |
| CILIUM_IFINDEX | u32 | Index of the cilium_host interface on the host. |
| CILIUM_IPV4_FRAG_MAP_MAX_ENTRIES | u32 | Size of the fragment tracking map. |
| CILIUM_LB_MAP_MAX_ENTRIES | u32 | Size of the LB maps (backends, affinity, health checks, reverse NAT, etc.). |
| CILIUM_NET_MAC | union macaddr | MAC address for the cilium_net interface. |
| CONNTRACK | undef, 1 | Compile in connection tracking. |
| CONNTRACK_ACCOUNTING | undef, 1 | Compile in conntrack counter updates. |
| CT_CLOSE_TIMEOUT | | |
| CT_CONNECTION_LIFETIME_NONTCP | | |
| CT_CONNECTION_LIFETIME_TCP | | |
| CT_MAP_ANY4 | string | Name of the hashmap to track IPv4 non-TCP communications. |
| CT_MAP_ANY6 | string | Name of the hashmap to track IPv6 non-TCP communications. |
| CT_MAP_SIZE_ANY | u32 | Size of the hashmaps to track non-TCP communications. |
| CT_MAP_SIZE_TCP | u32 | Size of the hashmaps to track TCP connections. |
| CT_MAP_TCP4 | string | Name of the hashmap to track IPv4 TCP connections. |
| CT_MAP_TCP6 | string | Name of the hashmap to track IPv6 TCP connections. |
| CT_REPORT_INTERVAL | | |
| CT_SERVICE_LIFETIME_NONTCP | | |
| CT_SERVICE_LIFETIME_TCP | | |
| CT_SYN_TIMEOUT | | |
| CTX_ACT_OK | 0, 2 | Return code to pass the packet to the stack from either tc (0) or XDP (2). |
| CUSTOM_CALLS_MAP | string | Name of the program array for custom hooks. |
| DEBUG | undef, 1 | Compile in the collection of cilium_dbg messages to userspace. |
| DIRECT_ROUTING_DEV_IFINDEX | u32 | Index of the interface used to connect nodes in the cluster. |
| DISABLE_LOOPBACK_LB | undef, 1 | Compile out SNATing for the loopback load-balancing case. |
| DROP_NOTIFY | undef, 1 | Compile in packet drop notifications. |
| EGRESS_MAP_SIZE | u32 | Size of the egress gateway map. |
| ENABLE_ARP_PASSTHROUGH | undef, 1 | Let ARP packets through to the Linux stack. |
| ENABLE_ARP_RESPONDER | undef, 1 | Compile in ARP responder at lxc devices for traffic from the containers. |
| ENABLE_BANDWIDTH_MANAGER | undef, 1 | Compile in egress bandwidth rate limiting. |
| ENABLE_CUSTOM_CALLS | undef, 1 | Compile in hooks for the custom calls. |
| ENABLE_DSR | undef, 1 | |
| ENABLE_EGRESS_GATEWAY | undef, 1 | Compile in the egress gateway redirection and SNATing. |
| ENABLE_EXTERNAL_IP | undef, 1 | |
| ENABLE_EXTRA_HOST_DEV | undef, 1 | |
| ENABLE_HEALTH_CHECK | undef, 1 | Compile in handling of health-probe traffic for L4LB. |
| ENABLE_HOST_FIREWALL | undef, 1 | Compile in the host firewall and enforce host policies. |
| ENABLE_HOST_SERVICES_FULL | undef, 1 | Compile out the per-packet service translation in bpf_lxc. |
| ENABLE_HOST_SERVICES_PEER | undef, 1 | |
| ENABLE_HOST_SERVICES_TCP | undef, 1 | |
| ENABLE_HOST_SERVICES_UDP | undef, 1 | |
| ENABLE_IDENTITY_MARK | undef, 1 | Encode the identity in the packet mark when leaving bpf_lxc for the stack. |
| ENABLE_IPSEC | undef, 1 | Compile in IPsec-handling code. |
| ENABLE_IPV4 | undef, 1 | Compile in IPv4 code paths. |
| ENABLE_IPV6 | undef, 1 | Compile in IPv6 code paths. |
| ENABLE_NAT46 | undef, 1 | Compile in NAT46 and NAT64 translations. |
| ENABLE_NODEPORT | undef, 1 | |
| ENABLE_NODEPORT_ACCELERATION | undef, 1 | Compile in the XDP-level load-balancing logic. |
| ENABLE_PREFILTER | undef, 1 | Compile in the XDP prefilter feature. |
| ENABLE_REDIRECT_FAST | undef, 1 | Compile in bypass of most of the stack using bpf_redirect_{peer,neigh}. |
| ENABLE_ROUTING | undef, 1 | |
| ENABLE_SRC_RANGE_CHECK | undef, 1 | Compile in checks of allowed source CIDRs for the service handling. |
| ENABLE_WIREGUARD | undef, 1 | Compile in WireGuard-handling code. |
| ENCAP_GENEVE | - | Unused |
| ENCAP_VXLAN | - | Unused |
| ENCAP_IFINDEX | u32 | Index of the overlay interface on the host. |
| ENCAP4_IFINDEX | u32 | Index of the cilium_ipip4 interface for IPv4 IPIP encapsulation. |
| ENCAP6_IFINDEX | u32 | Index of the cilium_ipip6 interface for IPv6 IPIP encapsulation. |
| ENCRYPT_MAP | string | Name of the map holding the IPsec SPI (keyID) currently used by the agent. |
| ENCRYPT_OR_PROXY_MAGIC | u32 | Index of the skb->cb slot holding the IPsec and proxy markers. |
| ENDPOINTS_MAP | string | Name of the hashmap with local endpoints. |
| ENDPOINTS_MAP_SIZE | u32 | Size of the hashmap with local endpoints. |
| EPHEMERAL_MIN | u16 | Minimum number for ephemeral ports on Linux, used in the NAT logic. |
| EP_POLICY_MAP | string | Name of the hash of maps to link endpoints to their cilium_policy hashmaps. |
| EVENTS_MAP | string | Name of the ring buffer to send events to the agent. |
| EVENT_SOURCE | u32 | Cilium identifier of the endpoint at the source of events sent to the agent. |
| HASH_INIT4_SEED | u32 | Seed for Maglev's IPv4 hash algorithm. |
| HASH_INIT6_SEED | u32 | Seed for Maglev's IPv6 hash algorithm. |
| HEALTH_ID | 4 | Identifier for the health entity. |
| HOST_EP_ID | u32 | Local Cilium identifier for the host endpoint. |
| HOST_ID | 1 | Identifier for the host entity. |
| HOST_IFINDEX | u32 | Index of the cilium_net interface on the host. |
| HOST_IFINDEX_MAC | union macaddr | MAC address for the cilium_host interface. |
| INIT_ID | 5 | Identifier for the init entity. |
| IPCACHE4_PREFIXES | array | IPv4 prefixes to use for the ipcache lookup when LPM maps are unsupported. |
| IPCACHE6_PREFIXES | array | IPv6 prefixes to use for the ipcache lookup when LPM maps are unsupported. |
| IPCACHE_MAP | string | Name of the ipcache hashmap. |
| IPCACHE_MAP_SIZE | u32 | Size of the ipcache hashmap. |
| IP_POOLS | | |
| IPV4_FRAG_DATAGRAMS_MAP | string | Name of the fragment tracking map. |
| IPV4_GATEWAY | u32 | Cilium internal IPv4 node address and gateway for endpoints. |
| IPV4_LOOPBACK | u32 | IPv4 address used for service loopback SNAT. |
| IPV4_MASK | u32 | IPv4 mask used on the destination IP before the lookup in the tunnel map. |
| IS_BPF_HOST | undef, 1 | The code was included by bpf_host.c. |
| IS_BPF_OVERLAY | undef, 1 | The code was included by bpf_overlay.c. |
| IS_L3_DEV | macro | Macro to check if an interface has a MAC address from its index. |
| KERNEL_HZ | u64 | HZ rate the kernel is operating in. |
| LB4_AFFINITY_MAP | string | Name of the hashmap to associate IPv4 clients to their affinity session. |
| LB4_BACKEND_MAP | string | Name of the hashmap with the addresses of IPv4 service backends. |
| LB4_HEALTH_MAP | string | Name of the hashmap with IPv4 backends for each L4LB health-probe socket. |
| LB4_MAGLEV_MAP_INNER | string | Name of the per-service one-item lookup table for IPv4 backend selection. |
| LB4_MAGLEV_MAP_OUTER | string | Name of the hashmap pointing to the one-item lookup table for IPv4 services. |
| LB4_REVERSE_NAT_MAP | string | Name of the hashmap with original IPv4 and port for the per-packet rev-NAT. |
| LB4_REVERSE_NAT_SK_MAP | string | Name of the hashmap with original IPv4 and port for the rev-NAT at socket. |
| LB4_REVERSE_NAT_SK_MAP_SIZE | u32 | Size of the hashmap with original IPv4 and port for the rev-NAT at socket. |
| LB4_SERVICES_MAP_V2 | string | Name of the hashmap to look up IPv4 services and their backends. |
| LB4_SRC_RANGE_MAP | string | Name of the LPM trie with the allowed IPv4 source CIDRs for services. |
| LB4_SRC_RANGE_MAP_SIZE | u32 | Size of the LPM trie with the allowed IPv4 source CIDRs for services. |
| LB6_AFFINITY_MAP | string | Name of the hashmap to associate IPv6 clients to their affinity session. |
| LB6_BACKEND_MAP | string | Name of the hashmap with the addresses of IPv6 service backends. |
| LB6_HEALTH_MAP | string | Name of the hashmap with IPv6 backends for each L4LB health-probe socket. |
| LB6_MAGLEV_MAP_INNER | string | Name of the per-service one-item lookup table for IPv6 backend selection. |
| LB6_MAGLEV_MAP_OUTER | string | Name of the hashmap pointing to the one-item lookup table for IPv6 services. |
| LB6_REVERSE_NAT_MAP | string | Name of the hashmap with original IPv6 and port for the per-packet rev-NAT. |
| LB6_REVERSE_NAT_SK_MAP | string | Name of the hashmap with original IPv6 and port for the rev-NAT at socket. |
| LB6_REVERSE_NAT_SK_MAP_SIZE | u32 | Size of the hashmap with original IPv6 and port for the rev-NAT at socket. |
| LB6_SERVICES_MAP_V2 | string | Name of the hashmap to look up IPv6 services and their backends. |
| LB6_SRC_RANGE_MAP | string | Name of the LPM trie with the allowed IPv6 source CIDRs for services. |
| LB6_SRC_RANGE_MAP_SIZE | u32 | Size of the LPM trie with the allowed IPv6 source CIDRs for services. |
| LB_AFFINITY_MATCH_MAP | string | Name of the hashmap holding valid (service, backend) for session affinity. |
| LB_DEBUG | undef, 1 | Compile in the collection of additional debug messages for LB. |
| LB_DST_MAC | - | Unused |
| LB_MAGLEV_LUT_SIZE | u32 | Size of the backend table per service ("M" parameter). |
| LB_REDIRECT | - | Unused |
| LOCAL_DELIVERY_METRICS | undef, 1 | Compile in counter updates for all traffic forwarded on the node. |
| LOCAL_NODE_ID | 6 | Alias for REMOTE_NODE_ID when sending HOST_ID traffic through the tunnel. |
| LXC_ID | u32 | Endpoint ID for the program's endpoint. |
| LXC_IPV4 | u32 | IPv4 address of the program's endpoint. |
| METRICS_MAP | string | Name of the map for datapath metrics. |
| METRICS_MAP_SIZE | u32 | Size of the map for datapath metrics. |
| MONITOR_AGGREGATION | 0, 1, 3 | Monitor aggregation level. 0 disables aggregation. |
| MTU | u32 | Generic MTU detected by the agent. |
| NAT46_PREFIX | union v6addr | IPv6 prefix to represent NATed IPv4 addresses. |
| NATIVE_DEV_MAC_BY_IFINDEX | macro | Macro to get the MAC address of an interface from its index. |
| NODE_MAC | | |
| NODEPORT_NEIGH4 | string | Name of the IPv4 neighbor map. |
| NODEPORT_NEIGH4_SIZE | u32 | Size of the IPv4 neighbor map. |
| NODEPORT_NEIGH6 | string | Name of the IPv6 neighbor map. |
| NODEPORT_NEIGH6_SIZE | u32 | Size of the IPv6 neighbor map. |
| NODEPORT_PORT_MAX | u32 | Maximum port to consider for NodePort requests. |
| NODEPORT_PORT_MAX_NAT | u32 | Upper bound of the port range used for SNAT when needed. |
| NODEPORT_PORT_MIN | u32 | Minimum port to consider for NodePort requests. |
| NODEPORT_PORT_MIN_NAT | u32 | Lower bound of the port range used for SNAT when needed. |
| NO_REDIRECT | undef, 1 | Skip the bpf_redirect from bpf_host to the lxc devices. |
| POD_ENDPOINT | - | Unused |
| POLICY_CALL_MAP | string | Name of the program array mapping endpoint IDs to their BPF program. |
| POLICY_MAP | string | Name of the hashmap holding the policy rules for the program's endpoint. |
| POLICY_MAP_SIZE | u32 | Size of the hashmap holding the policy rules for the program's endpoint. |
| POLICY_PROG_MAP_SIZE | u32 | Size of the program array mapping endpoint IDs to their BPF program. |
| POLICY_VERDICT_LOG_FILTER | u32 | Bitmask to indicate which policy verdicts to send to userspace. |
| POLICY_VERDICT_NOTIFY | undef, 1 | Compile in policy verdict notifications. |
| REMOTE_NODE_ID | 6 | Identifier for remote node entities. |
| REQUIRES_CAN_ACCESS | - | Unused |
| SECLABEL | u32 | Security identity for the program's endpoint. |
| SECLABEL_NB | u32 | Security identity for the program's endpoint, in network byte order. |
| SIGNAL_MAP | string | Name of the perf event map used to signal filled up maps to the agent. |
| SKIP_CALLS_MAP | undef, 1 | Compile out the program array for internal tail calls. |
| SKIP_ICMPV6_ECHO_HANDLING | undef, 1 | Compile out the ICMPv6 echo handling code. |
| SKIP_ICMPV6_HOPLIMIT_HANDLING | undef, 1 | Compile out the ICMPv6 HopLimit handling code. |
| SKIP_ICMPV6_NS_HANDLING | undef, 1 | Compile out the ICMPv6 NS handling code. |
| SKIP_POLICY_MAP | undef, 1 | Compile out the program array for tail calls between endpoints. |
| SKIP_UNDEF_LPM_LOOKUP_FN | undef, 1 | Compile out the surrogate, hashmap-based function for ipcache lookups. |
| SNAT_MAPPING_IPV4 | string | Name of the hashmap with translations for (rev-)SNAT of an IPv4 connection. |
| SNAT_MAPPING_IPV4_SIZE | u32 | Size of the hashmap with translations for (rev-)SNAT of an IPv4 connection. |
| SNAT_MAPPING_IPV6 | string | Name of the hashmap with translations for (rev-)SNAT of an IPv6 connection. |
| SNAT_MAPPING_IPV6_SIZE | u32 | Size of the hashmap with translations for (rev-)SNAT of an IPv6 connection. |
| SOCK_OPS_MAP | string | Name of the sockmap for fast socket-level redirects. |
| SOCKOPS_MAP_SIZE | u32 | Size of the sockmap for fast socket-level redirects. |
| SYS_PROCEED | 1 | Allow the system call to proceed. |
| SYS_REJECT | 0 | Deny the system call. |
| TEMPLATE_HOST_EP_ID | 0xffff | Placeholder host endpoint ID for the template file. |
| THROTTLE_MAP | string | Name of the hashmap with the information to rate-limit egress traffic. |
| THROTTLE_MAP_SIZE | u32 | Size of the hashmap with the information to rate-limit egress traffic. |
| TRACE_NOTIFY | undef, 1 | Compile in packet trace event collection (to-network, from-host, etc.). |
| TUNNEL_ENDPOINT_MAP_SIZE | u32 | Size of the hashmap mapping destination IPs to their tunnel endpoint IPs. |
| TUNNEL_MAP | string | Name of the hashmap mapping destination IPs to their tunnel endpoint IPs. |
| UNMANAGED_ID | 3 | Identifier for unmanaged entity. |
| WORLD_ID | 2 | Identifier for world entity. |