pixie-io / pixie

Instant Kubernetes-Native Application Observability
https://px.dev
Apache License 2.0

Populate local IP address from BPF space #1829

Open oazizi000 opened 9 months ago

oazizi000 commented 9 months ago

Is your feature request related to a problem? Please describe.
With #1808, we collect local IP addresses from eBPF space when doing server-side tracing. For consistency, we should investigate collecting the local IP address with client-side tracing as well. This would make our tables more consistent and easier for users to interpret.

Describe the solution you'd like
Collect and populate the local IP address in every table that contains a local IP address column.

Describe alternatives you've considered
Alternative solutions could populate it from user space, but BPF-based collection is preferred for performance and reliability reasons.

Additional context
For background, see #1807, #1808, and #1809.

benkilimnik commented 9 months ago

I believe we can merge http_events with tcp_stats_events using the tcp stats connector.

Ran a quick test in PxL and this seems to do the trick.

import px

# Load http events data
df = px.DataFrame(table='http_events', start_time='-300s')
df = df['time_', 'local_port', 'local_addr', 'remote_port', 'remote_addr', 'trace_role']
df = df[df.trace_role == 1]
df = df[df.local_port == -1]

# Load tcp stats data
tcp_stats_df = px.DataFrame(table="tcp_stats_events", start_time='-300s', select=["time_","local_addr", "local_port", "remote_addr", "remote_port"])

# Group by local_addr, remote_addr and remote_port (with an empty agg) to get the distinct local_addr values per remote_addr:port tuple. This drops the local_port column, which presumably contains several local ports.
tcp_stats_df = tcp_stats_df.groupby(['local_addr', 'remote_addr', 'remote_port']).agg()

# merge on connection tuple
# Note: there may be duplicates, because Pods configured with hostNetwork: true use their host node's IP address, so several such Pods can share the same local IP address as their node.
merged_df = df.merge(tcp_stats_df, how='inner', left_on=["remote_addr", "remote_port"], right_on=["remote_addr", "remote_port"], suffixes=['_http', '_tcp'])

px.display(df, "http events")
px.display(tcp_stats_df, "tcp")
px.display(merged_df, "http merged with tcp")

For this to work, the TCP stats source connector needs to be enabled in Stirling. Perhaps it would be worth adding it to kProd?
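
To make the join semantics and the duplicate caveat in the script above easier to see, here is a minimal plain-Python sketch of the same idea: collect the distinct local addresses per remote endpoint, then inner-join on (remote_addr, remote_port). It does not use PxL, and the rows, addresses, and helper names (`local_addrs_by_remote`, `inner_join`) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; all addresses and ports are invented.
# Client-side HTTP rows where the local endpoint is not known.
http_rows = [
    {"remote_addr": "10.0.0.5", "remote_port": 8080, "local_port": -1},
    {"remote_addr": "10.0.0.9", "remote_port": 443, "local_port": -1},
]
# TCP stats rows that do carry the local address.
tcp_rows = [
    {"local_addr": "10.0.1.2", "local_port": 40112,
     "remote_addr": "10.0.0.5", "remote_port": 8080},
    {"local_addr": "10.0.1.2", "local_port": 40118,
     "remote_addr": "10.0.0.5", "remote_port": 8080},
    {"local_addr": "10.0.1.7", "local_port": 51000,
     "remote_addr": "10.0.0.5", "remote_port": 8080},
]


def local_addrs_by_remote(rows):
    """Map (remote_addr, remote_port) to the set of distinct local_addr values."""
    index = defaultdict(set)
    for row in rows:
        index[(row["remote_addr"], row["remote_port"])].add(row["local_addr"])
    return index


def inner_join(http, index):
    """Inner join on the remote endpoint: one output row per candidate local_addr."""
    merged = []
    for row in http:
        key = (row["remote_addr"], row["remote_port"])
        for addr in sorted(index.get(key, ())):
            merged.append({**row, "local_addr_tcp": addr})
    return merged


index = local_addrs_by_remote(tcp_rows)
merged = inner_join(http_rows, index)

# Remote endpoints with more than one candidate local address are ambiguous.
ambiguous = {key for key, addrs in index.items() if len(addrs) > 1}

for row in merged:
    print(row)
print("ambiguous remote endpoints:", ambiguous)
```

In this made-up data, the first HTTP row matches two candidate local addresses, so it appears twice in the merged output, and the second row has no TCP stats match and is dropped by the inner join. Because the join key contains only the remote endpoint, any extra rows would still need to be disambiguated by some other signal.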