grafana / beyla

eBPF-based autoinstrumentation of web applications and network metrics
https://grafana.com/oss/beyla-ebpf/
Apache License 2.0

Eliminate unconnected trace spans #784

Closed grcevski closed 3 weeks ago

grcevski commented 3 weeks ago

When we initially wrote the kprobes request tracking, we used a socket filter and probes attached to sys_accept and sys_connect. We needed the accept and connect probes to match the PID information to the connection information, since process information isn't available (or reliable) in socket filters. Since then we've moved away from the socket filter implementation, and we only use it as a fallback in case kretprobes fail to run, e.g. because there are too many of them at a given time and we hit the kernel limit.

However, we still use the accept and connect probes to tell the direction of the request, i.e. whether it is a client or a server call. Sometimes we miss this information: either the accept4 return probe didn't run (too many of them), or it's the initial request and we didn't see the accept event. When the connection metadata is missing, our data becomes flaky, both in tests and in high-volume request situations.

Since we now use probes on tcp_sendmsg and tcp_recvmsg, telling the direction is pretty easy, and we made an earlier patch to more reliably find the client calls. Essentially, if we were ending an HTTP request and the packet that contained the HTTP 1.1 response was sent, then it must be a server; if it was received, it must be a client. So I'm extending this idea to the request packets: if we see GET/POST/... on a sent packet, then it's a client; otherwise, if it was received, it's a server. This way we can reliably get the metadata if there's none set by accept and connect.
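To make the rule concrete, here is a minimal userspace sketch of the classification, not the actual eBPF code in this PR. All names (`infer_role`, `is_http_request`, `is_http_response`, the enums) are hypothetical, the method list is illustrative, and matching the response on an `HTTP/1.` prefix is my assumption about what "contained HTTP 1.1" looks like on the wire.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; these are not Beyla's identifiers. */
enum direction { DIR_SENT, DIR_RECEIVED };   /* tcp_sendmsg vs tcp_recvmsg */
enum role { ROLE_UNKNOWN, ROLE_CLIENT, ROLE_SERVER };

/* Returns 1 if the first bytes of buf equal prefix. */
static int starts_with(const char *buf, size_t len, const char *prefix) {
    size_t n = strlen(prefix);
    return len >= n && memcmp(buf, prefix, n) == 0;
}

/* A request line begins with an HTTP method followed by a space.
 * The method list here is illustrative, not exhaustive. */
static int is_http_request(const char *buf, size_t len) {
    static const char *const methods[] = {
        "GET ", "POST ", "PUT ", "DELETE ", "HEAD ", "OPTIONS ", "PATCH ",
    };
    for (size_t i = 0; i < sizeof(methods) / sizeof(methods[0]); i++) {
        if (starts_with(buf, len, methods[i]))
            return 1;
    }
    return 0;
}

/* Assumption: a response status line begins with the protocol version. */
static int is_http_response(const char *buf, size_t len) {
    return starts_with(buf, len, "HTTP/1.");
}

/* Request sent      -> we are the client.
 * Request received  -> we are the server.
 * Response sent     -> we are the server.
 * Response received -> we are the client. */
static enum role infer_role(const char *buf, size_t len, enum direction dir) {
    if (is_http_request(buf, len))
        return dir == DIR_SENT ? ROLE_CLIENT : ROLE_SERVER;
    if (is_http_response(buf, len))
        return dir == DIR_SENT ? ROLE_SERVER : ROLE_CLIENT;
    return ROLE_UNKNOWN;
}
```

In this sketch the inference would only be consulted when no metadata was recorded by the accept and connect probes, which matches the fallback behavior described above.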

I'm going to see in a follow-up PR if I can completely eliminate the filtered_connections map and the probes on connect and accept. We may need connect for future context propagation, but I think accept can definitely go.