cloudflare / ebpf_exporter

Prometheus exporter for custom eBPF metrics
MIT License
2.08k stars 225 forks source link

A suspicious multi-thread race condition of cfs-throttling-trace.bpf.c #442

Open labilezhu opened 1 week ago

labilezhu commented 1 week ago

In examples/cfs-throttling-trace.bpf.c we use cgroup id as the key of traced_cgroups map :

SEC("usdt/./tracing/demos/cfs-throttling/demo:ebpf_exporter:cfs_set_parent_span")
int BPF_USDT(cfs_set_parent_span, u64 trace_id_hi, u64 trace_id_lo, u64 span_id)
{
    u32 cgroup = bpf_get_current_cgroup_id();
    struct span_parent_t parent = { .trace_id_hi = trace_id_hi, .trace_id_lo = trace_id_lo, .span_id = span_id };

    bpf_map_update_elem(&traced_cgroups, &cgroup, &parent, BPF_ANY);

    return 0;
}

When unthrottle, we use cgroup id as key to get from the map :

SEC("fentry/unthrottle_cfs_rq")
int BPF_PROG(unthrottle_cfs_rq, struct cfs_rq *cfs_rq)
{
    u32 cgroup = cfs_rq->tg->css.cgroup->kn->id;
    u64 throttled_ns = cfs_rq->rq->clock - cfs_rq->throttled_clock;
    struct span_parent_t *parent = bpf_map_lookup_elem(&traced_cgroups, &cgroup);

But in a multi-threaded environment. It seems not aways true. May be thread id and cgroup id be composited as key?

bobrik commented 6 days ago

I'm not following. Could you show how you would change the code or describe the scenario where it's broken?

labilezhu commented 4 days ago

I am not a kernel expert. As far as I know:

The ideal world is: cfs-throttling-trace drawio

The real world is: cfs-throttling-trace-realworld drawio