Use `FUNC_RET_READ` for `inet_csk_accept` #201

Closed · fearful-symmetry closed this 2 months ago

fearful-symmetry commented 2 months ago

closes https://github.com/elastic/ebpf/issues/200

This changes the `inet_csk_accept` fexit probe to use `FUNC_RET_READ` instead of reading the function arguments passed to the probe; that way we don't have to deal with the argument list changing, as it did in the 6.10 kernel.

Tested on Fedora with 6.10.3, but we seem to be missing the address fields. Not sure if this is a quirk of using the fexit probe, or if there's a bug here and `inet_csk_accept` no longer works the way we assume.

sudo ./EventsTrace --net-conn-accept -i                                                                                                                                                   
{"probes_initialized": true, "features": {"bpf_tramp": true}}
{"event_type":"NETWORK_CONNECTION_ACCEPTED","pids":{"tid":4184470,"tgid":4184470,"ppid":4145903,"pgid":4184470,"sid":4145903,"start_time_ns":1468277566951535},"net":{"transport":"TCP","family":"AF_INET","source_address":"0.0.0.0","source_port":8000,"destination_address":"0.0.0.0","destination_port":0,"network_namespace":4026531840},"comm":"python3"}
haesbaert commented 2 months ago

Not sure I grok this, but won't this break with multiple CPUs? Two CPUs entering the same function would basically store the return value in the same storage space. I guess the magic is in the ctx arithmetic; where does that come from?

fearful-symmetry commented 2 months ago

@haesbaert hmm, good point, didn't think of that. @stanek-michal do you know how this will work across CPUs?

fearful-symmetry commented 2 months ago

Hmm. Tested with bpftrace via kretfunc; the addresses show up fine:

Tracing TCP accepts. Hit Ctrl-C to end.
TIME     PID    COMM           RADDR                                   RPORT LADDR                                   LPORT BL
11:33:39 4184470 python3        192.168.1.124                           54570 192.168.1.37                            8000  0/5
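
For reference, the addresses that probe prints all live on the `struct sock` returned by `inet_csk_accept`, reachable through `struct sock_common`; a hedged sketch of reading them with CO-RE (the helper name is made up for illustration):

```c
// Sketch only: pulling the local/remote addresses and ports off the struct
// sock returned by inet_csk_accept, via the kernel's struct sock_common fields.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

static __always_inline void read_accepted_addrs(const struct sock *sk)
{
    __u32 laddr = BPF_CORE_READ(sk, __sk_common.skc_rcv_saddr); /* local IPv4 address */
    __u32 raddr = BPF_CORE_READ(sk, __sk_common.skc_daddr);     /* remote IPv4 address */
    __u16 lport = BPF_CORE_READ(sk, __sk_common.skc_num);       /* local port, host byte order */
    __u16 rport = BPF_CORE_READ(sk, __sk_common.skc_dport);     /* remote port, network byte order */

    /* ... copy the values into the event the probe emits ... */
    (void)laddr; (void)raddr; (void)lport; (void)rport;
}
```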
fearful-symmetry commented 2 months ago

Using the branch where I rewrote the probes to use an updated vmlinux, it works fine:

sudo ./EventsTrace --net-conn-accept -i                                                                                                              
{"probes_initialized": true, "features": {"bpf_tramp": true}}
{"event_type":"NETWORK_CONNECTION_ACCEPTED","pids":{"tid":4184470,"tgid":4184470,"ppid":4145903,"pgid":4184470,"sid":4145903,"start_time_ns":1468277566951535},"net":{"transport":"TCP","family":"AF_INET","source_address":"192.168.1.37","source_port":8000,"destination_address":"192.168.1.124","destination_port":54728,"network_namespace":4026531840},"comm":"python3"}

I guess `FUNC_RET_READ` is behaving in some unexpected way here?

haesbaert commented 2 months ago

So I was wrong; ctx + 0 is supposed to be where the return value is, but I can't seem to find the layout of ctx.

haesbaert commented 2 months ago

With the relocation addition it's correct, I believe.

stanek-michal commented 2 months ago

> So I was wrong; ctx + 0 is supposed to be where the return value is, but I can't seem to find the layout of ctx.

Yeah, it's eBPF's R0, which is the return value; the layout, I guess, is just the eBPF registers (like pt_regs, but for the eBPF arch).

haesbaert commented 2 months ago

> So I was wrong; ctx + 0 is supposed to be where the return value is, but I can't seem to find the layout of ctx.
>
> Yeah, it's eBPF's R0, which is the return value; the layout, I guess, is just the eBPF registers (like pt_regs, but for the eBPF arch).

Not sure that's quite right. If I understand this correctly, it does ctx + patched_return_index, which is the last register in the fexit program's signature:

int BPF_PROG(
    fexit__inet_csk_accept, struct sock *sk, int flags, int *err, bool kern, struct sock *ret)

So patched_return_index here would be 4, meaning use R5 (R1=sk; R2=flags; R3=err; R4=kern; R5=ret). The problem before was that we were doing ctx + 0, which is R1.
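
Spelled out, that amounts to roughly the following (a hand-written approximation for illustration; the real expansion is generated by libbpf's `BPF_PROG` macro in `bpf_tracing.h`):

```c
// Each named parameter in the BPF_PROG form above is just a cast of one __u64
// slot of ctx, so for the four-argument inet_csk_accept the return value is
// slot 4, while ctx + 0 is the first argument (sk).
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("fexit/inet_csk_accept")
int fexit__inet_csk_accept_expanded(__u64 *ctx)
{
    struct sock *sk  = (struct sock *)ctx[0]; /* "R1" in the discussion above */
    int flags        = (int)ctx[1];           /* "R2" */
    int *err         = (int *)ctx[2];         /* "R3" */
    bool kern        = (bool)ctx[3];          /* "R4" */
    struct sock *ret = (struct sock *)ctx[4]; /* "R5": the return value */

    (void)sk; (void)flags; (void)err; (void)kern; (void)ret;
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```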

stanek-michal commented 2 months ago

> So I was wrong; ctx + 0 is supposed to be where the return value is, but I can't seem to find the layout of ctx.
>
> Yeah, it's eBPF's R0, which is the return value; the layout, I guess, is just the eBPF registers (like pt_regs, but for the eBPF arch).
>
> Not sure that's quite right. If I understand this correctly, it does ctx + patched_return_index, which is the last register in the fexit program's signature:
>
> int BPF_PROG(
>     fexit__inet_csk_accept, struct sock *sk, int flags, int *err, bool kern, struct sock *ret)
>
> So patched_return_index here would be 4, meaning use R5 (R1=sk; R2=flags; R3=err; R4=kern; R5=ret). The problem before was that we were doing ctx + 0, which is R1.

Oh, interesting. I guess the ctx (which seems to be struct bpf_context) can be different things in different cases; for network stuff, for example, it's the packet data.
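
That matches the general rule for the standard program types (a sketch of well-known context types, nothing specific to this repo): what `ctx` points at is decided by the program type.

```c
// The single ctx pointer every BPF program gets is interpreted per program
// type: a __u64 args/retval array for fentry/fexit, packet metadata for XDP,
// the socket-buffer view for socket filters, and so on.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

SEC("fexit/inet_csk_accept")
int ctx_is_arg_array(__u64 *ctx)      { return 0; }        /* args + return value */

SEC("xdp")
int ctx_is_packet(struct xdp_md *ctx) { return XDP_PASS; } /* packet data pointers */

SEC("socket")
int ctx_is_skb(struct __sk_buff *ctx) { return 0; }        /* socket buffer view */

char LICENSE[] SEC("license") = "GPL";
```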

fearful-symmetry commented 2 months ago

Not really sure what's going on with the CI failures; I can't see anything other than this:

make: *** [Makefile:171: run-multikernel-test] Error 1
haesbaert commented 2 months ago

> So I was wrong; ctx + 0 is supposed to be where the return value is, but I can't seem to find the layout of ctx.
>
> Yeah, it's eBPF's R0, which is the return value; the layout, I guess, is just the eBPF registers (like pt_regs, but for the eBPF arch).
>
> Not sure that's quite right. If I understand this correctly, it does ctx + patched_return_index, which is the last register in the fexit program's signature:
>
> int BPF_PROG(
>     fexit__inet_csk_accept, struct sock *sk, int flags, int *err, bool kern, struct sock *ret)
>
> So patched_return_index here would be 4, meaning use R5 (R1=sk; R2=flags; R3=err; R4=kern; R5=ret). The problem before was that we were doing ctx + 0, which is R1.
>
> Oh, interesting. I guess the ctx (which seems to be struct bpf_context) can be different things in different cases; for network stuff, for example, it's the packet data.

I guess so as well; I'll try to read the BPF code later and see if I can confirm.

fearful-symmetry commented 2 months ago

Are the tests just... timing out?

2024-08-27T17:42:18.7669228Z [            ] stdout: timed out waiting for EventsTrace to get ready, dumped stderr above
2024-08-27T17:42:18.7669408Z [            ] stdout: ===== STACKTRACE FOR FAILED TEST =====
2024-08-27T17:42:18.7669543Z [            ] stdout: goroutine 1 [running]:
2024-08-27T17:42:18.7669709Z [            ] stdout: main.TestFail({0x400088fd98, 0x1, 0x1})
2024-08-27T17:42:18.7670123Z [            ] stdout:     /home/runner/work/ebpf/ebpf/testing/testrunner/utils.go:292 +0xe8
2024-08-27T17:42:18.7670417Z [            ] stdout: main.(*EventsTraceInstance).Start(0x4000012080, {0x13d168, 0x400008c240})
2024-08-27T17:42:18.7670761Z [            ] stdout:     /home/runner/work/ebpf/ebpf/testing/testrunner/eventstrace.go:87 +0x360
2024-08-27T17:42:18.7670983Z [            ] stdout: main.RunEventsTest(0x11dd40, {0x400009df38, 0x1, 0x1})
2024-08-27T17:42:18.7671283Z [            ] stdout:     /home/runner/work/ebpf/ebpf/testing/testrunner/utils.go:350 +0x190
2024-08-27T17:42:18.7671395Z [            ] stdout: main.main()
2024-08-27T17:42:18.7671684Z [            ] stdout:     /home/runner/work/ebpf/ebpf/testing/testrunner/main.go:30 +0x2dc