eunomia-bpf / bpftime

Userspace eBPF runtime for Observability, Network & General Extensions Framework
https://eunomia.dev/bpftime/
MIT License
789 stars 74 forks source link

bpftime trace: tracepipe support for bpf_printk #253

Open aneeshdamle11 opened 6 months ago

aneeshdamle11 commented 6 months ago

Description

bpf_printk helper currently outputs to the target program. The kernel eBPF behaviour outputs the result to a named tracepipe (/sys/kernel/debug/tracing/trace_pipe). This PR tries to emulate this behaviour: bpf_printk outputs to a tracepipe, this pipe is accessed by "bpftime trace" option. The trace pipe is created at "$HOME/.bpftime/tracepipe" or "(bpftime installation location) + /tracepipe" Dependency: bpftime trace needs to be run in a separate terminal before running bpftime start option.

Fixes #157

Type of change

How Has This Been Tested?

Test Configuration:

Checklist

aneeshdamle11 commented 6 months ago

I have tested this feature thus: I opened 3 terminals, in given order:

  1. bpftime load ./example/malloc/malloc
  2. bpftime trace
  3. bpftime start ./example/malloc/victim

Below are the output screenshots:

  1. bpftime load bpftime_load

  2. bpftime trace bpftime_trace

  3. bpftime start (called 3 times, screenshot of 2 calls due to screenshot space constraints) bpftime_start

Officeyutong commented 6 months ago

Thank you! But note that we also need to keep the ability to get bpf_trace_printk prints things to stdout. don't remove such feature

aneeshdamle11 commented 6 months ago

Okay, could you help with clarifying these doubts before I make the changes in the next commit?

  1. Should the continuous reading of tracepipe be carried out in the bpftime load option (like load_sock_ops in previous Linux versions ref:https://elixir.bootlin.com/linux/v4.18/source/samples/bpf/load_sock_ops.c#93)? Then, the bpf_printk would output contents to the "bpftool load" process. OR
  2. Should the feature work thus: load, start, attach work as they normally should (Here, bpf_printk outputs things to tracepipe, and is not seen in the target output). When "bpftool trace" option is called, the tracepipe contents are read and printed. It would help immensely if you could clear these doubts, thank you!
Officeyutong commented 6 months ago

Okay, could you help with clarifying these doubts before I make the changes in the next commit?

  1. Should the continuous reading of tracepipe be carried out in the bpftime load option (like load_sock_ops in previous Linux versions ref:https://elixir.bootlin.com/linux/v4.18/source/samples/bpf/load_sock_ops.c#93)? Then, the bpf_printk would output contents to the "bpftool load" process. OR
  2. Should the feature work thus: load, start, attach work as they normally should (Here, bpf_printk outputs things to tracepipe, and is not seen in the target output). When "bpftool trace" option is called, the tracepipe contents are read and printed. It would help immensely if you could clear these doubts, thank you!

I think we could use a option such as --print-to-trace-pipe, -p to control whether bpf_printk should print things to trace pipe. If the option is not set, bpf_printk should print things to stdout of the injected program. if set, it should print things to a pipe, and we can use bpftime trace to read things from it.

The option could be set through bpftime start, e.g bpftime start -p ./victim. We can use environment variable to pass the option to libbpftime-agent.so.

For further, we could also support set that option when using bpftime load. But in this way the option should be stored in shared memory, and agent should load that option from shared memory and determine where to print things printed by bpf_printk

aneeshdamle11 commented 6 months ago

The code now runs with --print-to-trace-pipe option, or -p option for bpftime start, and normally printing on the target program's stdout otherwise. "TRACEPIPE_PATH=" is the environment variable

Test results: bpftime load ./example/malloc/malloc run normally in a separate terminal. bpftime trace: bpftime_trace2 bpftime start -p ./example/malloc/victim bpftime_start2

Officeyutong commented 6 months ago

Thank you! But please make the CI happy

aneeshdamle11 commented 6 months ago

I tried poring over the code, and tried some checks. But I'm not able to point out the exact check fail. Could you please help in pointing out some areas in code that could be the cause?

Officeyutong commented 6 months ago

I tried poring over the code, and tried some checks. But I'm not able to point out the exact check fail. Could you please help in pointing out some areas in code that could be the cause?

You may try running target bpftime_runtime_tests on your local machine. In yout newest commit, this check fails

aneeshdamle11 commented 6 months ago

The check was failing at normal bpftime attach. The error was due to direct conversion from char * getenv to string. On adding a separate condition, the attach worked properly.

aneeshdamle11 commented 6 months ago

Now there's a different failing check. On checking a different PR where this check passes, the following command is run:

image

When I run it on my Ubuntu machine, it gives the following success output: Specs: Kubuntu 23.10, kernel 6.5.0-26-generic image

Since I am not able to regenerate the issue, I am at a blocker. Could you please help with any inputs on what the error might be, or what I should concentrate on correcting?

aneeshdamle11 commented 6 months ago

Also, I forgot to mention one thing: when bpftime start is used with the --print-to-trace-pipe option, it has to have bpftime trace running already, or else the bpf_printk output might block. Would this be an issue; should I document it?

Officeyutong commented 6 months ago

Also, I forgot to mention one thing: when bpftime start is used with the --print-to-trace-pipe option, it has to have bpftime trace running already, or else the bpf_printk output might block. Would this be an issue; should I document it?

It should be documented, or the better is that it could be fixed

Officeyutong commented 6 months ago

I'm investigating on that. seems that the CI script (run_example.py) was stuck at somewhere pending for IO, I guess it would be related to the flush of stdio fds (When executed by run_example.py, stdout/stderr seems necessary to be manually flushed, like https://github.com/eunomia-bpf/bpftime/blob/903c95199da424219e9a4042e718c4f252bfdeae/example/malloc/malloc.c#L72). I need more time to figure out what's the root cause, but you may be also try to flush stdout/stderr

Kailian-Jacy commented 4 months ago

Gone through the code in PR, it seems modifications has nothing to do with stdio during the test example.

I tried to run test locally, printing debugger told me the program hangs on the line https://github.com/eunomia-bpf/bpftime/blob/f8db2dc52671fc62f17960708b7c00db84eb8d54/.github/script/run_example.py#L106 And at that time these two proc-s have exited according to htop.

image

Are we sure that communicate() with exit process really works?