Open yunwei37 opened 6 months ago
Replacing Frida is already possible through the attach manager (by subclassing).
runtime
The attach manager is relatively independent of the runtime: it doesn't invoke any runtime API. It only provides the uprobe/uretprobe implementation to the runtime (bpf_attach_ctx, for now). Besides, the attach manager is the only part of bpftime that has a strong dependency on Frida.
So if we split the attach manager into a separate target, we might gain:
What about the attach_ctx class?
Is it possible to implement all uprobe/syscall-related code outside of the runtime target and the runtime dir? If we can do that, is it a better solution?
Also, there is some attach-related code in the syscall transformer and syscall-server.so.
The implementation of syscall trace is not compatible with the attach manager (or let's call it the uprobe attach manager).
So I think we might split the syscall trace attach implementation into another target; let's just call it the syscall trace attach manager. Syscall trace callbacks should be registered to this target, and this target should provide a dispatch entry (the function that the text transformer would call when a syscall is captured). We should also rename the current attach manager to uprobe attach manager, since it's only responsible for uprobes.
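To make the proposal a bit more concrete, here is a minimal C++ sketch of what such a syscall trace attach manager could look like. Every name in it (`syscall_trace_attach_manager`, `syscall_args`, `register_callback`, `dispatch`) is hypothetical, and the convention that a callback returns `true` to let the original syscall proceed is an assumption, not something bpftime defines.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: none of these names exist in bpftime today.
struct syscall_args {
    int64_t syscall_id;
    int64_t args[6];
};

// Return true to let the original syscall proceed (an assumed convention).
using syscall_callback = std::function<bool(const syscall_args &)>;

class syscall_trace_attach_manager {
public:
    // Register a callback for one syscall id; returns an attach id.
    int register_callback(int64_t syscall_id, syscall_callback cb) {
        int id = next_id_++;
        callbacks_[syscall_id].push_back(entry{id, std::move(cb)});
        return id;
    }

    // Remove a callback by attach id; returns false if the id is unknown.
    bool unregister_callback(int id) {
        for (auto &kv : callbacks_) {
            auto &vec = kv.second;
            for (auto it = vec.begin(); it != vec.end(); ++it) {
                if (it->id == id) {
                    vec.erase(it);
                    return true;
                }
            }
        }
        return false;
    }

    // Dispatch entry: what the text transformer would call for a captured
    // syscall. Runs every callback registered for that syscall id.
    bool dispatch(const syscall_args &args) const {
        auto it = callbacks_.find(args.syscall_id);
        if (it == callbacks_.end())
            return true;
        bool proceed = true;
        for (const auto &e : it->second)
            proceed = e.cb(args) && proceed;
        return proceed;
    }

private:
    struct entry {
        int id;
        syscall_callback cb;
    };
    std::unordered_map<int64_t, std::vector<entry>> callbacks_;
    int next_id_ = 1;
};
```

The target has no dependency on Frida or on the runtime: the text transformer only needs to know about `dispatch`, and whoever owns the eBPF programs only needs `register_callback`.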
So maybe we can create two targets?
- /attach/uprobe
- /attach/syscalls
attach would be a new dir under the project root.
And the user can specify which ones to compile with the runtime.
Sounds good. More attach implementations could be added in the future.
Shall we come up with a better name for the uprobe attach manager? Maybe go back to attach_ctx, or use attach_target or attach_events?
Some background:
- The eBPF runtime can be embedded in shared memory, or compiled and linked with other applications as an extension. The runtime is responsible for loading and managing the eBPF programs in the process.
- There can be multiple eBPF attach methods at the same time; for example, uprobe and syscall tracepoints should be able to work together.
- One eBPF program can be attached to multiple targets or events, and one event can have multiple eBPF programs attached to it.
So maybe we can have a design like this:
- One attach_manager for managing all attach_ctx or attach_targets. The attach manager can hold unique_ptr ownership of the runtime.
- attach_ctx has a base class and some subclasses, for example uprobe_attach_ctx and syscalls_attach_ctx. An attach_ctx should be able to access the shared memory through the runtime API, and also configure which kinds of perf events or attach targets are mocked in userspace, while the others are passed to the kernel.
All of this code lives in the attach dir in the project root.
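A rough C++ sketch of the class layout described in this design, under stated assumptions: the `handles_in_userspace` method, the `EVENT_*` constants and `runtime_stub` are invented here for illustration, and only the names `attach_manager`, `attach_ctx`, `uprobe_attach_ctx` and `syscalls_attach_ctx` come from the discussion.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; the event type numbers and method names are invented.
constexpr int EVENT_UPROBE = 1;
constexpr int EVENT_SYSCALL_TRACEPOINT = 2;
constexpr int EVENT_KPROBE = 3;

// Placeholder for the runtime that loads and manages eBPF programs.
struct runtime_stub {};

class attach_ctx {
public:
    virtual ~attach_ctx() = default;
    virtual std::string name() const = 0;
    // True if this context mocks the event type in userspace; event types
    // that nobody claims would be passed through to the kernel.
    virtual bool handles_in_userspace(int event_type) const = 0;
};

class uprobe_attach_ctx : public attach_ctx {
public:
    std::string name() const override { return "uprobe"; }
    bool handles_in_userspace(int event_type) const override {
        return event_type == EVENT_UPROBE;
    }
};

class syscalls_attach_ctx : public attach_ctx {
public:
    std::string name() const override { return "syscalls"; }
    bool handles_in_userspace(int event_type) const override {
        return event_type == EVENT_SYSCALL_TRACEPOINT;
    }
};

class attach_manager {
public:
    explicit attach_manager(std::unique_ptr<runtime_stub> rt)
        : runtime_(std::move(rt)) {}

    void add(std::unique_ptr<attach_ctx> ctx) { ctxs_.push_back(std::move(ctx)); }

    // Return the context that claims the event type, or nullptr if the
    // event should go to the kernel.
    const attach_ctx *find_handler(int event_type) const {
        for (const auto &ctx : ctxs_)
            if (ctx->handles_in_userspace(event_type))
                return ctx.get();
        return nullptr;
    }

private:
    std::unique_ptr<runtime_stub> runtime_;
    std::vector<std::unique_ptr<attach_ctx>> ctxs_;
};
```

With this shape, an agent built with only uprobe support would simply never call `add` with a `syscalls_attach_ctx`, and any event type without a handler (here `EVENT_KPROBE`) falls through to the kernel.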
Another problem: how can the attach_targets configure which kinds of perf events or attach targets are mocked in userspace while the others are passed to the kernel, if we don't want to hardcode it in the -server.so like we do now?
This also sounds good. But the `uprobe attach manager` I mentioned above is only some classes that provide an API to `register a callback at a certain function`. It has nothing to do with any eBPF stuff. Maybe a name like `uprobe_attach_impl` is more suitable for this part of the code?
And the attach manager you mentioned seems to be something that is responsible for "resolving perf events (or other equivalents), and allowing a certain event to call a certain eBPF program". Did I misunderstand what you said? If not, I think this thing is more suitable for the name `attach manager`, and should be split into individual targets.
But from another perspective, I still think `uprobe_attach_impl` should be split into an individual target. It has little dependency on other parts of bpftime. Splitting it into an individual target would make the code base clearer, and would be more convenient for other users who only want to use our uprobe implementation.
Yes, `uprobe_attach_impl` should be split into an individual target.
Can you describe the full dependencies and class inheritance of all the modules/cmake targets you think may be correct?
I think it could be something like:
- attach_manager has ownership of all the attach_impl instances and of the runtime. It's built into an 'object' target in cmake, and has the runtime as a dependency.
- The attach_impl base class is in a header. `uprobe_attach_impl` will inherit from it and also be built into a standalone target.
- The agent.so will depend on these `attach_impl` targets.
What does `ownership of runtime` mean? Is it something that holds ownership of all compiled eBPF programs? (Ownership of the maps remains in the shm and isn't held by anything, I think.)
And also, can we add a new attach event at load time, so it's not statically compiled?
For example, we have three kinds of agent.so:
- one is compiled with uprobe and syscall tracepoints enabled,
- the second is compiled with only uprobe supported,
- the third is used statically in the application, like the nginx module or XDP in DPDK.
We can let the agents or the user configure which syscall functionality should be mocked in the syscall server.so. For example, allow some perf event syscalls and bpf link types to be mocked or answered in the `syscall server.so`, while others are not.
The config can be stored in the shared memory, so new attach targets can register it.
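One way such a config could look, as a minimal sketch only: a fixed-size table with no pointers, so that it could in principle be placed in a shared memory segment. The `mock_config` type, its method names and the limit of 64 type ids are all assumptions made for this example; the sketch keeps the table in ordinary process memory and shows no locking.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch. The table is a fixed-size struct without pointers so
// that it could be placed in a shared memory segment; here it simply lives
// in ordinary process memory, and no locking is shown.
struct mock_config {
    static constexpr std::size_t max_types = 64;

    // 1 = handled (mocked) in userspace, 0 = forwarded to the kernel.
    std::array<uint8_t, max_types> perf_event_types{};
    std::array<uint8_t, max_types> link_types{};

    // Called by an attach target when it is loaded.
    bool register_perf_event_type(std::size_t type) {
        return set(perf_event_types, type);
    }
    bool register_link_type(std::size_t type) { return set(link_types, type); }

    // Called by the syscall server before deciding how to answer a syscall.
    bool mocks_perf_event_type(std::size_t type) const {
        return get(perf_event_types, type);
    }
    bool mocks_link_type(std::size_t type) const { return get(link_types, type); }

private:
    static bool set(std::array<uint8_t, max_types> &table, std::size_t type) {
        if (type >= max_types)
            return false;  // out of range: nothing is registered
        table[type] = 1;
        return true;
    }
    static bool get(const std::array<uint8_t, max_types> &table,
                    std::size_t type) {
        return type < max_types && table[type] != 0;
    }
};
```

The syscall server would then consult `mocks_perf_event_type` instead of a hardcoded check, and anything not registered would be forwarded to the kernel.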
This sounds good
Ownership of the runtime means having a unique_ptr in the code that is responsible for managing the opening and closing of the maps, and for compiling and loading the progs.
Maps and programs are held in the shared memory, and may live longer than the agent or the syscall server. So maybe their ownership should not be tied to the bpftime runtime?
"Ownership" here means "stuff in the heap memory of a certain process that is required to operate shared memory". For example, the class bpftime_shm itself.
- `/attach/uprobe/uprobe_attach_impl`
- `/attach/syscall/text_segment_transformer`. Provides a function to register a callback for syscall invocations.
- `/attach/syscall/syscall_attach_impl`
- `/attach/base_attach_impl`
- `event source`, which are things like `load and compile an ebpf program`, `attach a uretprobe to function XYZ which will call ebpf program ABC`, `create an ebpf hash map`, `detach uretprobe with id XXX`. With this we won't be limited to collecting eBPF operations through libbpf. We can even use an nginx module as an event source. Should be at `/attach/event_source/base_event_source`
- `/attach/event_source/syscall_server_event_source`
- `attach_event_source`, and drive attach impls to do the underlying things. The attach manager should hold ownership of the compiled eBPF programs and the corresponding virtual machines. It should also hold the attach implementations. Should be in `/attach/attach_manager`
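The flow from event source to attach impl could be sketched roughly as below. This is only an illustration under assumptions: the `operation` struct, `poll`, `create_attach`, `process`, `scripted_event_source` and `recording_attach_impl` are invented for the example, and program loading and the virtual machines the attach manager would own are left out to keep it short.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; these types only mirror the names used above.
enum class op_kind { attach, detach };

struct operation {
    op_kind kind;
    int id;              // handler id chosen by the event source
    int prog_id;         // program to call (attach only)
    std::string impl;    // attach impl to use (attach only)
    std::string target;  // e.g. a function name (attach only)
};

class base_event_source {
public:
    virtual ~base_event_source() = default;
    // Hand over the operations collected since the last call.
    virtual std::vector<operation> poll() = 0;
};

class base_attach_impl {
public:
    virtual ~base_attach_impl() = default;
    virtual int create_attach(const std::string &target, int prog_id) = 0;
    virtual bool detach(int native_id) = 0;
};

class attach_manager {
public:
    void register_impl(const std::string &name, std::unique_ptr<base_attach_impl> impl) {
        impls_[name] = std::move(impl);
    }

    // Drain the event source and drive the attach impls. Returns how many
    // operations were applied successfully.
    int process(base_event_source &src) {
        int applied = 0;
        for (const operation &op : src.poll())
            if (apply(op))
                applied++;
        return applied;
    }

private:
    bool apply(const operation &op) {
        if (op.kind == op_kind::attach) {
            auto impl = impls_.find(op.impl);
            if (impl == impls_.end())
                return false;
            int native_id = impl->second->create_attach(op.target, op.prog_id);
            attaches_[op.id] = std::make_pair(impl->second.get(), native_id);
            return true;
        }
        auto it = attaches_.find(op.id);  // op_kind::detach
        if (it == attaches_.end())
            return false;
        bool ok = it->second.first->detach(it->second.second);
        attaches_.erase(it);
        return ok;
    }

    std::map<std::string, std::unique_ptr<base_attach_impl>> impls_;
    std::map<int, std::pair<base_attach_impl *, int>> attaches_;
};

// Example event source that replays a fixed list of operations.
class scripted_event_source : public base_event_source {
public:
    std::vector<operation> pending;
    std::vector<operation> poll() override {
        std::vector<operation> out;
        out.swap(pending);
        return out;
    }
};

// Example attach impl that only records which native ids are active.
class recording_attach_impl : public base_attach_impl {
public:
    int create_attach(const std::string &, int) override {
        active.insert(next_id);
        return next_id++;
    }
    bool detach(int native_id) override { return active.erase(native_id) == 1; }
    std::set<int> active;
    int next_id = 1;
};
```

Because the manager only sees `base_event_source`, a syscall-server-backed source and something like an nginx module source would be interchangeable from its point of view.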
Would it be better if you first came up with a small example of how to use the new API to implement a new eBPF attach type (e.g. nginx module eBPF)?
OK, I'll take it
Refer to https://github.com/eunomia-bpf/bpftime-new-api-poc for a detailed POC.
Other notes:
- `prog_handler`: The result of an instantiation of a prog handler is an instance of `bpftime_prog`, which includes a ready-to-execute eBPF virtual machine.
- `perf event handler`: The result of an instantiation of a perf event handler is an instance of attach_private_data. It's an abstract base class used to pass attach data to `*_attach_impl`, describing attach arguments (such as the function offset of a uprobe, the syscall id of a syscall trace, or a custom string for user-defined attach impls).
- `link handler`: Usually the instantiation of a link handler first leads to the instantiation of its related prog handler. After that, the process is determined by the user-registered instantiating handlers of the different link types. For the built-in perf event link type, it first instantiates the related perf event handler (to get the attach private data), then calls the `*_attach_impl`s to register a native attach entry, and records the mapping from handler id to native attach id. For the built-in uprobe_multi link type, no perf event is required. When instantiating a link of uprobe multi type, it directly creates `attach_private_data`s based on the arguments stored in the link (note that creating a uprobe multi link requires the registration of the uprobe/uretprobe attach impl). When instantiating a user-defined link type, it directly calls the user-registered instantiating function.
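As an illustration of the perf event link path described in these notes, here is a small C++ sketch. It is not the POC's code: `attach_private_data` is named above, but its `describe` method and every other type here (`uprobe_private_data`, `stub_attach_impl`, `handler_instantiator`) are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical sketch. attach_private_data is named in the notes above, but
// its members and every other type here are invented for illustration.
struct attach_private_data {
    virtual ~attach_private_data() = default;
    virtual std::string describe() const = 0;
};

struct uprobe_private_data : attach_private_data {
    uint64_t function_offset;
    explicit uprobe_private_data(uint64_t off) : function_offset(off) {}
    std::string describe() const override {
        return "uprobe@" + std::to_string(function_offset);
    }
};

// Stand-in for a *_attach_impl: hands out native attach ids.
struct stub_attach_impl {
    int next_native_id = 100;
    std::map<int, std::string> attached;
    int attach(const attach_private_data &data) {
        attached[next_native_id] = data.describe();
        return next_native_id++;
    }
};

struct perf_event_handler { uint64_t function_offset; };
struct link_handler { int prog_id; int perf_event_id; };

class handler_instantiator {
public:
    std::map<int, perf_event_handler> perf_events;
    std::map<int, link_handler> links;

    // Instantiate a perf event link: the prog first, then the perf event
    // (to get the private data), then the native attach, and finally
    // record the handler id -> native attach id mapping.
    bool instantiate_link(int link_id, stub_attach_impl &impl) {
        auto link = links.find(link_id);
        if (link == links.end())
            return false;
        instantiated_progs_.insert(link->second.prog_id);
        auto pe = perf_events.find(link->second.perf_event_id);
        if (pe == perf_events.end())
            return false;
        std::unique_ptr<attach_private_data> data =
            std::make_unique<uprobe_private_data>(pe->second.function_offset);
        handler_to_native_[link_id] = impl.attach(*data);
        return true;
    }

    bool prog_instantiated(int prog_id) const {
        return instantiated_progs_.count(prog_id) != 0;
    }

    // Native attach id for a link handler id, or -1 if not instantiated.
    int native_id_of(int link_id) const {
        auto it = handler_to_native_.find(link_id);
        return it == handler_to_native_.end() ? -1 : it->second;
    }

private:
    std::set<int> instantiated_progs_;
    std::map<int, int> handler_to_native_;
};
```

A syscall trace or user-defined attach type would add its own `attach_private_data` subclass (carrying a syscall id or a custom string) without changing the instantiation flow.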
Issue Summary:
The current `bpftime` architecture intermingles code for different eBPF program types and backends, such as uprobe and syscall tracing, within the syscall-server.so. For example:
- `bpftime load` time with the `syscall-server.so`. For instance: https://github.com/eunomia-bpf/bpftime/blob/d850554af7418f66991fff9be2363df22e2d450b/runtime/syscall-server/syscall_context.cpp#L459-L470 . It may be better to have some API that allows the backend (attach target) to control what kinds of perf events should be mocked in userspace, and how to attach certain eBPF progs to these events (e.g. XDP in userspace DPDK, nginx modules, plugins).
This design limits the addition of new attach backends and eBPF program types, and also complicates the codebase. A refactor is proposed to address these issues and set the stage for future enhancements.
Proposed Changes:
Decouple Syscall Server Responsibilities:
- `syscall-server.so` and the daemon to only handle recording syscall traces and states, such as the creation of progs, maps, and links within the kernel.
Split Attach Context:
- `attach context` class into two distinct targets, `runtime` and `attach_events`
- `runtime` should offer two types of APIs:
- `attach_events` class to implement their own event sources.
Temporary Feature Development Freeze:
Future-Proofing and Extensibility:
- `runtime` statically in other applications, and expanding to new domains such as GPU tracing, XDP and https://github.com/eunomia-bpf/bpftime/issues/158
Rationale:
This refactor addresses fundamental design flaws that were not evident in the initial conception of `bpftime`. It aims to simplify the current codebase and prepare for more stable and scalable expansion. Although this entails a significant overhaul, it is manageable given the current code volume.
Next Steps:
Call for Input:
We welcome input from the community on this proposed refactor. Any insights or suggestions, especially regarding the decoupling of components and API design, would be highly appreciated.