aquasecurity / tracee

Linux Runtime Security and Forensics using eBPF
https://aquasecurity.github.io/tracee/latest
Apache License 2.0

Alert when a program tries to access one of Tracee's maps #617

Open yanivagman opened 3 years ago

yanivagman commented 3 years ago

As a security solution, Tracee should protect its own assets from being tampered with by another program. A program with enough privileges (e.g. CAP_SYS_ADMIN or CAP_BPF) can access and even alter any of Tracee's maps. Although this kind of attack assumes an attacker who already has dangerous privileges, we can probably mitigate it by using one of the following LSM hooks (possibly both):

  1. security_bpf (https://elixir.bootlin.com/linux/v4.18.20/source/security/security.c#L1752)
  2. security_bpf_map (https://elixir.bootlin.com/linux/v4.18.20/source/security/security.c#L1756)

By attaching to security_bpf_map, we can monitor whenever a non-tracee program tries to access one of our maps and alert when such access happens.

itaysk commented 3 years ago

It would be better to add a new event to Tracee-eBPF for monitoring map access, which emits the raw fd/names as event arguments. This could be useful for other purposes as well. Then we could write a rule in Tracee-Rules to detect which map was touched by which process.

TBD: how tracee-rules knows the names/fds of the maps that Tracee-eBPF created? the PID is written to a file in the output directory. the names of the maps could be written to another file as well.

As a follow-up we could do the same for bpf programs.

itaysk commented 3 years ago

TBD: how tracee-rules knows the names/fds of the maps that Tracee-eBPF created? the PID is written to a file in the output directory. the names of the maps could be written to another file as well.

Actually, it's not so trivial how tracee-rules can load this information and pass it to a signature cleanly. Probably a better way is for the signature to read the file for itself. For this it would have to be written in golang.

rafaeldtinoco commented 3 years ago

OBS: I'm not yet doing a PR for this, just keeping a branch in my github account so you can follow.

Item 1 and 2

In kernel there are many situations where the real internal function (or syscall, or ioctl, or file_ops handling) is set by an attribute argument, just like:

    switch (cmd) {
    case BPF_MAP_CREATE:
        err = map_create(&attr);
        break;
    case BPF_MAP_LOOKUP_ELEM:
        err = map_lookup_elem(&attr);
        break;
        ...

What is (or will be) the project's preference for these cases ?

We can either have a cmd number argument OR we could have different events for each of the ebpf syscall sub-commands (which would have to be extended if new sub-commands are added). For now, I'm extending events with a single event ("security_bpf") with 2 arguments ("map_cmd" and "map_name"):

$ sudo ./dist/tracee-ebpf -l
security_bpf           [lsm_hooks]                              (int map_cmd, const char* map_name)

$ sudo ./dist/tracee-ebpf --debug --trace event=security_bpf --trace comm=tcpconnect
found bpf object file at: /home/rafaeldtinoco/work/sources/upstream/tracee/tracee-ebpf/dist/tracee.bpf.5_8_0-43-generic.v0_5_1-22-g83a869d.o
TIME(s)        UID    COMM             PID     TID     RET              EVENT                ARGS
215751.135530  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 5, map_name: 
215751.152500  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152557  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152579  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152600  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152616  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152700  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: 
215751.152719  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 5, map_name: 
215751.153032  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: 
215751.153117  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 5, map_name: 
215751.153285  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: sockets
215751.153361  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: ipv4_count
215751.153433  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: ipv6_count
215751.153583  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: events
215751.153601  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: tcpconne.rodata
215751.153620  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 2, map_name: 
215751.153633  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 22, map_name: 
215751.153645  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: tcpconne.bss

This gives us events for all the ebpf syscall sub-commands (like PROG_LOAD, OBJ_PIN, ...) in a single tracee event, not only the ones touching the map_names... that could be useful for reuse, as Itay suggested. The downside is that we have to interpret the ebpf syscall sub-command type based on the map_cmd variable (with logged data).
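To illustrate what interpreting map_cmd would involve, here is a minimal Python sketch of a lookup table. The numbering follows `enum bpf_cmd` in the kernel's include/uapi/linux/bpf.h as I understand it (only the first entries are listed, and newer kernels append more); the helper name `bpf_cmd_name` is made up for this example and is not part of Tracee.

```python
# First entries of `enum bpf_cmd` (include/uapi/linux/bpf.h); the list index
# is the numeric sub-command. Newer kernels append further commands.
BPF_CMD_NAMES = [
    "BPF_MAP_CREATE",                  # 0
    "BPF_MAP_LOOKUP_ELEM",             # 1
    "BPF_MAP_UPDATE_ELEM",             # 2
    "BPF_MAP_DELETE_ELEM",             # 3
    "BPF_MAP_GET_NEXT_KEY",            # 4
    "BPF_PROG_LOAD",                   # 5
    "BPF_OBJ_PIN",                     # 6
    "BPF_OBJ_GET",                     # 7
    "BPF_PROG_ATTACH",                 # 8
    "BPF_PROG_DETACH",                 # 9
    "BPF_PROG_TEST_RUN",               # 10
    "BPF_PROG_GET_NEXT_ID",            # 11
    "BPF_MAP_GET_NEXT_ID",             # 12
    "BPF_PROG_GET_FD_BY_ID",           # 13
    "BPF_MAP_GET_FD_BY_ID",            # 14
    "BPF_OBJ_GET_INFO_BY_FD",          # 15
    "BPF_PROG_QUERY",                  # 16
    "BPF_RAW_TRACEPOINT_OPEN",         # 17
    "BPF_BTF_LOAD",                    # 18
    "BPF_BTF_GET_FD_BY_ID",            # 19
    "BPF_TASK_FD_QUERY",               # 20
    "BPF_MAP_LOOKUP_AND_DELETE_ELEM",  # 21
    "BPF_MAP_FREEZE",                  # 22
    "BPF_BTF_GET_NEXT_ID",             # 23
]


def bpf_cmd_name(cmd: int) -> str:
    """Hypothetical helper: translate a raw map_cmd value to its enum name."""
    if 0 <= cmd < len(BPF_CMD_NAMES):
        return BPF_CMD_NAMES[cmd]
    return f"UNKNOWN({cmd})"


# The distinct map_cmd values seen in the tcpconnect trace above.
for cmd in (5, 18, 0, 2, 22):
    print(cmd, bpf_cmd_name(cmd))
```

Read this way, the trace above is mostly program loads (5), BTF loads (18), map creation (0), element updates (2) and one map freeze (22).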

I think the idea is to keep the ebpf core as generic as it can be, right ?

Personal Note: I liked the probe context logic with event ids, how events and arguments can be expanded through the userland golang code... and the params_{types,names}_map logic. The perf event buffer serialization within save_to_submit_buf() logic also caught my attention (with the type/tag schema). I think I have the big picture now, will try to move on with Items 3 and 4 as time allows (might be a busy week on my side as well, FYI).

Item 3

Item 4

yanivagman commented 3 years ago

In kernel there are many situations where the real internal function (or syscall, or ioctl, or file_ops handling) is set by an attribute argument, just like:

  switch (cmd) {
  case BPF_MAP_CREATE:
      err = map_create(&attr);
      break;
  case BPF_MAP_LOOKUP_ELEM:
      err = map_lookup_elem(&attr);
      break;
        ...

What is (or will be) the project's preference for these cases ?

From what I know, the attribute argument is given as part of the bpf syscall. All of the above cases should go through the lsm hooks when updating a map. Is there a specific case where that is not the case?

We can either have a cmd number argument OR we could have different events for each of the ebpf syscall sub-commands (which would have to be extended if new sub-commands are added). For now, I'm extending events with a single event ("security_bpf") with 2 arguments ("map_cmd" and "map_name"):

Specifically for map there exists security_bpf_map LSM hook. Have you tried to use it instead of security_bpf?

$ sudo ./dist/tracee-ebpf -l
security_bpf           [lsm_hooks]                              (int map_cmd, const char* map_name)

$ sudo ./dist/tracee-ebpf --debug --trace event=security_bpf --trace comm=tcpconnect
found bpf object file at: /home/rafaeldtinoco/work/sources/upstream/tracee/tracee-ebpf/dist/tracee.bpf.5_8_0-43-generic.v0_5_1-22-g83a869d.o
TIME(s)        UID    COMM             PID     TID     RET              EVENT                ARGS
215751.135530  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 5, map_name: 
215751.152500  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152557  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152579  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152600  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152616  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 18, map_name: 
215751.152700  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: 
215751.152719  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 5, map_name: 
215751.153032  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: 
215751.153117  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 5, map_name: 
215751.153285  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: sockets
215751.153361  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: ipv4_count
215751.153433  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: ipv6_count
215751.153583  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: events
215751.153601  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: tcpconne.rodata
215751.153620  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 2, map_name: 
215751.153633  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 22, map_name: 
215751.153645  0      tcpconnect       1155212 1155212 0                security_bpf         map_cmd: 0, map_name: tcpconne.bss

This gives us events for all the ebpf syscall sub-commands (like PROG_LOAD, OBJ_PIN, ...) in a single tracee event, not only the ones touching the map_names... that could be useful for reuse, as Itay suggested. The downside is that we have to interpret the ebpf syscall sub-command type based on the map_cmd variable (with logged data).

For the security_bpf lsm hooks I would extract the cmd and attr raw values. In a future PR we can interpret these values to the corresponding command name and arguments.

I think the idea is to keep the ebpf core as generic as it can be, right ?

right

rafaeldtinoco commented 3 years ago

Before answering you, I did some more code reading, including the kernel's, and I would like to revisit the initial theory. I think it could have some inaccurate premises, please advise me if not.

As a security solution, Tracee should protect its own assets from being tampered with by another program. A program with enough privileges (e.g. CAP_SYS_ADMIN or CAP_BPF) can access and even alter any of Tracee's maps. Although this kind of attack assumes an attacker who already has dangerous privileges, we can probably mitigate it by using one of the following LSM hooks (possibly both): security_bpf and security_bpf_map

Thinking about the 'tampering' surface a little further:

1. Would a parallel userland task, creating a BPF MAP be problematic to tracee ?

Userland tasks can only manipulate ebpf maps through the bpf syscall, and that requires a file descriptor for an already created BPF map (created with BPF_MAP_CREATE). The eBPF bytecode relocations, to access BPF maps created by userland, only occur AFTER userland has created the maps, and userland uses the eBPF ELF metadata to know what maps to create before loading the bpf obj (or after ? not sure).

Considering the 'original design' in Interacting with Maps, only if the userland task is able to get the bpf maps file descriptors, created by tracee, either through a unix-domain socket message passing OR through the bpf filesystem (when the map is pinned), it would be able to change maps contents.

So, the answer for this question seems to be NO.

Example: Probing the bpf syscall LSM hook (e.g. security_bpf_map) would just tell us who is creating a BPF map with the same name, which is not necessarily (or even close to) an attempt to harm the tracee application. Let's see:

The "security_bpf_map()" kprobe execution path:

would cover sub-commands:

Through BPF_MAP_CREATE we would only get "map_name" (from the initial bpf_attr union in the user headers). The other 2 sub-commands could be more 'dangerous' (I cover this in item (2)).

I'm not analysing here possible flaws in the logic of libbpf's (or the kernel's) eBPF bytecode load versus map creation, i.e. whether there is a small window of opportunity for a parallel task to create a map between the time tracee creates its maps and the time it loads its eBPF bytecode (which would then access maps created by the parallel task). It looks to me that the relocations will only be satisfied based on the file descriptor you get from the BPF_MAP_CREATE command, so it would be hard for another process to hijack the entire logic.

2. Would a parallel userland code be capable of getting tracee's maps file descriptors ?

Getting tracee's maps file descriptors would allow a parallel userland task to tamper with tracee's maps.

I don't see a way you can tell if someone changed our BPF maps by using 'map_names' (1). Our maps, once created, would only be addressable by userland (tracee) using tracee's file descriptors. Internally, in the kernel, maps are addressed by their kernel address pointers only.

Independently of the technique used to 'steal' our bpf maps fds (hijacking them between bpf load/bpf map create OR getting them from the bpf filesystem OR getting them through fd message passing OR ...), the idea is that we are able to say THIS APP has changed OUR BPF MAP for sure.

Question: Is there a way we can authenticate BPF MAPs contents and who authored them ?

Calls to:

would tell us if one is trying to tamper our maps by getting its file descriptors, for example. As long as we have attr.prog_id being used for our maps (and we probably have) we could hook those calls and check if one is trying to get our fds.

There are NO other kernel functions calling bpf_map_get_fd_by_id(), which means that the only hook we could have to investigate calls to it would be through 'security_bpf()' one.

Unfortunately this is only 'one possible tampering technique' and we would not be having an event saying 'COMPROMISED', we would have something like 'ATTEMPT MADE'.

There are other bpf commands to be hooked also: the ones that actually CHANGE/CONSULT the maps:

But then we would have to authenticate the change with something not accessible by other userland tasks. The idea is that, even if someone has got hold of our BPF maps' file descriptors, we can say for sure that a given entry wasn't added by tracee (the attacker would have to tamper with the file descriptor AND a 'magic key' or something like it).

THIS ITEM (2) IS THE CLOSEST TO ISSUE DESCRIPTION AFAICT

3. Would a kernel eBPF running program (any) be capable of accessing our BPF MAPS and changing values ?

I don't think so, I would have to research. Nevertheless, the approach of identifying this, if such, would be entirely different as no syscall would be needed.

Summary

How would you like me to proceed for this ? Hook into BPF_MAP_GET_FD_BY_ID and check if one is trying to get our MAPs file descriptors ? Or would you like me to have a "generic PR for bpf syscall sub-commands" even if not really addressing the issue you're describing here ?

Looking forward to discussing next steps...

yanivagman commented 3 years ago

Thanks @rafaeldtinoco for this thorough analysis!

  1. Would a parallel userland task, creating a BPF MAP be problematic to tracee ?

I don't think this will be a problem. As you described, a different task with different maps will have its own fds, and I don't expect a problem from this side.

  2. Would a parallel userland code be capable of getting tracee's maps file descriptors ?

This is the issue we are trying to solve in this PR

  3. Would a kernel eBPF running program (any) be capable of accessing our BPF MAPS and changing values ?

An interesting question that we might want to research for in the future.

So let's focus on point number 2. Like you suggested, changing a map's content requires a handle to it, so we should probably concentrate our efforts on catching an attempt to get an fd of one of our maps. At this point, tracee still doesn't pin any map, although it might do so in the future. So for now, we can probably concentrate on BPF_MAP_GET_FD_BY_ID.

In general, I think that it will be good to have both security_bpf and security_bpf_map events added to tracee. These can be added as raw events, with no detection made. The problem with adding a "raw" security_bpf event will be about how to pass the attr argument in a generic way, as it is a union. We can start by only submitting the event to userspace if the cmd is relevant for our usecase. In the future, we can expand it to support other cmds as well.
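To make the union problem concrete, here is a minimal Python sketch of how the same raw attr bytes would have to be decoded differently depending on cmd. The field layouts are my reading of `union bpf_attr` in recent kernels' include/uapi/linux/bpf.h (seven `__u32` fields followed by `map_name[16]` for BPF_MAP_CREATE; a leading `__u32` id for the `*_GET_FD_BY_ID` commands) and should be treated as assumptions; `decode_attr` is a hypothetical helper, not Tracee code.

```python
import struct

BPF_MAP_CREATE = 0
BPF_MAP_GET_FD_BY_ID = 14
BPF_OBJ_NAME_LEN = 16  # includes the trailing NUL


def decode_attr(cmd: int, raw: bytes) -> dict:
    """Hypothetical decoder: pick the bpf_attr layout that matches cmd."""
    if cmd == BPF_MAP_CREATE:
        # Assumed layout: map_type, key_size, value_size, max_entries,
        # map_flags, inner_map_fd, numa_node (7 x u32), then map_name[16].
        fields = struct.unpack_from("=7I", raw, 0)
        name = raw[28:28 + BPF_OBJ_NAME_LEN].split(b"\0", 1)[0]
        return {
            "map_type": fields[0],
            "key_size": fields[1],
            "value_size": fields[2],
            "max_entries": fields[3],
            "map_name": name.decode("ascii", "replace"),
        }
    if cmd == BPF_MAP_GET_FD_BY_ID:
        # Assumed layout: map_id, next_id, open_flags (3 x u32).
        map_id, _next_id, open_flags = struct.unpack_from("=3I", raw, 0)
        return {"map_id": map_id, "open_flags": open_flags}
    # Other sub-commands are not interpreted in this sketch.
    return {}


# Synthetic attr buffers, built only to exercise the decoder.
create_attr = struct.pack("=7I16s", 1, 16, 4, 10240, 0, 0, 0, b"comm_filter")
get_fd_attr = struct.pack("=3I", 2313, 0, 0)
print(decode_attr(BPF_MAP_CREATE, create_attr))
print(decode_attr(BPF_MAP_GET_FD_BY_ID, get_fd_attr))
```

Submitting only the commands relevant to the use case keeps the number of layouts that have to be understood small, which is the point of starting with a filtered event.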

I also want to refer you to a blog which initiated my thoughts about this issue: https://www.crowdstrike.com/blog/analyzing-the-security-of-ebpf-maps/ Maybe it will also help you to focus on the problem we are trying to solve here

rafaeldtinoco commented 3 years ago

Couldn't play as hard as I wanted yet (libvirt backports and tests all week =() but.. this last commit adds 2 kprobes, like we discussed, and only sends arguments if it makes sense:

$ sudo ./dist/tracee-ebpf --debug --trace event=security_bpf --trace comm=tcpconnect
TIME(s)        UID    COMM             PID     TID     RET              EVENT                ARGS
388355.758941  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 5
388355.801769  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 18
388355.802040  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 18
388355.802187  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 18
388355.802301  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 18
388355.802403  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 18
388355.802597  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0
388355.802703  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 5
388355.803094  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0
388355.803229  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 5
388355.803513  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0, map_name: sockets
388355.803658  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0, map_name: ipv4_count
388355.803809  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0, map_name: ipv6_count
388355.804091  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0, map_name: events
388355.804175  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0, map_name: tcpconne.rodata
388355.804247  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 2
388355.804305  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 22
388355.804382  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 0, map_name: tcpconne.bss
388355.804447  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 2
388355.807429  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 5
388355.807754  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 5
388355.808100  0      tcpconnect       3300043 3300043 0                security_bpf         map_cmd: 5

and

$ sudo ./dist/tracee-ebpf --debug --trace event=security_bpf_map --trace comm=tcpconnect
TIME(s)        UID    COMM             PID     TID     RET              EVENT                ARGS
388362.521749  0      tcpconnect       3302486 3302486 0                security_bpf_map     
388362.522838  0      tcpconnect       3302486 3302486 0                security_bpf_map     
388362.523494  0      tcpconnect       3302486 3302486 0                security_bpf_map     map_name: sockets
388362.523663  0      tcpconnect       3302486 3302486 0                security_bpf_map     map_name: ipv4_count
388362.523890  0      tcpconnect       3302486 3302486 0                security_bpf_map     map_name: ipv6_count
388362.524543  0      tcpconnect       3302486 3302486 0                security_bpf_map     map_name: events
388362.524729  0      tcpconnect       3302486 3302486 0                security_bpf_map     map_name: tcpconne.rodata
388362.525066  0      tcpconnect       3302486 3302486 0                security_bpf_map     map_name: tcpconne.bss
388362.530258  0      tcpconnect       3302486 3302486 0                security_bpf_map 

Since security_bpf_map is also called during bpf(BPF_MAP_CREATE), we can see both behave similarly.

Over the weekend I'll dig into the prog context id and check the BPF_MAP_GET_FD_BY_ID execution path:

to do the detection for get_fd_by_id() like we discussed. Best!

-rafaeldtinoco

rafaeldtinoco commented 3 years ago

In order to have a good test case for the feature, I have played a bit with BPF programs and maps pinning and there is some cool stuff to report at:

https://github.com/rafaeldtinoco/portablebpf/blob/hijack/hijack.c

This tree has 2 binaries:

Here is the output of the mine binary running (based on a libbpf skeleton):

$ sudo ./mine -v
Foreground mode...<Ctrl-C> or or SIG_TERM to end it.
...
libbpf: CO-RE relocating [0] struct task_struct: found target candidate [116] struct task_struct in [vmlinux]
libbpf: prog 'ksys_sync': relo #0: matching candidate #0 [116] struct task_struct.loginuid.val (0:124:0 @ offset 2912)
libbpf: prog 'ksys_sync': relo #0: patched insn #29 (ALU/ALU64) imm 2824 -> 2912
libbpf: prog 'ksys_sync': relo #1: kind <byte_off> (0), spec is [12] struct task_struct.comm (0:90 @ offset 2640)
libbpf: prog 'ksys_sync': relo #1: matching candidate #0 [116] struct task_struct.comm (0:103 @ offset 2712)
libbpf: prog 'ksys_sync': relo #1: patched insn #36 (ALU/ALU64) imm 2640 -> 2712
libbpf: pinned map '/sys/fs/bpf//events'
libbpf: pinned program '/sys/fs/bpf//kprobe_ksys_sync'
Tracing... Hit Ctrl-C to end.

And the pinned prog and map:

$ sudo ls /sys/fs/bpf/
events  kprobe_ksys_sync

and the gdb output showing access to the ebpf program info structure and the ebpf map info structure:

$ sudo gdb hijack
GNU gdb (Ubuntu 9.2-0ubuntu2) 9.2
...

(gdb) break hijack.c:84
Breakpoint 1 at 0x401373: file hijack.c, line 84.

(gdb) run
Starting program: /home/rafaeldtinoco/devel/portablebpf/hijack
Breakpoint 1, main (argc=<optimized out>, argv=<optimized out>) at hijack.c:84
84      if (fd < 0)

(gdb) p prog_info
$1 = {type = 2, id = 3244, tag = "\376\351\023\337'\263\225\273", jited_prog_len = 249, xlated_prog_len = 424, jited_prog_insns = 0, xlated_prog_insns = 0,
  load_time = 564582700970836, created_by_uid = 0, nr_map_ids = 1, map_ids = 0, name = "ksys_sync\000\000\000\000\000\000", ifindex = 0, gpl_compatible = 1,
  netns_dev = 0, netns_ino = 0, nr_jited_ksyms = 1, nr_jited_func_lens = 1, jited_ksyms = 0, jited_func_lens = 0, btf_id = 621, func_info_rec_size = 8,
  func_info = 0, nr_func_info = 1, nr_line_info = 17, line_info = 0, jited_line_info = 0, nr_jited_line_info = 17, line_info_rec_size = 16,
  jited_line_info_rec_size = 8, nr_prog_tags = 1, prog_tags = 0, run_time_ns = 0, run_cnt = 0}

name = ksys_sync as you can see.

(gdb) p map_info
$2 = {type = 4, id = 2307, key_size = 4, value_size = 4, max_entries = 24, map_flags = 0, name = "events\000\000\000\000\000\000\000\000\000", ifindex = 0,
  btf_vmlinux_value_type_id = 0, netns_dev = 0, netns_ino = 0, btf_id = 0, btf_key_type_id = 0, btf_value_type_id = 0}

name = events, as you can see.

Through the file descriptors originated from the pinned prog and map, I was able to query kernel internal structure through the bpf() syscall, as expected.


THENNNNNNNN, I discovered bpftool sub-commands - which, I confess, I did not know about, as I only used the gen skel command so far. So, instead of doing all tests in C, I can simply use bpftool to query/modify/add/remove key/values in all existing maps, from all existing running eBPF progs...

bpftool [prog|map] [sub-command]

With tracee-ebpf running:

$ sudo ./dist/tracee-ebpf --debug --trace event=security_bpf_map --trace comm=bpftool
$ sudo bpftool map list | grep hash
2308: hash  name args_map  flags 0x0
2309: hash  name bin_args_map  flags 0x0
2312: hash  name chosen_events_m  flags 0x0
2313: hash  name comm_filter  flags 0x0
2314: hash  name config_map  flags 0x0
2318: hash  name inequality_filt  flags 0x0
2319: hash  name mnt_ns_filter  flags 0x0
2320: hash  name new_pidns_map  flags 0x0
2321: hash  name new_pids_map  flags 0x0
2322: hash  name params_names_ma  flags 0x0
2323: hash  name params_types_ma  flags 0x0
2324: hash  name pid_filter  flags 0x0
2325: hash  name pid_ns_filter  flags 0x0
2327: hash  name ret_map  flags 0x0
2328: hash  name sockfd_map  flags 0x0
2331: hash  name sys_32_to_64_ma  flags 0x0
2334: hash  name traced_pids_map  flags 0x0
2335: hash  name uid_filter  flags 0x0
2336: hash  name uts_ns_filter  flags 0x0

I can get the comm_filter through its bpf map:

$ sudo bpftool map dump id 2313
key: 74 63 70 63 6f 6e 6e 65  63 74 00 00 00 00 00 00  value: 01 00 00 00
Found 1 element
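For readers who do not read hex fluently, here is a small Python sketch that decodes the dump line above. It assumes the key is a NUL-padded comm string and the value is a little-endian 32-bit integer (the host in these traces is x86-64); `parse_dump_line` is a helper written for this example.

```python
def parse_dump_line(line: str):
    """Split a 'key: .. value: ..' bpftool dump line into raw bytes."""
    key_part, value_part = line.split("value:")
    key_hex = key_part.replace("key:", "").split()
    value_hex = value_part.split()
    key = bytes(int(b, 16) for b in key_hex)
    value = bytes(int(b, 16) for b in value_hex)
    return key, value


line = ("key: 74 63 70 63 6f 6e 6e 65  63 74 00 00 00 00 00 00"
        "  value: 01 00 00 00")
key, value = parse_dump_line(line)

# The key is a fixed 16-byte buffer; the string ends at the first NUL.
comm = key.split(b"\0", 1)[0].decode("ascii")
# Assumption: little-endian u32 value.
enabled = int.from_bytes(value, "little")
print(comm, enabled)  # prints: tcpconnect 1
```

So the single entry is the `comm=tcpconnect` filter given on the tracee-ebpf command line earlier in the thread, readable by any sufficiently privileged process.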

And I can pin whatever I want:

$ sudo bpftool map list | grep comm
2406: hash  name comm_filter  flags 0x0

$ sudo bpftool map pin id 2406 /sys/fs/bpf/comm_filter

$ sudo ls /sys/fs/bpf
comm_filter

After pinning a map to bpf fs it is even easier to play with the eBPF maps and progs by using libbpf (or directly).

The good news is that security_bpf_map event is enough to get map_names and events whenever BPF_OBJ_GET_INFO_BY_FD is given to bpf syscall:

$ sudo ./dist/tracee-ebpf --debug --trace event=security_bpf_map --trace comm=bpftool
found bpf object file at: /tmp/tracee/tracee.bpf.5_8_0-43-generic.v0_5_1-23-gf9576cb.o
TIME(s)        UID    COMM             PID     TID     RET              EVENT                ARGS
354.745576     0      bpftool          58245   58245   0                security_bpf_map     map_name: args_map
354.745899     0      bpftool          58245   58245   0                security_bpf_map     map_name: bin_args_map
354.746079     0      bpftool          58245   58245   0                security_bpf_map     map_name: bufs
354.746254     0      bpftool          58245   58245   0                security_bpf_map     map_name: bufs_off
354.746430     0      bpftool          58245   58245   0                security_bpf_map     map_name: chosen_events_m
354.746612     0      bpftool          58245   58245   0                security_bpf_map     map_name: comm_filter
354.746787     0      bpftool          58245   58245   0                security_bpf_map     map_name: config_map
354.746969     0      bpftool          58245   58245   0                security_bpf_map     map_name: events
354.747150     0      bpftool          58245   58245   0                security_bpf_map     map_name: file_filter
354.747325     0      bpftool          58245   58245   0                security_bpf_map     map_name: file_writes
354.747504     0      bpftool          58245   58245   0                security_bpf_map     map_name: inequality_filt
354.747679     0      bpftool          58245   58245   0                security_bpf_map     map_name: mnt_ns_filter
354.747855     0      bpftool          58245   58245   0                security_bpf_map     map_name: new_pidns_map
354.748030     0      bpftool          58245   58245   0                security_bpf_map     map_name: new_pids_map
354.748205     0      bpftool          58245   58245   0                security_bpf_map     map_name: params_names_ma
354.748379     0      bpftool          58245   58245   0                security_bpf_map     map_name: params_types_ma
354.748555     0      bpftool          58245   58245   0                security_bpf_map     map_name: pid_filter
354.748730     0      bpftool          58245   58245   0                security_bpf_map     map_name: pid_ns_filter
354.748927     0      bpftool          58245   58245   0                security_bpf_map     map_name: prog_array
354.749167     0      bpftool          58245   58245   0                security_bpf_map     map_name: ret_map
354.749342     0      bpftool          58245   58245   0                security_bpf_map     map_name: sockfd_map
354.749518     0      bpftool          58245   58245   0                security_bpf_map     map_name: stack_addresses
354.749695     0      bpftool          58245   58245   0                security_bpf_map     map_name: string_store
354.749870     0      bpftool          58245   58245   0                security_bpf_map     map_name: sys_32_to_64_ma
354.750045     0      bpftool          58245   58245   0                security_bpf_map     map_name: sys_enter_tails
354.750281     0      bpftool          58245   58245   0                security_bpf_map     map_name: sys_exit_tails
354.750518     0      bpftool          58245   58245   0                security_bpf_map     map_name: traced_pids_map
354.750693     0      bpftool          58245   58245   0                security_bpf_map     map_name: uid_filter
354.750868     0      bpftool          58245   58245   0                security_bpf_map     map_name: uts_ns_filter

Lots of events, as we walked through all the existing progs and their eBPF maps. If I had "guessed" the MAP ID of the tracee-ebpf task I would have generated a single event:

$ sudo bpftool map pin id 93 /sys/fs/bpf/temp

1456.009142    0      bpftool          212504  212504  0                security_bpf_map     map_name: comm_filter

I have not yet read the bpftool source code to check how it gets all running progs and maps and is able to pin/unpin progs/maps to the bpf fs.
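As a rough model of that enumeration, my understanding is that listing maps boils down to a loop over three bpf() sub-commands: BPF_MAP_GET_NEXT_ID to walk the IDs, BPF_MAP_GET_FD_BY_ID to open each one, and BPF_OBJ_GET_INFO_BY_FD to read its info. The Python sketch below simulates that loop against an in-memory stand-in; `FakeKernel` and `list_maps` are invented for illustration and make no real syscalls.

```python
BPF_MAP_GET_NEXT_ID = 12
BPF_MAP_GET_FD_BY_ID = 14
BPF_OBJ_GET_INFO_BY_FD = 15


class FakeKernel:
    """In-memory stand-in for bpf(); records every sub-command it sees."""

    def __init__(self, maps):
        self.maps = dict(maps)  # map id -> map name
        self.fds = {}           # fake fd -> map id
        self.cmds = []          # sub-commands in call order

    def map_get_next_id(self, start_id):
        self.cmds.append(BPF_MAP_GET_NEXT_ID)
        later = sorted(i for i in self.maps if i > start_id)
        return later[0] if later else None  # None models ENOENT

    def map_get_fd_by_id(self, map_id):
        self.cmds.append(BPF_MAP_GET_FD_BY_ID)
        fd = 3 + len(self.fds)
        self.fds[fd] = map_id
        return fd

    def obj_get_info_by_fd(self, fd):
        self.cmds.append(BPF_OBJ_GET_INFO_BY_FD)
        map_id = self.fds[fd]
        return {"id": map_id, "name": self.maps[map_id]}


def list_maps(kernel):
    """Walk all map IDs, open each one and collect its info."""
    infos = []
    map_id = 0
    while True:
        map_id = kernel.map_get_next_id(map_id)
        if map_id is None:
            break
        fd = kernel.map_get_fd_by_id(map_id)
        infos.append(kernel.obj_get_info_by_fd(fd))
    return infos


kernel = FakeKernel({2313: "comm_filter", 2314: "config_map"})
print(list_maps(kernel))
print(kernel.cmds)
```

The takeaway for detection is that nobody has to guess a map ID: any task allowed to call these sub-commands can discover every map on the system and get an fd for it.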

Now, thinking out loud:

Something like:

So, basically tracee would have to save its PID and MAPS names, like initially planned, and tracee-rules signature will do the rest.
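A minimal Python sketch of that signature logic, under the assumption that Tracee's PID and map names are made available somehow (here they are simply passed in). The `Event` type and `find_foreign_map_access` are hypothetical and do not reflect the tracee-rules API; map names are cut to 15 characters because the kernel stores them in a 16-byte buffer, which matches truncated names such as `chosen_events_m` in the listings above.

```python
from dataclasses import dataclass, field

BPF_OBJ_NAME_LEN = 16  # kernel buffer size, including the trailing NUL


@dataclass
class Event:
    """Hypothetical, simplified view of one tracee-ebpf event."""
    pid: int
    comm: str
    name: str
    args: dict = field(default_factory=dict)


def kernel_name(map_name: str) -> str:
    """Return the name as the kernel stores it (at most 15 characters)."""
    return map_name[:BPF_OBJ_NAME_LEN - 1]


def find_foreign_map_access(events, tracee_pid, tracee_maps):
    """Return security_bpf_map events on Tracee's maps from other PIDs."""
    watched = {kernel_name(m) for m in tracee_maps}
    return [
        ev for ev in events
        if ev.name == "security_bpf_map"
        and ev.pid != tracee_pid
        and ev.args.get("map_name") in watched
    ]


# Illustrative input: the Tracee PID (4242) and full map names are made up.
events = [
    Event(4242, "tracee-ebpf", "security_bpf_map", {"map_name": "comm_filter"}),
    Event(212504, "bpftool", "security_bpf_map", {"map_name": "comm_filter"}),
    Event(212504, "bpftool", "security_bpf_map", {"map_name": "chosen_events_m"}),
    Event(3302486, "tcpconnect", "security_bpf_map", {"map_name": "sockets"}),
]
alerts = find_foreign_map_access(events, 4242, ["comm_filter", "chosen_events_map"])
for ev in alerts:
    print(f"ALERT: {ev.comm} ({ev.pid}) touched {ev.args['map_name']}")
```

Matching on names alone is not unique: the traces in this thread show a map called `events` in tcpconnect, in the mine binary and in Tracee itself, so a real signature would probably need map IDs or fds as well.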

yanivagman commented 3 years ago

Indeed, bpftool is very powerful. I'm using it to find "map leaks" when I'm adding new maps to tracee, and it works great.

The good news is that security_bpf_map event is enough to get map_names and events whenever BPF_OBJ_GET_INFO_BY_FD is given to bpf syscall

So, if I understand correctly, security_bpf_map should be enough for us to find map tampering? Because from your previous note I understood the opposite:

There are NO other kernel functions calling bpf_map_get_fd_by_id(), which means that the only hook we could have to investigate calls to it would be through 'security_bpf()' one.

And yes, you are right here:

tracee-rules needs something to distinguish between ITSELF and OTHER tasks manipulating MAPS through "BPF_OBJ_GET_INFO_BY_FD" ebpf syscall sub-command (or anything in the "security_bpf_map()" execution path).

As described above by @itaysk:

TBD: how tracee-rules knows the names/fds of the maps that Tracee-eBPF created? the PID is written to a file in the output directory. the names of the maps could be written to another file as well.

Actually, it's not so trivial how tracee-rules can load this information and pass it to a signature cleanly. Probably a better way is for the signature to read the file for itself. For this it would have to be written in golang.

This will require some work, and can be done in a future PR. The current idea that I have in mind is to iterate over tracee's maps in the initBPF function right after t.bpfModule.BPFLoadObject() is called (this can be done by using libbpf's bpf_map__next()), and for each map, save its fd. Then we need to think of a way to pass it to tracee-rules, which can be done by using a file, or maybe we should even create a special event for this. But again, we can defer this to a future PR.
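
As a rough illustration of the file-based handoff, here is a sketch in Python. The JSON layout, the file name `tracee_maps.json`, and both helper names are assumptions made for this example and do not exist in Tracee. It stores map names rather than fds, because a file descriptor number is only meaningful inside the process that owns it.

```python
import json
import os
import tempfile


def save_tracee_state(path, pid, map_names):
    """Writer side (what Tracee-eBPF could do after loading its BPF object)."""
    with open(path, "w") as f:
        json.dump({"pid": pid, "maps": sorted(map_names)}, f)


def load_tracee_state(path):
    """Reader side (what a signature could do before evaluating events)."""
    with open(path) as f:
        state = json.load(f)
    return state["pid"], set(state["maps"])


# A temporary directory stands in for Tracee's output directory.
with tempfile.TemporaryDirectory() as out_dir:
    state_file = os.path.join(out_dir, "tracee_maps.json")  # hypothetical file name
    save_tracee_state(state_file, os.getpid(), ["comm_filter", "uts_ns_filter"])
    loaded_pid, loaded_maps = load_tracee_state(state_file)

print(loaded_pid, sorted(loaded_maps))
```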

rafaeldtinoco commented 3 years ago

I have used a systemtap script to play with bpftool and MAP reading, to check what you asked. When executing $ sudo bpftool map list while no maps exist, we get:

---- SECURITY_BPF ----
EBPF SUB-CMD: 12 (BPF_MAP_GET_NEXT_ID)
--
 0xffffffff8c4ad670 : security_bpf+0x0/0x50 [kernel]
 0xffffffff8c201606 : __do_sys_bpf+0xb6/0x710 [kernel]
 0xffffffff8c201c7a : __x64_sys_bpf+0x1a/0x20 [kernel]
 0xffffffff8cbbd6f9 : do_syscall_64+0x49/0xc0 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
---- END -------------

A single security_bpf() call, from the BPF_MAP_GET_NEXT_ID sub-cmd. Then I start the 'mine' binary, without tracing the bpf() syscalls it makes to load programs and such. After it is initialized and has its MAP set up in the kernel, I call:

$ sudo bpftool map list
233: perf_event_array  name events  flags 0x0
    key 4B  value 4B  max_entries 24  memlock 4096B

And we were able to read the MAP name (at the very least). The tracing for that call is:

---- SECURITY_BPF ----
EBPF SUB-CMD: 12 (BPF_MAP_GET_NEXT_ID)
--
 0xffffffff8c4ad670 : security_bpf+0x0/0x50 [kernel]
 0xffffffff8c201606 : __do_sys_bpf+0xb6/0x710 [kernel]
 0xffffffff8c201c7a : __x64_sys_bpf+0x1a/0x20 [kernel]
 0xffffffff8cbbd6f9 : do_syscall_64+0x49/0xc0 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
---- END -------------

---- SECURITY_BPF ----
EBPF SUB-CMD: 14 (BPF_MAP_GET_FD_BY_ID)
--
 0xffffffff8c4ad670 : security_bpf+0x0/0x50 [kernel]
 0xffffffff8c201606 : __do_sys_bpf+0xb6/0x710 [kernel]
 0xffffffff8c201c7a : __x64_sys_bpf+0x1a/0x20 [kernel]
 0xffffffff8cbbd6f9 : do_syscall_64+0x49/0xc0 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
---- END -------------

---- SECURITY_BPF_MAP
 0xffffffff8c4ad6c0 : security_bpf_map+0x0/0x50 [kernel]
 0xffffffff8c1feae0 : bpf_map_get_fd_by_id+0xc0/0x130 [kernel]
 0xffffffff8c2018e3 : __do_sys_bpf+0x393/0x710 [kernel]
 0xffffffff8c201c7a : __x64_sys_bpf+0x1a/0x20 [kernel]
 0xffffffff8cbbd6f9 : do_syscall_64+0x49/0xc0 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
---- END --------------

---- SECURITY_BPF ----
EBPF SUB-CMD: 15 (BPF_OBJ_GET_INFO_BY_FD)
--
 0xffffffff8c4ad670 : security_bpf+0x0/0x50 [kernel]
 0xffffffff8c201606 : __do_sys_bpf+0xb6/0x710 [kernel]
 0xffffffff8c201c7a : __x64_sys_bpf+0x1a/0x20 [kernel]
 0xffffffff8cbbd6f9 : do_syscall_64+0x49/0xc0 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
---- END -------------

---- SECURITY_BPF ----
EBPF SUB-CMD: 12 (BPF_MAP_GET_NEXT_ID)
--
 0xffffffff8c4ad670 : security_bpf+0x0/0x50 [kernel]
 0xffffffff8c201606 : __do_sys_bpf+0xb6/0x710 [kernel]
 0xffffffff8c201c7a : __x64_sys_bpf+0x1a/0x20 [kernel]
 0xffffffff8cbbd6f9 : do_syscall_64+0x49/0xc0 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
 0xffffffff8cc0008c : entry_SYSCALL_64_after_hwframe+0x44/0xa9 [kernel]
---- END -------------

The only security_bpf_map() comes from bpf_map_get_fd_by_id(). So I was able to discover the map by doing:

So the call to bpftool generated events in the following order:

So my initial statement is more correct in the sense that we can do pretty much everything by probing security_bpf(). Also, by probing security_bpf_map() we would be able to catch the call to bpf_map_get_fd_by_id(), the call that gets the file descriptor for an existing MAP, but we would miss other important calls needed to traverse the maps, like BPF_MAP_GET_NEXT_ID (as shown in the tracing).
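
The ordering visible in the traces above can be summarized with a small model, written in Python. This mirrors only what the systemtap output shows; it is not derived from the bpftool source, and generalizing from the one-map case to N maps is an assumption. The sub-command numbers are the ones printed in the traces.

```python
# bpf() sub-command numbers as printed in the traces above.
BPF_MAP_GET_NEXT_ID = 12
BPF_MAP_GET_FD_BY_ID = 14
BPF_OBJ_GET_INFO_BY_FD = 15


def hooks_for_map_list(map_ids):
    """Return the LSM hook calls, in order, for a `bpftool map list` style walk."""
    hooks = []
    for _ in map_ids:
        hooks.append(("security_bpf", BPF_MAP_GET_NEXT_ID))     # find the next map id
        hooks.append(("security_bpf", BPF_MAP_GET_FD_BY_ID))    # ask for an fd to it
        hooks.append(("security_bpf_map", None))                # fired from bpf_map_get_fd_by_id()
        hooks.append(("security_bpf", BPF_OBJ_GET_INFO_BY_FD))  # read the name, type, sizes
    hooks.append(("security_bpf", BPF_MAP_GET_NEXT_ID))         # last call finds no further id
    return hooks


for hook, cmd in hooks_for_map_list([233]):
    print(hook, cmd)
```

In this model security_bpf_map() fires exactly once per map that is opened, while the id enumeration itself is visible only through security_bpf().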

rafaeldtinoco commented 3 years ago

On your statement:

This will require some work, and can be done in a future PR. The current idea that I have in mind is to iterate over tracee's maps in the initBPF function right after t.bpfModule.BPFLoadObject() is called (this can be done by using libbpf's bpf_map__next()), and for each map, save its fd. Then we need to think of a way to pass it to tracee-rules, which can be done by using a file, or maybe we should even create a special event for this. But again, we can defer this to a future PR.

Yep, I'll leave that part to another PR as you suggested. I think the investigation here shows what needs to be done from now on. Should I create a PR for the merge of the 2 events then?

yanivagman commented 3 years ago

So my initial statement is more correct in the sense that we can do pretty much everything by probing security_bpf(). Also, by probing security_bpf_map() we would be able to catch the call to bpf_map_get_fd_by_id(), the call that gets the file descriptor for an existing MAP, but we would miss other important calls needed to traverse the maps, like BPF_MAP_GET_NEXT_ID (as shown in the tracing).

Interesting. The question is: to get the map's fd, we have to go through bpf_map_get_fd_by_id(), don't we? And if that is the case, isn't it enough to have security_bpf_map()?

Yep, I'll leave that part to another PR as you suggested. I think the investigation here shows what needs to be done from now on. Should I create a PR for the merge of the 2 events then?

That would be great, thanks!

rafaeldtinoco commented 3 years ago

So my initial statement is more correct in the sense that we can do pretty much everything by probing security_bpf(). Also, by probing security_bpf_map() we would be able to catch the call to bpf_map_get_fd_by_id(), the call that gets the file descriptor for an existing MAP, but we would miss other important calls needed to traverse the maps, like BPF_MAP_GET_NEXT_ID (as shown in the tracing).

Interesting. The question is: to get the map's fd, we have to go through bpf_map_get_fd_by_id(), don't we? And if that is the case, isn't it enough to have security_bpf_map()?

Sorry if I did not make myself clear before; I was trying to balance being sufficiently detailed against not saying enough. You are right: in order to identify some other task getting an FD to one of our maps, a kprobe on security_bpf_map() will be enough. I was also treating the prog/maps list walkthrough as something bad, hence the confusion.

Yep, I'll leave that part to another PR as you suggested. I think the investigation here shows what needs to be done from now on. Should I create a PR for the merge of the 2 events then?

That would be great, thanks!

Cool, will do it later today then.

yanivagman commented 3 years ago

Here is an idea for how we can know that a map belongs to tracee, and pass it to tracee-rules. After the bpf program is loaded, we can populate a new map (in populateBPFMaps()) that saves the map id for each one of tracee's maps (by iterating over the maps with bpf_map__next()). The bpf code can then access this map and check (in security_bpf_map()) if the map id equals one of tracee's map ids, and if it is, add an extra argument (can be boolean) that tells tracee-rules that this is a map of tracee that was accessed. WDYT?

rafaeldtinoco commented 3 years ago

I think it would work. I like it.

If one tries to tamper with this new bpf map, the FIRST call to bpf(BPF_MAP_GET_FD_BY_ID) needed for the tampering would be caught by security_bpf_map() and trigger what we need at least once. So we would always get the FIRST tamper attempt, and this would be enough I suppose.
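
The map-id idea can be modeled in a few lines of Python. This is a sketch of the logic only, not eBPF code: the real check would live in the probe attached to security_bpf_map(). The names `build_tracee_map_ids`, `annotate`, and `is_tracee_map`, as well as the id 120 for the new lookup map, are hypothetical; id 93 is the comm_filter map pinned earlier in the thread. Including the lookup map's own id in the set is an assumption that makes the first attempt to open that map flagged too.

```python
def build_tracee_map_ids(map_ids, lookup_map_id):
    """Set of ids to protect: Tracee's maps plus the lookup map itself (assumption)."""
    return frozenset(map_ids) | {lookup_map_id}


def annotate(event, tracee_map_ids):
    """Copy the event and add the boolean argument proposed above."""
    annotated = dict(event)
    annotated["is_tracee_map"] = event["map_id"] in tracee_map_ids
    return annotated


tracee_map_ids = build_tracee_map_ids([93], lookup_map_id=120)

first_tamper_attempt = annotate({"event": "security_bpf_map", "map_id": 120}, tracee_map_ids)
unrelated_access = annotate({"event": "security_bpf_map", "map_id": 233}, tracee_map_ids)

print(first_tamper_attempt["is_tracee_map"], unrelated_access["is_tracee_map"])
```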