cilium / tetragon

eBPF-based Security Observability and Runtime Enforcement
https://tetragon.io
Apache License 2.0

Applied rateLimit doesn't work well for file policies (e.g., security_file_permission) #1918

Closed: vladimirkus closed this issue 5 months ago

vladimirkus commented 10 months ago

What happened?

Applied rateLimit according to the documentation. Here is the whole tracing policy:

spec:
  kprobes:
  - args:
    - index: 0
      type: file
    - index: 1
      type: int
    call: security_file_permission
    return: true
    returnArg:
      index: 0
      maxData: false
      returnCopy: false
      type: int
    selectors:
    - matchActions:
      - action: Post
        rateLimit: 1m
      matchArgs:
      - index: 0
        operator: Prefix
        values:
        - /etc/passwd
      - index: 1
        operator: Equal
        values:
        - "4"
    syscall: false

As far as I understood from the documentation, this should significantly decrease the number of repetitive events; however, it didn't have any noticeable effect.

During the experiment I generated events with while true; do cat /etc/passwd > /dev/null; done.

Are there any constraints on using rate limiting, or have I misunderstood the concept?

Tetragon Version

1.0.1

Kernel Version

5.15

Kubernetes Version

1.24

Bugtool

No response

Relevant log output

No response

Anything else?

No response

kevsecurity commented 9 months ago

Hello

Rate limiting is per thread, and your test involves starting numerous 'cat' processes one after the other, each with a different thread ID. As such, no rate limiting occurs. The idea is to limit a single thread's repeated events, not repeated events generally.

I turned the rateLimit value in the policy down to 3s for testing, and used the following C program to generate test events:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
        int fd;
        int count = 1;
        char buf[65536];
        struct stat st;

        while (1) {
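                /* Every iteration reads /etc/passwd from this same thread,
                 * so successive events share a thread ID and the per-thread
                 * rate limit can apply to them. */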
                fd = open("/etc/passwd", O_RDONLY);
                if (fd < 0) {
                        printf("open failed: %s\n", strerror(errno));
                        exit(1);
                }
                if (fstat(fd, &st) < 0) {
                        printf("fstat failed: %s\n", strerror(errno));
                        exit(1);
                }
                if (read(fd, buf, st.st_size) <= 0) {
                        printf("read failed: %s\n", strerror(errno));
                        exit(1);
                }
                printf("Opened /etc/passwd: %7d\n", count);
                count++;
                close(fd);
                sleep(1);
        }
        return 0;
}

It wouldn't be difficult to add a switch to the rate limiting function to make it apply to all threads. Let me know if that would be helpful.

kevsecurity commented 9 months ago

I've updated the docs in this PR. https://github.com/cilium/tetragon/pull/1961

kevsecurity commented 9 months ago

Also, I made this PR that lets you change the scope of the rate limiting from thread to "process" or "global". LMK if it helps in your case. Thanks! https://github.com/cilium/tetragon/pull/1962
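
For illustration, with that change the matchActions section of the policy in this issue would look something like this (just a sketch; rateLimitScope is the new field from that PR, and it accepts "thread", "process", or "global"):

    selectors:
    - matchActions:
      - action: Post
        rateLimit: 1m
        rateLimitScope: "global"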

kevsecurity commented 9 months ago

These have been merged. They will be available in the next release. Otherwise, use a ci:latest image tag if you want to try them out.

vladimirkus commented 9 months ago

Hello and thank you, that should be very helpful in our case. I'm going to test it with the ci tag.

vladimirkus commented 9 months ago

I've tested it with the same configuration, just adding rateLimitScope: "global" and rateLimit: "10s". Rate limiting itself works as expected! Though, there are a few questions that I would like to ask.

kevsecurity commented 9 months ago

Thanks for testing. Rate limiting itself shouldn't be expensive, but maybe there were other changes between 1.0.0 and the ci:latest that have impacted RAM and CPU. I'm less concerned about the spikes unless they add up to a problem. Would you have a reproducer that we could try?

Regarding the number of bytes to use in the rate limiting, this is a balance between RAM/CPU and effectiveness. As the size increases, so does the size of the key that is used to store rate-limiting info (hence RAM), and as the key is hashed, this increases the work required to look up the info (hence CPU). In the cases where rate limiting isn't triggered by the info, this ends up as a cost with little benefit; of course, where rate limiting is employed it can save CPU on processing multiple identical events in user space (including the task switching from kernel to user space), hence the trade-off.

You are correct that only the first 40 bytes of paths will be compared, leading to the potential for lost info (for example, two long paths that differ only after the 40th character would share a single rate-limit entry). I would recommend not using rate limiting on these kinds of arguments/hooks. It was originally intended for networking operations, where 40 bytes is the size of an IPv6 tuple, but it can be used anywhere arguments are unique within the first 40 bytes. We could open an issue to make this configurable, and potentially to allow using the end of the argument rather than the beginning (but I think that just moves the problem). How much of a problem do you think it is?

Thanks

vladimirkus commented 9 months ago

The test that I did was the same: I ran while true; do cat /etc/passwd > /dev/null; done and while true; do cat /etc/passwd > /dev/null; sleep 1; done, with and without rateLimit, using the policy from the topic. I understand that this test doesn't reflect real workloads' behavior, and I would like to test it on a real cluster to be sure whether there is any issue.

Regarding the length limit, I don't really think this is a big issue, and I agree with you. While testing this I understood that this functionality is not the best fit for filesystem events, as it was initially developed for fixed-size arguments such as IP addresses, so the best fit is socket events.

In our environment, however, most of the event volume is generated by security_file_permission, and we intend to lower that volume, but I see that it's better to think about some other approach.

kkourt commented 8 months ago

@vladimirkus based on the discussion, I've modified the title and the labels of the issue to be an enhancement rather than a bug. We could also create a new summary if that's better.

kevsecurity commented 8 months ago

  • I've noticed a different CPU and RAM usage profile for the tetragon pod. With rate limiting enabled I noticed ~30% increase in RAM usage and spikes on the CPU, up to 60% of one core, which is very unusual. Is this expected?

@vladimirkus I finally found time to do some tests. On this test system (kernel 5.15, 8G RAM, x64 kernel VM), I found the memory usage was pretty consistent between the rateLimit and no-rateLimit scenarios (~14.5% for the first test for both, and ~1.3% for the second test for both); and I saw CPU spikes in both the rateLimit and non-rateLimit scenarios for the first test (24% CPU -> 66% for rateLimit and 40% -> 92% for non-rateLimit), but no CPU spikes for the second test.

I like that the rateLimit scenario had a lower overall CPU profile (as predicted) and assume that the CPU spikes are related to Tetragon threads waking up and handling events.

If you're happy, I think we could close this issue.

vladimirkus commented 5 months ago

@kevsecurity Hello and sorry for the long pause. I had a chance to test it more on our production clusters using v1.1.0, and I can confirm the memory and CPU usage are pretty consistent this time, probably with a small overhead. And the functionality works as expected. I agree that the issue should be closed, thank you for your help!