google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0

Configurable probes #4805

Closed ianlewis closed 1 year ago

ianlewis commented 3 years ago

This is a top level issue for "configurable probes" that can be used for observability and security tracking/reporting.

tanjianfeng commented 3 years ago

This is interesting. We are always trying to figure out a way to do dynamic instrumentation for tracing, etc. Any rough ideas?

cc @zhuangel @btw616

avagin commented 3 years ago

We have a few requests from internal and external users. There are two main types of use-cases. The first one is threat detection tools that need to trace events and set up custom security policies. The second one is networking that includes packet filtering and traffic control.

I think the main question here is whether we want to be able to handle events in the Sentry address space or not. I think we need to have this option. For example, if we want to filter/process network packets, we will want to do this with minimal latency. In that case, we need to implement a DSL and a sandbox to run DSL programs. And since we want to be compatible with Linux, we are considering implementing eBPF in gVisor.

The difference with Linux is that gVisor will:

On top of BPF, we will be able to implement a simpler framework for users who don't need the full power of BPF.
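
To make the "simpler framework" idea a bit more concrete, here is a minimal sketch of a hook registry with Go callbacks at predefined Sentry points. All names (`probes`, `HookPoint`, `Register`, `Fire`) are hypothetical and not actual gVisor APIs; a BPF backend could sit behind the same `Handler` type.

```go
// Hypothetical sketch only: a hook registry that a simpler framework on
// top of BPF could expose inside the Sentry. None of these names are
// actual gVisor APIs.
package probes

import "sync"

// HookPoint identifies a predefined place in the Sentry,
// e.g. "syscall/execve/enter".
type HookPoint string

// Event carries the data captured at a hook point.
type Event struct {
	Point HookPoint
	PID   int32
	Args  []uint64
}

// Handler processes an event; returning false means "deny the action".
type Handler func(ev Event) bool

var (
	mu       sync.RWMutex
	handlers = map[HookPoint][]Handler{}
)

// Register attaches a handler to a hook point.
func Register(p HookPoint, h Handler) {
	mu.Lock()
	defer mu.Unlock()
	handlers[p] = append(handlers[p], h)
}

// Fire is called by Sentry code at the hook point and reports whether
// every registered handler allowed the action.
func Fire(ev Event) bool {
	mu.RLock()
	defer mu.RUnlock()
	allowed := true
	for _, h := range handlers[ev.Point] {
		if !h(ev) {
			allowed = false
		}
	}
	return allowed
}
```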

@tanjianfeng, we are still in the design stage for this feature, so any feedback or ideas are welcome.

tanjianfeng commented 3 years ago

The first one is threat detection tools that need to trace events and set up custom security policies. The second one is networking that includes packet filtering and traffic control.

These two use cases are on our minds as well. Another use case: debugging latency issues (similar to https://golang.org/doc/diagnostics.html#tracing, but more universal).

Besides what @avagin mentioned above that we can do, where we can put those hooks is just as important. However, Go does not provide such a mechanism today:

Is there a way to automatically intercept each function call and create traces?

Go doesn’t provide a way to automatically intercept every function call and create trace spans. You need to manually instrument your code to create, end, and annotate spans.
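
For reference, manual instrumentation with the standard runtime/trace package looks roughly like the sketch below; the function and region names are made up for illustration.

```go
package main

import (
	"context"
	"os"
	"runtime/trace"
)

func handleRequest(ctx context.Context) {
	// A task groups related regions in the trace output.
	ctx, task := trace.NewTask(ctx, "handleRequest")
	defer task.End()

	// Regions mark the enter/exit of an interesting span of code.
	defer trace.StartRegion(ctx, "dispatch").End()
	trace.Log(ctx, "category", "example annotation")
	// ... actual work ...
}

func main() {
	f, err := os.Create("trace.out") // inspect later with: go tool trace trace.out
	if err != nil {
		panic(err)
	}
	defer f.Close()
	trace.Start(f)
	defer trace.Stop()

	handleRequest(context.Background())
}
```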

kevinGC commented 3 years ago

Supporting eBPF will be a major undertaking given the number of helper functions and degree to which they are tied to kernel-specific structures: https://man7.org/linux/man-pages/man7/bpf-helpers.7.html. @tanjianfeng I'm curious whether you're interested in eBPF specifically, or whether the "more simple framework" would suffice?

hbhasker commented 3 years ago

@tanjianfeng Btw, runsc already supports collecting traces using the Docker debug port, similar to how you can get CPU/heap profiles.

tanjianfeng commented 3 years ago

Supporting eBPF will be a major undertaking given the number of helper functions and degree to which they are tied to kernel-specific structures: https://man7.org/linux/man-pages/man7/bpf-helpers.7.html. @tanjianfeng I'm curious whether you're interested in eBPF specifically, or whether the "more simple framework" would suffice?

Personally, I'm not interested in eBPF here.

eBPF is a way to write instrumentation code with the help of helper functions. Linux needs it to keep itself safe. Considering that gVisor is an application kernel, and on the assumption that this tool is for infrastructure providers rather than application users, I think we don't need such a strict requirement. We can just write the instrumentation code in Go and build it as a hotfix or a Go plugin. That can be much more flexible than eBPF.
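
A rough illustration of the Go-plugin idea, assuming a made-up `Probe` symbol and file layout; note that `-buildmode=plugin` is only supported on some platforms.

```go
// probe.go, built separately with: go build -buildmode=plugin -o probe.so probe.go
package main

import "fmt"

// Probe is the symbol the host looks up; its signature is only an example.
func Probe(event string, args ...uint64) {
	fmt.Printf("probe fired: %s %v\n", event, args)
}
```

```go
// Host side (e.g. inside the Sentry), sketch only.
package main

import (
	"fmt"
	"plugin"
)

func loadProbe(path string) (func(string, ...uint64), error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, err
	}
	sym, err := p.Lookup("Probe")
	if err != nil {
		return nil, err
	}
	fn, ok := sym.(func(string, ...uint64))
	if !ok {
		return nil, fmt.Errorf("Probe has unexpected type %T", sym)
	}
	return fn, nil
}
```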

tanjianfeng commented 3 years ago

@tanjianfeng Btw, runsc already supports collecting traces using the Docker debug port, similar to how you can get CPU/heap profiles.

I suppose you are referring to "runsc debug -trace". It gives information like sched/lock/..., but we may need hooks like:

avagin commented 3 years ago

eBPF is a way to write instrumentation code with the help of helper functions. Linux needs it to keep itself safe. Considering that gVisor is an application kernel, and on the assumption that this tool is for infrastructure providers rather than application users, I think we don't need such a strict requirement. We can just write the instrumentation code in Go and build it as a hotfix or a Go plugin. That can be much more flexible than eBPF.

Our use-cases are a bit different. We want to allow third-party apps to use probes. This means we need to be sure that they cannot corrupt anything in gVisor's memory, and also that probes can be built separately.

tanjianfeng commented 3 years ago

Our use-cases are a bit different. We want to allow third-party apps to use probes. This means we need to be sure that they cannot corrupt anything in gVisor's memory, and also that probes can be built separately.

Hmm, that's a stricter requirement: it needs users to understand gVisor.

fvoznika commented 3 years ago

I think there are a few different use-cases that could benefit from the same underlying infrastructure. They have different requirements and will likely have different APIs.

Auditing

These are hand-selected events that need to occur at the right place, e.g. after a process is created, but before it executes. They need to be stable and considered part of the API. Removals and changes are considered breaking changes. There needs to be a mechanism to collect extra data based on configuration, e.g. collecting the file path, PID, container ID, etc.

In general, the processing of the events is done asynchronously outside the sandbox. There needs to be a mechanism to connect an event stream with an outside endpoint (pipe, UDS), and lifecycle management may be needed (e.g. create/kill the monitoring process on sandbox creation/deletion).

There are also some interesting questions about whether auditing events should allow actions to be blocked, say, blocking anyone who tries to execute a shell. Or whether other, more interesting actions should be allowed, like taking a snapshot of the sandbox for forensic analysis when a certain event occurs. This requires event processing to be synchronous and communication between the sandbox and the event consumer to be fast (it would likely use a form of RPC with flipcall and futex_swap).

All of the above could be achieved with static events, a good set of options to collect Sentry data, and a flexible configuration. But there is no need for a DSL.
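
As a hedged sketch of the shape this could take (all names below are hypothetical, not gVisor APIs): a fixed set of audit points, each publishing a structured event to a sink connected to an endpoint outside the sandbox.

```go
// Hypothetical sketch: static audit points publishing structured events to
// a sink outside the sandbox. None of these names are actual gVisor APIs.
package audit

import (
	"encoding/json"
	"net"
)

// ExecEvent is one example of a stable, documented audit event, emitted
// after a process is created but before it executes.
type ExecEvent struct {
	ContainerID string   `json:"container_id"`
	PID         int32    `json:"pid"`
	Path        string   `json:"path"`
	Argv        []string `json:"argv"`
}

// Sink streams events to a consumer outside the sandbox.
type Sink struct {
	enc *json.Encoder
}

// Dial connects the sink to a Unix domain socket owned by the monitoring
// process.
func Dial(socketPath string) (*Sink, error) {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return nil, err
	}
	return &Sink{enc: json.NewEncoder(conn)}, nil
}

// EmitExec is what an audit point in the Sentry would call; consumers
// process the stream asynchronously.
func (s *Sink) EmitExec(ev ExecEvent) error {
	return s.enc.Encode(ev)
}
```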

Debug tracing

In most cases, these are enter/exit events from functions. I would love to have a mechanism to instrument the Sentry without having to litter it with dozens of log.Debugf() statements by hand. It requires a way to inject event firing code dynamically into functions, akin to dtrace. It can capture all arguments, with special handling for known types, and can be configured to collect more data similar to Audit events, depending on the arguments (e.g. get PID from context.Context).

The ability to filter events can help reduce the overhead of tracing and help a lot with debugging. For example, it could log only events for a given process, or container, or only log errors, etc. Although not strictly necessary, debug tracing would benefit from having a DSL. The DSL here doesn't need to be as restrictive as eBPF though and should be easy to write (why not Go? :-) ).

The event stream is written to a file. No sync processing required.
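
To illustrate the enter/exit idea with filtering, here is a small sketch with made-up names (not a proposed API); the container ID in the example filter is invented.

```go
// Hypothetical sketch: dtrace-like enter/exit events with a filter
// predicate. None of these names are a proposed API.
package tracing

import (
	"log"
	"time"
)

// FuncEvent describes one function enter/exit pair.
type FuncEvent struct {
	Name        string
	ContainerID string
	Args        []interface{}
	Err         error
	Duration    time.Duration
}

// Filter decides whether an event gets logged, e.g. only one container,
// or only calls that returned an error.
type Filter func(FuncEvent) bool

// Enter is what injected code would call at function entry; the returned
// closure is deferred to fire the exit event.
func Enter(name, containerID string, filter Filter, args ...interface{}) func(err error) {
	start := time.Now()
	return func(err error) {
		ev := FuncEvent{
			Name:        name,
			ContainerID: containerID,
			Args:        args,
			Err:         err,
			Duration:    time.Since(start),
		}
		if filter != nil && !filter(ev) {
			return
		}
		log.Printf("trace: %s container=%s err=%v took=%s",
			ev.Name, ev.ContainerID, ev.Err, ev.Duration)
	}
}

// Example filter: only log failed calls from one container ("web-1" is a
// made-up ID).
var onlyErrors Filter = func(ev FuncEvent) bool {
	return ev.ContainerID == "web-1" && ev.Err != nil
}
```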

Network filtering

Network filtering has security implications. In the event that an attacker breaks into the Sentry, rules set in the Sentry can be bypassed. Therefore, for security purposes, we assume that the Sentry has been compromised and never trust it. This makes probes unsuitable for network filtering. Filtering needs to be done outside the sandbox, either in the host or using a network gofer to proxy network connections, similar to how fsgofer proxies filesystem access.

fvoznika commented 2 years ago

We have made some progress defining the infrastructure to support probes. The seccheck package has some basic infrastructure for adding points of interest to the Sentry and publishing them to interested parties. This design doc describes our current thinking on how to implement it for runsc. It's also important to have real use cases to ensure the implementation is sound. We chose Falco for the first implementation and proof of concept. The Falco team has been helping us define the interface and get the design right.

We welcome feedback from everyone directly to the doc or here.

fvoznika commented 1 year ago

This is now complete. Further improvements can be tracked in a separate issue.

github-actions[bot] commented 1 year ago

There are TODOs still referencing this issue:

  1. test/trace/trace_test.go:112: Add validation for these messages.

Search TODO

fvoznika commented 1 year ago

This is now complete. Further improvements can be tracked in separate issues.