google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0

[Question] Monitoring system calls and hooks for dynamic rule execution #407

Open agamdua opened 5 years ago

agamdua commented 5 years ago

Today on the monthly call I asked a question on how to monitor system calls (in the context of anomaly detection for intrusion detection systems) in the gVisor sentry, since that is where all system calls are intercepted. I was asked to open an issue here for further discussion on the subject.

There are really two parts to this issue/discussion:

  1. I want to be able to monitor system calls in real-time to check for "intrusion detection" patterns that deviate from the norm.
  2. I want to be able to block system calls in "real-time" based on the result of a dynamic policy (perhaps leveraging anomaly detection as an example).

The eventual goal would be to enable hooks that can act on the system call data in real time. As I learned today on the call, there seem to be use cases for both asynchronous and synchronous blocking of system calls.
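To make the synchronous/asynchronous distinction concrete, here is a minimal sketch of what such hooks could look like. Everything in it (`SyscallEvent`, `SyncHook`, `Dispatcher`) is hypothetical and not part of gVisor; it only illustrates one hook that can veto a call inline and one observer that sees events off the hot path.

```go
package main

import (
	"fmt"
	"sync"
)

// SyscallEvent is a hypothetical record of one intercepted system call.
type SyscallEvent struct {
	PID  int
	Name string
	Args []uint64
}

// SyncHook runs inline with the syscall and may veto it (hypothetical).
type SyncHook func(ev SyscallEvent) (allow bool)

// Dispatcher fans each event out to synchronous hooks and to one
// asynchronous observer running in its own goroutine.
type Dispatcher struct {
	syncHooks []SyncHook
	async     chan SyscallEvent
	wg        sync.WaitGroup
}

// NewDispatcher starts the observer goroutine with a buffer of buf events.
func NewDispatcher(buf int, observer func(SyscallEvent), hooks ...SyncHook) *Dispatcher {
	d := &Dispatcher{syncHooks: hooks, async: make(chan SyscallEvent, buf)}
	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		for ev := range d.async {
			observer(ev)
		}
	}()
	return d
}

// Dispatch reports whether the call is allowed. The asynchronous copy is
// dropped when the buffer is full so monitoring never stalls the caller.
func (d *Dispatcher) Dispatch(ev SyscallEvent) bool {
	select {
	case d.async <- ev:
	default:
	}
	for _, h := range d.syncHooks {
		if !h(ev) {
			return false
		}
	}
	return true
}

// Close stops the observer after it has drained the buffered events.
func (d *Dispatcher) Close() {
	close(d.async)
	d.wg.Wait()
}

func main() {
	seen := 0
	denyPtrace := func(ev SyscallEvent) bool { return ev.Name != "ptrace" }
	d := NewDispatcher(16, func(SyscallEvent) { seen++ }, denyPtrace)

	fmt.Println("read allowed:", d.Dispatch(SyscallEvent{PID: 1, Name: "read"}))
	fmt.Println("ptrace allowed:", d.Dispatch(SyscallEvent{PID: 1, Name: "ptrace"}))
	d.Close()
	fmt.Println("observed events:", seen)
}
```

The trade-off in this sketch is that synchronous hooks add latency to every call, while the asynchronous path can lose events under load because it drops rather than blocks.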

A few options were floated:

What I am looking for: For now, pointing me in the direction of the right places in the codebase to look at from the above mentioned options would be great. Once I am able to look at this, I will be able to figure out what the current capabilities of the system are and where to take it.

Further notes: I do not currently have specific existing tools in mind for this monitoring. I am more interested in understanding what is here and leveraging that for some tooling I plan to roll on my own, but I am open to suggestions on tools that can integrate with gVisor already (or with a few tweaks). I am happy to make changes to the codebase as needed with the right guidance.

ianlewis commented 5 years ago

Hi,

I believe the mechanism that @nlacasse was referring to is in pkg/sentry/unimpl/events.go. Right now there are only unimplemented syscall events, and I think these are currently just sent to the user log.

The thing that was suggested in the meeting was that events for every syscall could be added to the helper functions that are used to create the Syscall objects in pkg/sentry/syscalls/syscalls.go. Basically, they would write an event to the events interface like ErrorWithEvent is doing currently. These functions are invoked in pkg/sentry/syscalls/linux/linux64.go.
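The idea can be sketched roughly as follows. This is not the actual code in pkg/sentry/syscalls/syscalls.go; `Handler`, `Emitter`, `Syscall`, and `SupportedWithEvent` are hypothetical stand-ins that only show the shape of a helper that wraps a handler so an event is written before the handler runs.

```go
package main

import "fmt"

// Handler is a hypothetical stand-in for a syscall implementation.
type Handler func(args [6]uint64) (uintptr, error)

// Emitter is a hypothetical stand-in for the sentry's events interface.
type Emitter interface {
	Emit(name string, args [6]uint64)
}

// Syscall is a hypothetical syscall table entry.
type Syscall struct {
	Name string
	Fn   Handler
}

// SupportedWithEvent builds a table entry whose handler emits an event
// and then calls the wrapped implementation unchanged.
func SupportedWithEvent(name string, fn Handler, e Emitter) Syscall {
	return Syscall{
		Name: name,
		Fn: func(args [6]uint64) (uintptr, error) {
			e.Emit(name, args)
			return fn(args)
		},
	}
}

// recorder is a toy Emitter that remembers the names it was given.
type recorder struct{ names []string }

func (r *recorder) Emit(name string, _ [6]uint64) {
	r.names = append(r.names, name)
}

func main() {
	rec := &recorder{}
	// 39 is getpid on x86-64 Linux; the handler body is a placeholder.
	table := map[uintptr]Syscall{
		39: SupportedWithEvent("getpid", func([6]uint64) (uintptr, error) {
			return 1, nil
		}, rec),
	}

	ret, err := table[39].Fn([6]uint64{})
	fmt.Println("return:", ret, "error:", err, "events:", rec.names)
}
```

Wrapping at table-construction time keeps the individual syscall implementations untouched, which is the appeal of putting the event in the helper functions.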

agamdua commented 5 years ago

Hey @ianlewis - thanks for a super quick response!

Will go through these. On a quick look, ErrorWithEvent seems like it follows a model that could work.

We'd probably have to add a function similar to EmitUnimplementedEvent in kernel.go, which doesn't look too hard. Will poke around some more.

prattmic commented 5 years ago

In addition to unimplemented syscall events, we also have events for every syscall executed, defined in https://github.com/google/gvisor/blob/master/pkg/sentry/strace/strace.proto and sent from https://github.com/google/gvisor/blob/master/pkg/sentry/strace/strace.go#L648-L650.

I don't think that runsc exposes a way to register for these events at the moment, but it shouldn't be too hard to add to runsc debug via sentryctl.

agamdua commented 5 years ago

@prattmic thanks, that saves me the work of figuring out how to emit events for every single syscall as the use case needs. also TIL about runsc debug, thanks! 👍

agamdua commented 5 years ago

@fvoznika - your pull request is super exciting to me! I'd love to contribute too if there's any code/tests/docs you'd like me to take care of!

fvoznika commented 5 years ago

The PR will let you enable/disable strace logging on the fly. You'll need to use it together with the --debug-log option to get the strace output sent to the boot log.

If you're up for it, adding documentation for runsc debug --strace would be great! Parts of runsc debug are documented here.

Next steps would be to use the strace logs to prototype your solution and drive requirements, e.g. sync/async, possible actions, different syscall filters, etc.

One possible implementation is to create a client library that connects to the RPC endpoint that is present on every sandbox. The library would receive a stream of events from the sandbox, parse them, and send them back to the caller. Pros: the monitoring app is isolated from the untrusted container. Cons: cross-process communication is expensive.

agamdua commented 5 years ago

@fvoznika thanks for the pointers! Going to try and take a shot at the docs for sure, and experiment with the new feature too.