opencontainers / runc

CLI tool for spawning and running containers according to the OCI specification
https://www.opencontainers.org/
Apache License 2.0
11.59k stars 2.06k forks source link

Support Landlock LSM? #2859

Open h-vetinari opened 3 years ago

h-vetinari commented 3 years ago

With the discussions about some of the limitations of (lib)seccomp (#2151, #2735 etc.), especially in the context of determining which error to return for syscalls that may or may not be around at runtime, the following caught my eye.

I presume most people here are aware of the series of "landlock" patches that have been through 30 iterations over the last 5 years. The cover letter states:

Landlock is inspired by seccomp-bpf but instead of filtering syscalls and their raw arguments, a Landlock rule can restrict the use of kernel objects like file hierarchies, according to the kernel semantic.

I don't pretend to understand well enough if this could - in principle - solve some of the problems around the returned errors or other tricky bits in that regard, but it sounded intriguing - i.e. if that whole error dance could be avoided by letting landlock rather than seccomp handle things.

If landlock's approach is fundamentally unsuited to solving the problems here, I apologise. If OTOH there are only a few things missing for it to be usable, perhaps it would be a good time to reach out on LKML to see if they can be closed (or at least roadmapped) before the series is merged (looks like it's finalising).

cyphar commented 3 years ago

Landlock is trying to solve a separate problem and doesn't really help with the libseccomp issues we have -- Landlock is a new LSM (Linux Security Module) and it's probably better to compare it with AppArmor or SELinux. It does have some similarities with seccomp (it uses BPF -- though it uses eBPF while seccomp is restricted to cBPF) but aside from that it's not really comparable (and it doesn't replace seccomp's primary use-case -- removing kernel attack surface by disabling syscall entrypoints).

It should be noted that the issues we've had recently are purely related to libseccomp, not with seccomp itself. The solution we applied still uses seccomp, it just works around the limitations of libseccomp's filter generation.

h-vetinari commented 3 years ago

[...] aside from that it's not really comparable (and it doesn't replace seccomp's primary use-case -- removing kernel attack surface by disabling syscall entrypoints).

I was thinking that - if it performs as advertised - the availability of those syscall entry points would be irrelevant as long as one cannot do anything with them (because they access or manipulate kernel structures that are blocked by landlock) - i.e. getting EPERM for anything you try to do with them, unless they don't exist on that box (-> ENOSYS), which would give the "canonically" correct errors within the container, without having to go through the leaky abstraction of the syscall-profiles.

cyphar commented 3 years ago

Well, that is the argument behind LSMs in general (they restrict access to kernel objects). However that assumes there are no bugs in the LSM -- what seccomp does is that completely removes the ability to access kernel code that may be buggy (such as compat or old and unmaintained syscalls). LSMs simply cannot do that because they are applied so far into syscall execution that you've already run most of the code that might be buggy.

Seccomp doesn't solve many other problems that LSMs try to solve, but it does solve this problem that LSMs simply cannot. (And this isn't hypothetical, quite a few kernel 0days were blocked by seccomp policies because the buggy kernel code in question simply could not be reached by the container.)

h-vetinari commented 3 years ago

Well-understood, thanks for the explanation - at least I wasn't completely off with my understanding of the mechanics.

Would it make sense to consider using both though? That would also allow the user (or at least profile-maintainer) to specify a hybrid approach where e.g. seccomp is only used to filter out the "old and unmaintained" syscalls, and leaves the rest to the LSM.

cyphar commented 3 years ago

Well, landlock isn't upstream yet. But once it is I imagine we'd add support to it in the runtime-spec (just like we support SELinux and AppArmor today).

h-vetinari commented 3 years ago

Well, landlock isn't upstream yet.

FYI, it landed in -next a couple days ago, might make it into 5.13.

h-vetinari commented 3 years ago

Landlock PR for 5.13 was sent and honoured. 🥳

I guess the next step (after waiting a while for things to stabilize) is adding this to the runtime-spec?

cyphar commented 3 years ago

Yup, runtime-spec would be the next step but I'd need to look into what model we should use for supporting it (the AppArmor/SELinux "just give me the label" model is not ideal for a few reasons, but same goes for the seccomp model :frowning:).

h-vetinari commented 3 years ago

Landlock has now been released with Linux 5.13.

kailun-qin commented 2 years ago

Opened an runtime-spec proposal and added with a drafted version there.

Any review or comment is much welcome. Thanks a lot.

l0kod commented 2 years ago

About the Go support, @gnoack and I are working to move https://github.com/gnoack/golandlock: https://github.com/gnoack/golandlock/issues/10. This should not take long and I suggest to not use the current location/name but wait for the new one.

gnoack commented 2 years ago

About the Go support, @gnoack and I are working to move https://github.com/gnoack/golandlock: landlock-lsm/go-landlock#10. This should not take long and I suggest to not use the current location/name but wait for the new one.

This is done now :) The new location is: import "github.com/landlock-lsm/go-landlock/landlock".

BoardzMaster commented 2 years ago

Hi guys!!! I have drafted runc landlock version based on https://github.com/opencontainers/runtime-spec/pull/1111 commits by @kailun-qin for runtime-spec. I wonder if I can send it here for the discussion? Or @kailun-qin has already been working on runc landlock version? Please your opinion, guys!!

kailun-qin commented 2 years ago

Hi guys!!! I have drafted runc landlock version based on opencontainers/runtime-spec#1111 commits by @kailun-qin for runtime-spec. I wonder if I can send it here for the discussion? Or @kailun-qin has already been working on runc landlock version? Please your opinion, guys!!

Hi @BoardzMaster, great to hear that you're also working on it! Yes, internally we have a PoC implementation based on runc using go-landlock working as mentioned here: https://github.com/opencontainers/runtime-spec/pull/1111#issuecomment-904462878. We've been making cleanups and updating according to the go-landlock changes following the spec these days.

I'd like to propose a convergence on it if you're using the same approach so that the effort won't be wasted. WDYT? You can reach me at kailun.qin@intel.com and let me know if any thoughts.

kailun-qin commented 2 years ago

Hi @BoardzMaster, I've proposed my drafted version of PR here: https://github.com/opencontainers/runc/pull/3194. See if you have anything to add/converge or any thoughts. I would be happy to add co-authors.

Hi All, please kindly take a look when you get a chance. Thanks a lot!

P.S. There should be a follow-up patch for integration testing.

BoardzMaster commented 2 years ago

Hi @BoardzMaster, I've proposed my drafted version of PR here: #3194. See if you have anything to add/converge or any thoughts. I would be happy to add co-authors.

Hi All, please kindly take a look when you get a chance. Thanks a lot!

P.S. There should be a follow-up patch for integration testing.

Thank you so much @kailun-qin. I will check it.