giuseppe / easyseccomp

DSL language to write seccomp filters
GNU General Public License v2.0

easyseccomp: support $syscall in KERNEL(VERSION) #2

Closed giuseppe closed 3 years ago

giuseppe commented 3 years ago

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>

giuseppe commented 3 years ago

@cyphar I've added a hacky implementation of kernel-version. The script that generates the data is quite raw, and I'll drop it once this information is available in libseccomp: https://github.com/giuseppe/easyseccomp/pull/2/commits/05a321a2812a18442120a0d1abab936dec000a0e#diff-b5486e1d29999479e9118e2750310f547832d463f86aca4bb427bd15babaefbd

I'll also drop the generated files before merging; they are included only to show what they look like.

With this feature it is possible to do something like:

.... other rules...

$syscall in KERNEL(5.3) => ERRNO(EPERM);
=> ERRNO(ENOSYS);
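
The two trailing rules above can be read as a fallback: a syscall that already existed in kernel 5.3 gets EPERM, and anything newer or unknown gets ENOSYS. Here is a minimal Python sketch of that decision. The table `FIRST_SEEN` and the helpers `in_kernel` and `fallback` are hypothetical names with a few illustrative entries; they are not the generated data or the code from this PR.

```python
import errno

# Hypothetical table: syscall name -> first kernel (major, minor) that has it.
# Illustrative entries only, not the data generated by the PR's script.
FIRST_SEEN = {
    "read": (1, 0),
    "openat": (2, 6),
    "clone3": (5, 3),
    "openat2": (5, 6),
}


def in_kernel(syscall, version):
    """True if the syscall is known to exist in `version` or an earlier kernel."""
    first = FIRST_SEEN.get(syscall)
    return first is not None and first <= version


def fallback(syscall, version=(5, 3)):
    """Mimic the two trailing rules: known syscalls get EPERM, the rest ENOSYS."""
    if in_kernel(syscall, version):
        return errno.EPERM
    return errno.ENOSYS
```

With this table, `fallback("clone3")` yields `errno.EPERM`, while `fallback("openat2")` yields `errno.ENOSYS` because the syscall is newer than the 5.3 cutoff.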

What do you think?

cyphar commented 3 years ago

Okay, so $syscall in KERNEL(X.Y) lets you generate syscall-vintage-based -EPERM rules. That is quite nice, and because easyseccomp is written more procedurally you can actually implement it as a proper fallback rule rather than front-loading it like I did in runc. However (purely from the spec side of things) I am a little worried that if we add runtime-spec support for custom BPF filters, we won't be able to entirely trust them to handle -ENOSYS correctly.

(Sorry I didn't look at this before you merged -- but yeah, it seems like a reasonable way of doing it, and it is nice that it is explicit, though I think that the maximum kernel version can still be implicitly determined in most cases.)

cyphar commented 3 years ago

But I might take a look at the scripts you used to generate kernel version information so we can include that information in libseccomp...

giuseppe commented 3 years ago

> But I might take a look at the scripts you used to generate kernel version information so we can include that information in libseccomp...

they are not the nicest :-)

The arch is completely ignored: a syscall is considered present in a kernel version if it is defined for any arch. That is probably fine for what the $syscall in KERNEL(...) functionality is supposed to solve. Also, syscalls are still looked up and filtered with libseccomp. This part can probably be polished to use the syscalls in the last kernel version.
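
To illustrate the "defined for any arch" rule, here is a rough Python sketch that collapses per-arch tables into a single table keyed by syscall name, keeping the earliest version seen on any arch. The function `merge_arches` and the sample data in `PER_ARCH` are hypothetical; the actual generator script may work quite differently.

```python
def merge_arches(per_arch):
    """Collapse {arch: {syscall: (major, minor)}} into
    {syscall: earliest version in which any arch defines it}."""
    merged = {}
    for table in per_arch.values():
        for name, version in table.items():
            if name not in merged or version < merged[name]:
                merged[name] = version
    return merged


# Hypothetical sample input, for illustration only.
PER_ARCH = {
    "x86_64": {"read": (1, 0), "arch_prctl": (2, 6)},
    "aarch64": {"read": (3, 7), "clone3": (5, 3)},
}

MERGED = merge_arches(PER_ARCH)
```

One consequence of this simplification is that a syscall such as `arch_prctl` in the sample data counts as present even when filtering for an arch that never defines it.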

The current containers/common seccomp policy, written in this format, looks like: https://github.com/giuseppe/easyseccomp/blob/main/contrib/default-policy.easyseccomp

I've also changed the crun annotation implementation to use a base64-embedded BPF filter as you suggested, which is easier both for the OCI runtime to handle and to integrate into the existing tools: https://github.com/giuseppe/libpod/tree/easyseccomp
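
For readers unfamiliar with the approach, here is a small Python sketch of what embedding a BPF filter as base64 could look like. It packs instructions in the `struct sock_filter` layout from `<linux/filter.h>` (u16 code, u8 jt, u8 jf, u32 k) and encodes the bytes as text. The helpers `encode_filter` and `decode_filter` and the annotation key are hypothetical placeholders, not the names crun or libpod use.

```python
import base64
import struct

# struct sock_filter: u16 code, u8 jt, u8 jf, u32 k (8 bytes, native byte order).
SOCK_FILTER = struct.Struct("=HBBI")

BPF_RET_K = 0x06                # BPF_RET | BPF_K
SECCOMP_RET_ALLOW = 0x7FFF0000  # seccomp "allow" return value


def encode_filter(instructions):
    """Pack (code, jt, jf, k) tuples and return them as base64 text."""
    raw = b"".join(SOCK_FILTER.pack(*ins) for ins in instructions)
    return base64.b64encode(raw).decode("ascii")


def decode_filter(text):
    """Reverse encode_filter, rejecting data that is not whole instructions."""
    raw = base64.b64decode(text)
    if len(raw) % SOCK_FILTER.size:
        raise ValueError("filter length is not a multiple of 8 bytes")
    return [SOCK_FILTER.unpack_from(raw, off)
            for off in range(0, len(raw), SOCK_FILTER.size)]


# A trivial one-instruction program that allows everything.
PROGRAM = [(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW)]

# Hypothetical annotation key, used here only as a placeholder.
ANNOTATIONS = {"example.invalid/seccomp-bpf": encode_filter(PROGRAM)}
```

Carrying the filter as a single string keeps it inside the container configuration itself, so tools that only pass annotations through do not need to manage a separate file.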