seccomp / libseccomp

The main libseccomp repository
GNU Lesser General Public License v2.1
791 stars 170 forks source link

RFE: add support for comparisons against 32-bit arguments #383

Open jhenstridge opened 2 years ago

jhenstridge commented 2 years ago

This issue came up while investigating a problem in the seccomp filters generated by snapd using libseccomp. We had a filter set up to allow calling the copy_file_range syscall provided that the sixth argument was 0. This was done through the Go binding with a condition equivalent to seccomp.MakeCondition(5, seccomp.CompareEqual, 0). The filter worked fine for some programs using the syscall, but failed for others.

Running the failing programs under strace showed that they were passing 0 as the flags argument, and the kernel wasn't complaining about the calls when not run under the filter (it will currently return EINVAL for any value of flags != 0). The underlying cause is that the sixth argument is an unsigned int, which is 32-bits wide on x86-64 systems, so the top half of the register used to pass the argument is unused, and may contain garbage data from whatever the register was previously being used for. This garbage data is seen by seccomp, so the filter program needs to know to ignore the high 32 bits of that argument.

I was able to work around this in https://github.com/snapcore/snapd/pull/11760 by converting the condition to seccomp.MakeCondition(5, seccomp.CompareMaskedEqual, 0xFFFFFFFF, 0), but would have been out of luck if I was using any of the other comparison operators. And while this worked, it generates a less efficient program: it's loading all 8 bytes of the argument, apply a mask, and perform 2 comparisons. Instead, it could load just 4 bytes and perform a single comparison.

One possible way to implement this in an ABI compatible fashion would be to use some high bits of enum scmp_compare to indicate that a 32-bit comparison is being requested. This would require the high half of struct scmp_arg_cmp's datum_a and datum_b fields to be clear, and to only perform the comparison against the low half of the chosen argument.

The same approach could maybe be used to add support for signed comparisons (mostly an issue for the less than/greater than comparisons).

pcmoore commented 2 years ago

Huh, that's fun. I wonder if this is specific to golang? The only reason I ask is that I'm guessing we would have seen more failures related to random junk in the high 32-bits of syscall args if this issue was more widespread ... or not.

Any chance you've played with this at all outside of golang?

pcmoore commented 2 years ago

I'm searching around a little on this right now, and it seems like the general guidance around syscalls is to zero-extend 32-bit values into 64-bit registers (solving the high 32-bits garbage problem), but I haven't yet found a spec/doc that states this explicitly for the Linux syscall ABI on x86_64.

I'm growing increasingly suspicious that golang may not be doing the right thing with respect to 32-bit syscall args.

rusty-snake commented 2 years ago

Related: https://www.exploit-db.com/exploits/46594 / https://nvd.nist.gov/vuln/detail/CVE-2019-7303

jhenstridge commented 2 years ago

We were building the seccomp profiles with Go. The programs failing under the filter were not in Go. This forum thread covers a few people reporting the bug:

https://forum.snapcraft.io/t/cannot-write-in-home-directory/24436?u=jamesh

Some of the reports were NodeJS and Electron apps. I'd also managed to reproduce the failure using this short Python program using ctypes to invoke syscall() directly: https://paste.ubuntu.com/p/9XK8sft3Dc/ (stripped down from the Proton launcher script). With some versions of Python I could reliably trigger the failure while others would always pass, which sounds like it is related to garbage at the top of the register.

jhenstridge commented 2 years ago

From the look of it, the NodeJS code is doing essentially the same thing as the Python code I found, and invoking syscall() directly.

https://github.com/nodejs/node/blob/e46c680bf2b211bbd52cf959ca17ee98c7f657f5/deps/uv/src/unix/linux-syscalls.c#L223-L242

As syscall() is a varargs function, it doesn't know the types of the arguments passed to it, so it has no way to know whether the a 32-bit or 64-bit argument has been passed in the register, so passes it as is to the kernel. And as the C calling conventions don't require the caller to zero out the rest of the register when passing 32-bit args, you can see garbage.

pcmoore commented 2 years ago

Thanks for the additional info, in your original problem report you only mentioned golang which raised some suspicion.

I'll go ahead and add this to the v2.6.0 milestone, patches/PRs are always welcome!

jhenstridge commented 2 years ago

I think my hypothesis about where the garbage data in the syscall argument is coming from was slightly wrong: 32-bit loads into registers do in fact zero out the high 32-bits. The sixth argument to the system call is the seventh argument to the syscall() function (since the first is the syscall number). This overflows the registers used for function calls, so gets pushed onto the stack. So that is likely where the garbage data comes from.

Either way, I think it is still useful to be able to perform 32-bit comparisons simply so that they only rely on the data the kernel actually cares about.