lkrg-org / lkrg

Linux Kernel Runtime Guard
https://lkrg.org
Other
404 stars 72 forks source link

[Seccomp] Switch to refcount logic for kernels >= 5.9 #346

Open Adam-pi3 opened 4 days ago

Adam-pi3 commented 4 days ago

Starting from kernel 5.9+ function 'put_seccomp_filter' has been inlined and unavailable for hooking. However, internal not-inline function was used to mitigate the problem. Unfortunately, there is no equivalent counter-part function for new hook and the old one looks incompatible which we overlooked. This patch is switching the logic for kernels 5.9+ to custom implementation of refcount logic and it should address the issue reported as #338

How Has This Been Tested?

Verified on AlmaLinux 8.8 (4.18.0-477.27.2.el8_8.x86_64) and Ubuntu 22.04.4 LTS (6.3.0-060300-generic). Likely more comprehensive tests are needed.

solardiz commented 4 days ago

As seen from our CI builds, somehow this is trying (and failing) to use refcount_t on CentOS 7, I don't immediately see why.

solardiz commented 4 days ago

this is trying (and failing) to use refcount_t on CentOS 7, I don't immediately see why.

Oh, I do see now: we're declaring the struct type with such fields unconditionally. We should wrap this in compile-time checks like we do for the usage. Also, let's not include the users field, which we don't use and which varies across kernels versions.

Adam-pi3 commented 3 days ago

We should wrap this in compile-time checks like we do for the usage

What type of wrapping are you thinking? We are using refcount API and expecting that refcount_t is available. If that's not the case, should we use atomic?

Adam-pi3 commented 3 days ago

The reason I'm asking since atomic instead of refcount are in a very old kernels. If something is < 5.9 it should use normally exposed API, only new kernel (which guarantees to have refcount) is using new logic. It doesn't make sense that compilation is failing for kernels 3.10.xxx

Adam-pi3 commented 3 days ago

Oh now I see, wrapping the definition around the kernel version :)

solardiz commented 3 days ago

Oh, also this PR should be just one commit, not several. So next time you update, please amend the first commit and force-push. You can easily squash the second commit into the first one by marking it a fixup in git rebase -i.

solardiz commented 3 days ago

Adam, as I wrote in a review comment here:

Another option is for us to fix only the get/put inconsistency bug, and not the symbol lookup failures our users are occasionally seeing - if so, we could pair inc with __put_seccomp_filter.

I think we should do just that in this PR, which would fix #338 and only it.

Let's not try to remove the dependency on this symbol at this same time. We can approach that as a separate task/PR.