Open Andreagit97 opened 9 hours ago
@Molter73 this is the same issue we talked about some time ago
We had a fix for this that I thought we had already upstreamed; sorry this fell through the cracks. It's pretty close to your PR though: stackrox/falcosecurity-libs#82
If I recall correctly, @erthalion looked into it and COS is compiling the kernel with clang, which has some additional safety annotations that are ignored by GCC and cause this verifier issue, which also matches your analysis.
Oh, that explains why __rcu markers are considered in COS and not on other kernels, thank you for the info!
As for the proposed fixes, they are almost identical. I avoided the extra null check since BPF_CORE_READ_INTO should do it for us: if unsafe_ptr is 0 + something, copy_from_kernel_nofault will fail because this is not a kernel address, so the output will be memset to 0, and the same happens again at every remaining iteration of BPF_CORE_READ_INTO:
static __always_inline int
bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr)
{
	int ret = -EFAULT;

	if (IS_ENABLED(CONFIG_BPF_EVENTS))
		ret = copy_from_kernel_nofault(dst, unsafe_ptr, size);
	/* On failure the destination is zeroed, so the caller never sees stale data. */
	if (unlikely(ret < 0))
		memset(dst, 0, size);
	return ret;
}
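The zero-on-failure argument can be checked with a small userspace model. This is only a sketch under stated assumptions: `mock_probe_read`, `mock_task`, `mock_cred`, `mock_read_uid` and the 4096 address cutoff are hypothetical stand-ins, not kernel or libbpf APIs, and the real `copy_from_kernel_nofault` decides validity differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the task -> cred -> uid chain. */
struct mock_cred { uint32_t usage; uint32_t uid; };
struct mock_task { struct mock_cred *cred; };

/* Hypothetical cutoff: anything below it counts as "not a kernel address". */
#define MOCK_MIN_VALID_ADDR ((uintptr_t)4096)

/* Same shape as bpf_probe_read_kernel_common: zero dst when the read fails. */
static int mock_probe_read(void *dst, size_t size, uintptr_t unsafe_addr)
{
	int ret = -EFAULT;

	if (unsafe_addr >= MOCK_MIN_VALID_ADDR) {
		memcpy(dst, (const void *)unsafe_addr, size);
		ret = 0;
	}
	if (ret < 0)
		memset(dst, 0, size);
	return ret;
}

/* Two chained reads, one per hop, with no explicit NULL check in between. */
static uint32_t mock_read_uid(const struct mock_task *task)
{
	struct mock_cred *cred;
	uint32_t uid;

	mock_probe_read(&cred, sizeof(cred),
			(uintptr_t)task + offsetof(struct mock_task, cred));
	/* If cred is NULL this address is "0 + something" and the read fails. */
	mock_probe_read(&uid, sizeof(uid),
			(uintptr_t)cred + offsetof(struct mock_cred, uid));
	return uid;
}

/* Exercises both the valid chain and the NULL-cred chain. */
static void mock_self_check(void)
{
	struct mock_cred cred = { .usage = 1, .uid = 1000 };
	struct mock_task with_cred = { .cred = &cred };
	struct mock_task without_cred = { .cred = NULL };

	assert(mock_read_uid(&with_cred) == 1000);
	assert(mock_read_uid(&without_cred) == 0);
}
```

In this model a NULL intermediate pointer simply propagates as zeros through the rest of the chain, which is the behavior the comment relies on to skip the explicit null check.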
Describe the bug
Running the modern bpf probe on 1.31.1-gke.1678000 (RAPID channel, cos-beta-117-18613-0-66), we face a verifier error.
More in detail
There is a problem when we try to access the task->cred field. Looking at the same program (capset_x) loaded on another kernel (6.8.0-45-generic #45~22.04.1-Ubuntu), we obtain a different register state: the same field cred is seen as a simple ptr on the Ubuntu kernel, while on COS it is seen as rcu_ptr_or_null_, and so we hit a different verifier branch.
The reason why COS changes this type probably lies in how the type cred is marked in the kernel BTF. For some reason, on COS we enter the RCU branch, and once the MEM_RCU flag is set we also acquire the PTR_MAYBE_NULL flag.
I will propose a possible fix for this in the short term, but we should look into why this is happening on the COS kernel and whether it is a bug or intended behavior.
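The flag interaction described above can be pictured with a toy model. This is not the verifier's code: the `MOCK_*` bit values and the `mock_*` helpers are hypothetical, and the only behavior modeled is what the issue reports, namely that a field seen as RCU-tagged ends up both MEM_RCU and PTR_MAYBE_NULL, and that a maybe-NULL pointer cannot be dereferenced directly until it has been NULL-checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values; the real flags are defined in the kernel headers. */
#define MOCK_MEM_RCU        (1u << 0)
#define MOCK_PTR_MAYBE_NULL (1u << 1)

/* Type flags a loaded field pointer receives in this toy model. */
static uint32_t mock_field_flags(bool field_has_rcu_tag)
{
	uint32_t flags = 0;

	/* Once the RCU marking applies, the pointer is also treated as maybe-NULL. */
	if (field_has_rcu_tag)
		flags |= MOCK_MEM_RCU | MOCK_PTR_MAYBE_NULL;
	return flags;
}

/* A direct dereference is accepted only when the pointer cannot be NULL. */
static bool mock_deref_allowed(uint32_t flags)
{
	return (flags & MOCK_PTR_MAYBE_NULL) == 0;
}

/* An explicit NULL check clears the maybe-NULL marking on the checked branch. */
static uint32_t mock_after_null_check(uint32_t flags)
{
	return flags & ~MOCK_PTR_MAYBE_NULL;
}
```

Under this model the Ubuntu case corresponds to `mock_field_flags(false)`, where the dereference is accepted, and the COS case to `mock_field_flags(true)`, where it is rejected unless a NULL check or a probe-read based access is used instead.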