Closed pchaigno closed 3 months ago
@pchaigno Do we know where this clang version requirement is enforced? Can this requirement be perhaps be overriden (somewhere in the kernel code) for the sake of CI?
The version requirement is added with the "Bump the minimum supported version of LLVM to 13.0.1" series since v6.9.
Seems like the main rational is to address the issue where LLVM generate a libcall to __muloti4
for __builtin_mul_overflow()
, which doesn't sound like something that affects Agni AFAICT. Maybe have these commit reverted in CI?
9c1b86f8ce04
kbuild: raise the minimum supported version of LLVM to 13.0.1c6d9a4a93725
Makefile: drop warn-stack-size plugin opt22d3da073f33
x86: drop stack-alignment plugin opt19336376bda6
ARM: remove Thumb2 __builtin_thread_pointer workaround for Clang634e4ff9ffd8
arm64: Kconfig: clean up tautological LLVM version checks9a12e9a165b3
powerpc: Kconfig: remove tautology in CONFIG_COMPATde5f3984664e
riscv: remove MCOUNT_NAME workarounda38d97181271
riscv: Kconfig: remove version dependency from CONFIG_CLANG_SUPPORTS_DYNAMIC_FTRACE7d354f49b8d6
fortify: drop Clang version check for 12.0.1 or newer9feceff1d2d6
lib/Kconfig.debug: update Clang version check in CONFIG_KCOVe5efd80a9a76
compiler-clang.h: update __diag_clang() macros for minimum version bumpCommit 4-8 could be ignored if only x86_64 is used, and patch 9 could be ignored as well since LLVM 12 is being used here.
Thanks @shunghsiyu!
The LLVM version check can likely be bypassed, but I'd much prefer supporting newer LLVM versions. It won't look great going upstream if the first thing we need to do is revert a bunch of kennel commits 😞
I started working on supporting LLVM 18, but jumping straight to that version may be a bit challenging.
Thanks @shunghsiyu!
It won't look great going upstream
@pchaigno Got it.
I started working on supporting LLVM 18
Since LLVM doesn't promise backward compatibility across versions, supporting newest versions (e.g. 18) will be difficult. LLVM 15 for instance had a major change that changed all pointers (typed pointers were replaced opaque pointers). I feel starting with trying to support LLVM 13 will be easier.
Since LLVM doesn't promise backward compatibility across versions, supporting newest versions (e.g. 18) will be difficult. LLVM 15 for instance had a major change that changed all pointers (typed pointers were replaced opaque pointers).
Yep, that's exactly the issue I hit when trying to support LLVM 18.
I switched to LLVM 14 as a target instead. https://github.com/bpfverif/agni/pull/29 implements that and a couple fixes needed. I left it in draft because it's still failing on v6.7 for BPF_LSH_32 and BPF_ARSH_32, so I may need some help. The error is:
[handleStoreToPhiPtr] phiArgBBI: if.else.i.i1070.i
terminate called after throwing an instance of 'std::out_of_range'
what(): _Map_base::at
Thank you for #29 @pchaigno!
I tried using llvm-14, and encountered the same failures. I have a temporary fix below.
run_llvm_passes.py was updated in d52bff5 to use Oz passes (that optimize for code size), instead of O1 passes. This worked in llvm-12, but not anymore when updating to llvm-14. Ostensibly, the passes bundled in "Oz" have changed in llvm-14.
The following is a patch that reverts back to using O1.
0001-llvm-to-smt-Revert-back-to-using-O1-passes.patch
I checked that this was able to generate all encodings for 5.9 and 6.7. Notably, this will increase code size (both llvm and consequently, smt), and also llvm-to-smt runtime. However, the smt encodings are still correct. And with modular encodings, this smt code size might be less of an issue.
This can be a temporary "fix" until I am able to figure out what exactly in llvm-14 fails when using Oz.
Confirming that the patch from above (i.e. reverting back to O1 builds) also works on 6.9. This needs another small patch in the verifier (snippet below).
The generate_encodings.py
script "fixed" the function __mark_reg_unknown
by deleting memset
instructions which llvm-to-smt does not handle. In the newer kernels, __mark_reg_unknown
further calls __mark_reg_imprecise
, where the actual memset happens. So the memset delete logic in generate_encodings.py
fails. This patch essentially inlines __mark_reg_imprecise
into __mark_reg_unknown
so that generate_encodings.py
is able to handle it.
0001-Remove-mark_reg_unknown_imprecise.patch
Note that there are many ways to address this (some noted below), but a patch is the quickest way.
memset
in llvm-to-smt in a special way, ~by setting the bpf_reg_state fields to 0~. memset
instructions that are descendants of __mark_reg_unknown
. We only care about tnum/u/s/min/max ~which are (and should be always) overwritten in mark_reg_unbounded
anyway~ which are updated by mark_reg_unbounded
anyway.Also, noting that commit 67420501e8 introduced a backward-edge in the verifer. This was addressed in 4c2a26fc80. So any kernels between 67420501e8 and 4c2a26fc80 should be patched if we llvm-to-smt to handle them.
Many thanks! I've pushed your commit and a patch to "fix" __mark_reg_unknown_imprecise
to https://github.com/bpfverif/agni/pull/29. I'm however still getting the following error when I run it against v6.9:
[handleLoadFromGEPPtrDerivedFromPhi] GEPMapIndices: [ 6 ]
[handleLoadFromGEPPtrDerivedFromPhi] phiConditionBoolI: sw.bb436.i.i_again.i.i_25_110
terminate called after throwing an instance of 'std::out_of_range'
what(): _Map_base::at
It does work against v6.7 (without the last commit, which I haven't generalized to all kernel versions yet).
Also, noting that commit 67420501e8 introduced a backward-edge in the verifer. This was addressed in 4c2a26fc80. So any kernels between 67420501e8 and 4c2a26fc80 should be patched if we llvm-to-smt to handle them.
:thinking: v6.9 is between 67420501e8 (v6.8-rc1~131^2~289^2~6^2~12
) and 4c2a26fc80 (v6.10-rc1~153^2~432^2~8
). Shouldn't it fail?
Note that there are many ways to address this (some noted below), but a patch is the quickest way.
Can you remind me the issue with memset? Is it that we need to have per-field accesses because the SMT are then written against specific field?
🤔 v6.9 is between 67420501e8 (v6.8-rc1131^2289^26^212) and 4c2a26fc80 (v6.10-rc1153^2432^2~8). Shouldn't it fail?
That's correct! It should fail, unless the kernel is patched with 4c2a26fc80. I patched the 6.9 kernel before running llvm-to-smt, and it succeeded in generating all encodings locally.
Can you remind me the issue with memset?
If there are memsets in the verifier.c code, such as the one in __mark_reg_unknown:
memset(reg, 0, offsetof(struct bpf_reg_state, var_off));
clang emits an llvm.memset intrinsic in the llvm code. Typically this shows up as a bitcast to an i8* followed by a memset in llvm IR:
%i.i.i2.i = bitcast %struct.bpf_reg_state* %dst_reg to i8*
call void @llvm.memset.p0i8.i64(i8* %i.i.i2.i, i8 0, i64 24, i1 false) #2
We do not support this construct (bitcast, memset) in llvm-to-smt yet: it should error out with the message:
[handleCallInst] bitcast instruction not supported
As such, the above memset clears everything until the var_off field in a bpf_reg_state. So we do not care about it, we only care about tnum, u/s/32/min/max. So the approach has been to "comment out" this memset from verifier.c before running the LLVMToSMT pass.
🤔 v6.9 is between 67420501e8 (v6.8-rc1131^2289^26^212) and 4c2a26fc80 (v6.10-rc1153^2432^2~8). Shouldn't it fail?
That's correct! It should fail, unless the kernel is patched with 4c2a26fc80. I patched the 6.9 kernel before running llvm-to-smt, and it succeeded in generating all encodings locally.
I was able to generate the encodings (not modular) for v6.8, v6.9, and bpf-next (b3470da314fd
), so I think we're looking good. Once https://github.com/bpfverif/agni/pull/29 is merged, I'll add encoding generation for bpf-next and linus to the CI.
It looks like we will need to support at least LLVM 13.0.1 to be able to run Agni against the latest Linux (here, Linus's tree).
Trying to run that make command:
Full run can be seen at https://github.com/pchaigno/agni/actions/runs/8772758147/job/24072064645?pr=3.