The encodings should remain exactly the same if (a) we use the same LLVM version, (b) on the same kernel commit, (c) the encodings were generated on the same architecture, and (d) the encodings use the same bitvector suffix. (d) requires some explanation, and this is what I believe the issue is, so please read below.
For every `BPF_*` instruction that we want to obtain encodings for, `generate_encodings.py` adds a unique suffix to every bitvector in the formula.
For example, in the run for #33, all the bitvectors in `a/5.10/BPF_AND.smt2` use the suffix `_3`, whereas all the bitvectors in `b/5.10/BPF_AND.smt2` use the suffix `_0`. For example, a compressed diff:

```diff
-(declare-fun cond36.i.i.i.i_3_184 () (_ BitVec 32))
+(declare-fun cond36.i.i.i.i_0_184 () (_ BitVec 32))
```

Above, the only difference is `_3` vs. `_0`.
When `generate_encodings.py` generates all the encodings together, that is, when `--specific-op` is not specified, it assigns suffixes to the ops in order, from `0` through `35` (for the 36 ops). Here, it seems `a/5.10/BPF_AND.smt2` got assigned the bitvector suffix `3`; the other ops would have been assigned their suffixes in order.
If `--specific-op` is specified, the suffix will always be `0`. It seems `b/5.10/BPF_AND.smt2` was generated using `--specific-op`, which is why the suffix is `0`.
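For concreteness, here is a minimal Python sketch of that suffix-assignment behaviour as I understand it; the function and list names are hypothetical, not the actual code in `generate_encodings.py`:

```python
# Hypothetical sketch of the suffix assignment described above; the real
# logic lives in generate_encodings.py and may differ in the details.
BPF_OPS = ["BPF_ADD", "BPF_SUB", "BPF_AND", "BPF_OR"]  # ...36 ops in total

def assign_suffixes(ops, specific_op=None):
    """Map each op to the bitvector suffix used in its .smt2 encoding."""
    if specific_op is not None:
        # With --specific-op, a single op is generated and always gets 0.
        return {specific_op: 0}
    # Without --specific-op, ops are numbered in generation order: 0..35.
    return {op: i for i, op in enumerate(ops)}
```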
So, to compare encodings via `git diff`, one approach is to generate all encodings using `--specific-op`, so that they all have the same bitvector suffix as well, i.e. `0`. Alternatively, one can generate all the encodings without `--specific-op`, but then they should be compared against encodings that were also generated without `--specific-op`.
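As a sketch, a CI job could enforce the "same mode on both sides" rule with a plain tree comparison; `git diff --no-index --exit-code` is a standard way to diff two directories, and the `a/5.10` and `b/5.10` paths below are just the ones from the example above:

```python
import subprocess

# Compare two encoding trees that were generated the same way (both with,
# or both without, --specific-op); raises on any difference.
subprocess.run(
    ["git", "diff", "--no-index", "--exit-code", "a/5.10", "b/5.10"],
    check=True,
)
```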
To give some more context, each op needs to have different suffixes for its bitvectors, because this is important in the synthesis step of `bpf_alu_jmp_synthesis.py`. The encodings of `BPF_AND` and `BPF_SUB` could both contain an LLVM variable called `cond36`, but if both are converted to a bitvector named `cond36`, that would be a problem in a formula that contains the encodings of both `BPF_AND` and `BPF_SUB` (which is the case in synthesis). So we need this suffix to disambiguate the bitvectors across the encodings of all the ops for a particular kernel version.
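To illustrate the clash with a toy example (using z3's Python bindings; the variable names are just the ones from the diff above, and the suffix values are illustrative):

```python
from z3 import BitVec

# Without suffixes, both ops' encodings would refer to the SAME solver
# variable, silently coupling two independent formulas:
cond36_in_and = BitVec("cond36", 32)  # from BPF_AND's encoding
cond36_in_sub = BitVec("cond36", 32)  # from BPF_SUB's encoding: same var!
assert cond36_in_and.eq(cond36_in_sub)

# With per-op suffixes, the bitvectors stay distinct:
cond36_and = BitVec("cond36_2", 32)
cond36_sub = BitVec("cond36_5", 32)
assert not cond36_and.eq(cond36_sub)
```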
Ahah! Thanks for the great explanation! I will try to regenerate them with `--specific-op`.
> To give some more context, each op needs to have different suffixes for its bitvectors, because this is important in the synthesis step of `bpf_alu_jmp_synthesis.py`. The encodings of `BPF_AND` and `BPF_SUB` could both contain an LLVM variable called `cond36`, but if both are converted to a bitvector named `cond36`, that would be a problem in a formula that contains the encodings of both `BPF_AND` and `BPF_SUB` (which is the case in synthesis). So we need this suffix to disambiguate the bitvectors across the encodings of all the ops for a particular kernel version.
So that also means that if I generate with `--specific-op`, then I can't run the synthesis (assuming it gets to that point) on multiple ops at once? I'm a bit surprised the encodings of several ops may be mixed in a single formula; I thought each op was verified to completion independently (which allowed us to parallelize).
> So that also means that if I generate with `--specific-op`, then I can't run the synthesis (assuming it gets to that point) on multiple ops at once?
Yes, that's correct. Any formula that uses multiple op encodings needs to make sure that the bitvectors in the ops themselves are unique; that was the reason for adding the suffix in the first place. Synthesis (`BPF_*; BPF_*`) and SRO verification (`BPF_SYNC; BPF_*`) use multiple ops, so these steps need `BPF_*` ops with unique bitvector suffixes.
To clarify, the `generate_encodings.py` step is the one that adds the bitvector suffixes to the encodings (i.e. `BPF_OP.smt2`) of the operators. After the `generate_encodings` step without `--specific-op`, all encodings should be available with unique suffixes for each `BPF_OP`. Then, `bpf_alu_jmp_synthesis.py` can indeed be run in parallel.
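For example, here is a rough sketch of what combining two ops' encodings in one solver query looks like (paths illustrative); it only works because the suffixed names don't collide:

```python
from z3 import Solver, parse_smt2_file

# Sketch: a multi-op formula of the kind synthesis builds. Because each
# encoding's bitvectors carry a per-op suffix, the assertions from the two
# files can be combined without their variables aliasing each other.
s = Solver()
for path in ["5.10/BPF_AND.smt2", "5.10/BPF_SUB.smt2"]:
    s.add(parse_smt2_file(path))
print(s.check())
```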
I'm thinking it will just be better to assign a fixed suffix to each operator, for clarity. That is, `BPF_AND` will always use a suffix of `0`, `BPF_OR` of `1`, and so on. The suffixes will be the same whether we use `--specific-op` or not.
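A minimal sketch of that fixed mapping, assuming a canonical op order (the order itself is illustrative):

```python
# Hypothetical fixed mapping: the suffix depends only on the op, never on
# whether --specific-op was passed.
CANONICAL_OPS = ["BPF_AND", "BPF_OR", "BPF_XOR"]  # ...extended to all 36 ops

def suffix_for(op: str) -> int:
    return CANONICAL_OPS.index(op)

assert suffix_for("BPF_AND") == 0 and suffix_for("BPF_OR") == 1
```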
It works now: https://github.com/bpfverif/agni/actions/runs/10455258443/job/28949794898?pr=33 :rocket:
I'll clean up the pull request and we should soon have a check for the encodings in CI. After that, I can work on the check for the BPF verification itself, but that's easier :smiley:
Our current CI workflows for LLVM-to-SMT and bpf-verification run the scripts, but they don't actually check the results beyond the return code. Ideally, we should check the results to detect regressions other than "the script errors out".
A while ago we discussed several ideas to do this for LLVM-to-SMT. IIRC, one idea was to use the solver to run sanity checks on the encodings (?). Another idea was to simply compare a set of reference encodings with the generated encodings using `git diff`.

That second idea worked on the assumption that the encodings would remain exactly the same if generated on the same arch. So, for example, to check encodings on GitHub, we would need the reference encodings to be generated on GitHub as well. I tried this and, unfortunately, the assumption doesn't hold. I used https://github.com/bpfverif/agni/pull/34 to generate all the encodings I needed on GitHub and https://github.com/bpfverif/agni/pull/33 to compare that reference with newly-generated encodings. The two sets look completely different to `git diff` :disappointed:

So it looks like we'll need a new idea to handle this in the long term.
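If we revisit the first idea, a solver sanity check could start very small: assert each generated encoding and check that it parses and is satisfiable. A minimal sketch with z3's Python bindings; the pass/fail criterion here is an assumption on my part, not a worked-out design:

```python
from z3 import Solver, parse_smt2_file, sat

def sanity_check(path: str) -> bool:
    # At minimum, a generated encoding should parse and be satisfiable;
    # a malformed or unsat encoding would indicate a regression.
    s = Solver()
    s.add(parse_smt2_file(path))
    return s.check() == sat
```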