llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.32k stars 12.12k forks source link

bpf: verifier fails, because clang generates a 32-bit zero extension (<<=32, >>=32) on pkt_end pointer when branching is too complex #61874

Open gentoo-root opened 1 year ago

gentoo-root commented 1 year ago

A standard primitive in network BPF programs is checking that data + expected_length <= data_end, bailing out if this condition doesn't hold. For example, revalidate_data in Cilium checks that the packet is long enough to contain L2 and L3 headers.

The issue happens when revalidate_data is called multiple times within a complex graph of branching. Attached is the minimal example of reproduction. Every part seems to be important: putting the body of the inline function lb4_extract_tuple directly into handle_ipv4 produces good bytecode, removing seemingly useless conditions that never happen (ret == TC_ACT_SHOT or ret == TC_ACT_REDIRECT) also produces good bytecode, but the whole thing together produces buggy bytecode.

Compilation and testing command line:

$ clang -O2 -g -target bpf -std=gnu99 -nostdinc -Wall -Wextra -Werror -c custom.c -o custom.o
# bpftool prog load custom.o /sys/fs/bpf/test type sk_skb
# rm -v /sys/fs/bpf/test

Verifier output:

libbpf: prog 'tail_handle_ipv4_from_netdev': BPF program load failed: Permission denied
libbpf: prog 'tail_handle_ipv4_from_netdev': -- BEGIN PROG LOAD LOG --
0: R1=ctx(off=0,imm=0) R10=fp0
; int tail_handle_ipv4_from_netdev(struct __sk_buff *ctx)
0: (bf) r6 = r1                       ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
1: (18) r0 = 0xffffff7a               ; R0_w=4294967162
; return (void *)(unsigned long)ctx->data_end;
3: (61) r2 = *(u32 *)(r6 +80)         ; R2_w=pkt_end(off=0,imm=0) R6_w=ctx(off=0,imm=0)
; return (void *)(unsigned long)ctx->data;
4: (61) r1 = *(u32 *)(r6 +76)         ; R1_w=pkt(off=0,r=0,imm=0) R6_w=ctx(off=0,imm=0)
; if (data + tot_len > data_end)
5: (bf) r3 = r1                       ; R1_w=pkt(off=0,r=0,imm=0) R3_w=pkt(off=0,r=0,imm=0)
6: (07) r3 += 34                      ; R3_w=pkt(off=34,r=0,imm=0)
; if (data + tot_len > data_end)
7: (2d) if r3 > r2 goto pc+49         ; R2_w=pkt_end(off=0,imm=0) R3_w=pkt(off=34,r=34,imm=0)
; if (!(ctx->tc_index & TC_INDEX_F_SKIP_NODEPORT)) {
8: (61) r4 = *(u32 *)(r6 +44)         ; R4_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=ctx(off=0,imm=0)
; if (!(ctx->tc_index & TC_INDEX_F_SKIP_NODEPORT)) {
9: (57) r4 &= 4                       ; R4_w=scalar(umax=4,var_off=(0x0; 0x4))
10: (bf) r3 = r1                      ; R1_w=pkt(off=0,r=34,imm=0) R3_w=pkt(off=0,r=34,imm=0)
; if (!(ctx->tc_index & TC_INDEX_F_SKIP_NODEPORT)) {
11: (55) if r4 != 0x0 goto pc+35 47: R0_w=4294967162 R1_w=pkt(off=0,r=34,imm=0) R2_w=pkt_end(off=0,imm=0) R3_w=pkt(off=0,r=34,imm=0) R4_w=scalar(umax=4,var_off=(0x0; 0x4)) R6_w=ctx(off=0,imm=0) R10=fp0
; if (data + tot_len > data_end)
47: (07) r3 += 34                     ; R3_w=pkt(off=34,r=34,imm=0)
; 
48: (2d) if r3 > r2 goto pc+8         ; R2_w=pkt_end(off=0,imm=0) R3_w=pkt(off=34,r=34,imm=0)
; return (void *)(unsigned long)ctx->data_end;
49: (67) r2 <<= 32
R2 pointer arithmetic on pkt_end prohibited
processed 45 insns (limit 1000000) max_states_per_insn 1 total_states 4 peak_states 4 mark_read 2
-- END PROG LOAD LOG --
libbpf: prog 'tail_handle_ipv4_from_netdev': failed to load: -13
libbpf: failed to load object '/home/max/projects/cilium/cilium/bpf/tests/custom.o'
Error: failed to load object file

As can be observed with Godbolt, Clang generated the following transformation on pkt and pkt_end pointers (ctx->data and ctx->data_end) for one of revalidate_data calls:

        r2 <<= 32
        r2 >>= 32
        r1 <<= 32
        r1 >>= 32

The other revalidate_data calls don't have this zero-extension sequence, but the one that produces it fails to pass the verifier.

A similar issue has been reported recently (#59908, where this pattern of bitshifts was produced under other circumstances), but my case is more severe, because these useless instructions also make the verifier reject the program.

gentoo-root commented 1 year ago

@yonghong-song, could you take a look please?