vbpf / ebpf-samples

Sample ebpf programs to analyze
91 stars 12 forks source link

Add sample illustrating issue with pointing to two stack ranges #21

Closed dthaler closed 3 years ago

dthaler commented 3 years ago

This manually illustrates the issue where this program currently fails verification. However this style of assembly is common apparently due to register spilling, and the Linux kernel verifier can verify this sort of program, so real programs in use today do this.

One way to possibly address this would be to add another per-register variable like r0.numeric_size which contains the size of space pointed to that is known to be a number. Care would need to be taken to update/havoc it for a stack type if that part of the stack is written to, or if the offset is updated. It wouldn't handle every possible case, but I believe it would handle the common case such as in the sample, and in typical cases of register spilling.

Signed-off-by: Dave Thaler dthaler@microsoft.com

elazarg commented 3 years ago

@caballa do you have any ideas?

This specific case, like others, would be solved by pushing assertions backwards. It does not solve every possible situation, but it is becoming more and more appealing to me.

caballa commented 3 years ago

I don't understand why we cannot verify this program.

The verifier should know that stack_buffer[0], stack_buffer[1], ..., stack_buffer[15] are initialized to some number originated from get_prandom_u32().

From the then-branch: ptr points to either &stack_buffer[0], ..., &stack_buffer[7] From the else-branch: ptr points to either &stack_buffer[8], ..., &stack_buffer[15]

So where is the imprecision coming from? We should know that rand32 % 8 is between [0,...7]. Is it line 28 (https://github.com/vbpf/ebpf-samples/pull/21/files#diff-a7998aaba5a95b813939c2f35c20ee96f1268336350abe4698bfd85751368ea7R28) the problem? why?

dthaler commented 3 years ago

I don't understand why we cannot verify this program.

The verifier should know that stack_buffer[0], stack_buffer[1], ..., stack_buffer[15] are initialized to some number originated from get_prandom_u32().

From the then-branch: ptr points to either &stack_buffer[0], ..., &stack_buffer[7] From the else-branch: ptr points to either &stack_buffer[8], ..., &stack_buffer[15]

We know it before the join. But the join combines stack_buffer[0] = top in one branch with stack_buffer[0] = number in the other branch, and removes it as going to top. Same for every byte in stack.

caballa commented 3 years ago

@dthaler : I got it now. Thanks!

Backward analysis would solve the problem in this case and I like it because it's systematic and it would also help with some other cases where some forward disjunctive invariant would be needed otherwise. I remember when we wrote the PLDI paper, at least one program that we couldn't prove was proven safe by running a backward analysis. However, the tricky part is when the number of instructions is large because the slowdown could be significant. Of course, we could limit the backward analysis to few basic blocks but then it wouldn't be systematic anymore and we would need to see if it's good enough for the programs people write.

Another partial solution would be to do some lookahead and predict which dereferences might be needed in the future. In our example, we could dereference ptr in each branch and then use it after. Of course, we need to invalidate the predictions as soon as some memory writes may overwritten them.

Do you know how the Linux verifier deals with cases like this? I think this is the first thing we should look at.

dthaler commented 3 years ago

@elazarg Regardless of whether the verifier passes or fails this sample, is there anything blocking merging this as another sample to test?