Open Alan-Jowett opened 3 years ago
I believe this can (and should) be done in a generic fashion that would also support things like:
and any other such pairings that may exist now or in the future.
So the generic invariant to maintain would be, for helper IDs X, Y, and Z:
Only programs that meet such invariants are safe.
The spin lock case would have X = bpf_spin_lock, Y = bpf_spin_unlock, Z = 0 The ringbuf case would have X = bpf_ringbuf_reserve, Y = bpf_ringbuf_submit, Z = bpf_ringbuf_discard
So when a platform provides helpers, we would need a way of looking up the pairings. For example, it could be a platform function like we do for is_helper_usable()
.
And we'd need a way to maintain in the ebpf domain the number of calls (if non-zero) to any of those IDs.
In some cases it might be possible for the runtime to do all these checks every time an eBPF program returns from a hook call, but doing that on every call would harm performance, whereas having the verifier do it is both more performant and matches what Linux does as noted in the man pages for the helpers mentioned above.
I don't think this invariant is sufficient. For example, the following program is illegal but (IIUC) the invariant holds:
acquire(A)
acquire(A)
release(B)
release(B)
exit
Also, even if we track that nothing is acquired more than once, we also need to track everything that is acquired; in a naïve implementation, join operation may forget what is tracked. In particular, the TOP state should represent "everything is acquired", and upon exit we need to somehow enumerate "everything" and make sure it is released.
The lock case seems simple in that respect, since only one lock is allowed. I'm not sure about the ringbuf case.
Correction: the TOP state should represent irrecoverable state. Maybe something like "everything is released infinitely many times".
I think the spin_lock case of only-one acquire is easily handled by the runtime, especially given the man page says:
The BPF program can take ONE lock at a time, since taking two or more could cause dead locks.
So I think the verifier need not be enlightened with that sort of knowledge, unless we really wan to add it to the 'strict' checks.
The ringbuffer case has no such note in the man page so I think the sequence you show would be legal there.
I think this requires backward iteration, not the forward iteration we use (so the state starts with "nothing will be released" and we collect information about what is certain to be released). Joins (at branches, since we're going backwards) are intersections of what we know to be released in each path.
@caballa do you have insights on this?
I was thinking it could be done with simple precondition/postcondition invariants like we use now for other things. Not sure why that wouldn't work with existing iteration.
If I replace "# calls" in my proposed invariant above with "# unmatched calls", i.e., reset to 0 on equality of the invariant, then any (forward direction) joins can be made to fail verification unless both branches have exactly the same invariant values.
I don't say your proposal is impossible, but I think it is somewhat outside of how abstract interpretation usually works. joins do not usually contain assertions (which is roughly like jumping to BOT - and joins are supposed to go towards TOP, not bottom). So it might be possible to do it that way, but it's far from obvious to me what the implications are.
I think it can be made equivalent to backwards-must analysis when the domain is finite, if we can treat bpf_ringbuf_reserve as possible only on a predefined number of values.
I think you are talking about typestate verification. Properties such as lock/unlock, open file/close file fits into that. You can think of a typestate property as a property that can be modeled as a finite automata where states are opened
, closed
, acquired
, etc and then transitions happen when system calls such as acquire
, open
, etc
There is a well known model-checker called SLAM (developed by MSR) that focuses on this kind of properties. The model-checker was specially useful to prove these properties in Windows device drivers. See e.g., http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.376.691&rep=rep1&type=pdf. This article is very high-level but there are many more technical papers.
There are two ways to do that in Prevail. One is by doing an instrumentation where the typestate properties are reduced to counters so we can use Prevails as it's. This is what SLAM does. The other is to have an abstract domain for finite lattices. In Crab, we have that (see e.g., https://github.com/seahorn/crab/blob/master/include/crab/domains/discrete_domains.hpp). I don't think you need to be backwards for lock/unlock and similar properties because we need to reason about the past and not the future
An interesting paper to read is this
H. Aït-Kaci, R. Boyer, P. Lincoln, R. Nasr. Efficient implementation of lattice operations. In ACM Transactions on Programming Languages and Systems (TOPLAS), Volume 11, Issue 1, Jan. 1989, pages 115-146.
This paper shows show we can encode finite lattices into 0-1 matrices so that then operations such as join/meet, etc are very efficient. I didn't implement it in Crab because I didn't have the need to be that efficient but this is for instance, what Sparta (Facebook static analyzer) uses the last time I checked
I think there's an issue with tracking the result of bpf_ringbuf_reserve()
: it may return NULL, and the buffer is only acquired if it is not NULL, so simply using another set domain tracking acquired resources (e.g. by allocation site, disallowing allocating from the same site be freeing) will not work. A tri-state version using an intermediate maybe allocated
state might work, but I am not sure about the details.
The property is still a typestate property. It just requires to be combined with an abstract domain that can reason about nullity.
BPF_MAP_TYPE_RINGBUF exposes the following helper functions: 1) bpf_ringbuf_reserve 2) bpf_ringbuf_commit 3) bpf_ringbuf_discard
The semantics are that bpf_ringbuf_reserve reserves a record in the ring buffer and the BPF program is required to call either bpf_ringbuf_commit or bpf_ringbuf_discard on the record prior to the end of the BPF program.
There is a workaround that can be employed to detect leaked reserved records, but it's expensive and slow, so it would be preferred if the verifier could catch it.