Open dvyukov opened 7 years ago
One other idea is testing the behaviour of a new kernel against the behaviour of an old kernel.
There would need to be a way to identify or mark behaviour that is expected to change. Alternatively, we could restrict checking to the behaviour of kernel APIs that are guaranteed to be stable and should not change.
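A rough sketch of what such a comparison could look like, assuming each kernel run yields an ordered list of per-call results. The `CallResult` record, the `diff_behaviour` helper and the `expected_to_change` set are hypothetical names for illustration, not anything that exists in syzkaller:

```python
from typing import NamedTuple


class CallResult(NamedTuple):
    """Hypothetical record of one executed syscall."""
    name: str    # syscall name, e.g. "open"
    retval: int  # return value seen by the test program
    errno: int   # errno value, 0 on success


def diff_behaviour(old, new, expected_to_change=frozenset()):
    """Return (index, old_result, new_result) for each unexpected mismatch.

    Calls whose name is in expected_to_change are skipped, which models
    "behaviour that is marked as expected to change".
    Assumption: both runs executed the same program, so the lists line up.
    """
    mismatches = []
    for i, (o, n) in enumerate(zip(old, new)):
        if o.name in expected_to_change:
            continue  # explicitly allowed to differ between kernels
        if o != n:
            mismatches.append((i, o, n))
    return mismatches


old_run = [CallResult("open", 3, 0), CallResult("read", 10, 0), CallResult("ioctl", -1, 22)]
new_run = [CallResult("open", 3, 0), CallResult("read", -1, 14), CallResult("ioctl", 0, 0)]
report = diff_behaviour(old_run, new_run, expected_to_change={"ioctl"})
```

Here only the `read` mismatch would be reported, since `ioctl` is marked as allowed to change. The stable-API variant would invert the set into an allowlist of calls to check.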
There are still interesting problems related to minimizing non-determinism in the result of a randomly generated test program. Some ideas:
FWIW Dirty Pipe could have been found by file honeypots.
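A minimal sketch of a file honeypot, assuming the test process only gets read access to the file and that a separate, trusted checker verifies it after the run. The helper names `plant_honeypot` and `honeypot_intact` are made up for illustration:

```python
import hashlib
import os


def plant_honeypot(directory, content=b"honeypot: must never change\n"):
    """Create a read-only file and return its path and expected digest."""
    path = os.path.join(directory, "honeypot")
    with open(path, "wb") as f:
        f.write(content)
    os.chmod(path, 0o444)  # the test process must not be able to write it
    return path, hashlib.sha256(content).hexdigest()


def honeypot_intact(path, expected_digest):
    """Return True if the file still has the content we planted."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() == expected_digest
```

If `honeypot_intact` returns False after running a test program that had no write permission, the change had to go through some path that bypassed the permission check, which is the kind of logical bug this is meant to catch.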
Inspired by the /sys/kernel/notes story: we could scan all kernel outputs (e.g. what we read from files, receive over sockets, etc.) for potential kernel .text/.data addresses. This will likely have false positives and will need some tuning, but it may be interesting to do as an experiment just to see if it catches something real.
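As a starting point for such an experiment, here is a sketch of a scanner. It assumes an x86-64 layout where kernel text/data lives at `0xffffffff80000000` and above. The range, the errno cut-off and the 8-byte alignment are all tuning assumptions, and `find_kernel_pointers` is a hypothetical helper:

```python
import re
import struct

# Assumption: x86-64, kernel image mapped at 0xffffffff80000000 and above.
KERNEL_LO = 0xFFFFFFFF80000000
# Values in the last page look like small negative numbers (-1, -EINVAL, ...),
# so treat them as likely false positives.
ERRNO_LO = 0xFFFFFFFFFFFFF000

# Exactly 16 hex digits that are not part of a longer hex run.
HEX_RE = re.compile(rb"(?<![0-9a-fA-F])([0-9a-fA-F]{16})(?![0-9a-fA-F])")


def is_candidate(value):
    return KERNEL_LO <= value < ERRNO_LO


def find_kernel_pointers(data: bytes):
    """Return sorted candidate kernel addresses found in data."""
    hits = set()
    # Textual form, e.g. "ffffffff81234567" in a procfs/sysfs file.
    for m in HEX_RE.finditer(data):
        value = int(m.group(1), 16)
        if is_candidate(value):
            hits.add(value)
    # Binary form: aligned little-endian 64-bit words.
    for off in range(0, len(data) - 7, 8):
        (value,) = struct.unpack_from("<Q", data, off)
        if is_candidate(value):
            hits.add(value)
    return sorted(hits)
```

Scanning only aligned words keeps the false positive rate down, at the cost of missing pointers embedded at odd offsets. That trade-off is one of the things the experiment would need to tune.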
executor could detect a set of logical bugs in the kernel on top of the basic safety bugs detected by grepping console output. To the best of my knowledge, logical kernel bugs are not detected by any other automated testing system, so this could give us a whole new class of bugs. Examples of such checks:
Given the complexity of a kernel model, these checks should probably be very conservative (give up predicting the outcome when in doubt). Still, it would be interesting to see if we can detect at least some bugs with conservative checks.
A complete model is close to impossible, so we need to aggressively limit the scope initially and then extend it incrementally. This includes:
Once we get some initial working base, we can start extending it in all directions. Obviously, more syscalls. But also maybe some limited concurrency, e.g. a whitelist/blacklist of syscalls that can run concurrently (e.g. no close/write). Ultimately, it may be extremely interesting to test that 2 concurrent syscalls are atomic, i.e. the result is equal to either one syscall executed first and then the other, or vice versa (potential example bugs: 1, 2). This will probably need some blacklist too. On the other hand, it does not require a second implementation. E.g. concurrent read/write on a UDP socket should always be atomic.
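The atomicity check can be sketched as a comparison against the two sequential orders. In this illustration the operations are modelled as pure functions from a state to `(new_state, result)`. In a real setup the two sequential outcomes would presumably come from running the same calls sequentially on the same kernel. `is_atomic` and the toy UDP-like queue are hypothetical:

```python
def is_atomic(initial_state, op_a, op_b, observed):
    """Check a concurrent outcome against both sequential orders.

    observed is (result_a, result_b, final_state) from the concurrent run.
    Each op takes a state and returns (new_state, result).
    """
    # Order 1: A first, then B.
    s, ra = op_a(initial_state)
    s, rb = op_b(s)
    a_then_b = (ra, rb, s)
    # Order 2: B first, then A.
    s, rb = op_b(initial_state)
    s, ra = op_a(s)
    b_then_a = (ra, rb, s)
    return observed in (a_then_b, b_then_a)


# Toy model of a datagram socket: the state is a tuple of queued datagrams.
def write(queue):
    return queue + (b"x",), 1  # enqueue one datagram, 1 byte written


def read(queue):
    if not queue:
        return queue, None  # nothing to read (EAGAIN in a real kernel)
    return queue[1:], queue[0]
```

With an empty initial queue, a concurrent outcome where the write succeeds, the read returns nothing and the queue ends up empty matches neither order, so `is_atomic` rejects it: the datagram was lost.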
Related work on checking filesystems against a POSIX model: SibylFS: formal specification and oracle-based testing for POSIX and real-world file systems.
A related idea that may be simpler to implement is to arrange some honeypots for the test process and then check if the process is caught red-handed at these honeypots. Examples of honeypots:
Another idea comes from the paper Kit: Testing OS-level Virtualization for Functional Interference Bugs, which aims to detect "functional interference bugs in OS-virtualization mechanisms, such as Linux namespaces. The key idea of Kit is to detect inter-container functional interference by comparing the system call traces of a container across two executions, where it runs with and without the preceding execution of another container".
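A simplified sketch of this kind of trace comparison, not Kit's actual implementation. It assumes traces are lists of `(call, result)` pairs and filters non-determinism by keeping only positions that are identical across repeated baseline runs. That filtering step and all names here are assumptions for illustration:

```python
def stable_positions(baseline_runs):
    """Indices whose (call, result) pair is identical in every baseline run."""
    first = baseline_runs[0]
    return {
        i
        for i, entry in enumerate(first)
        if all(len(run) > i and run[i] == entry for run in baseline_runs)
    }


def interference(baseline_runs, after_other):
    """Return (index, baseline_entry, observed_entry) for each stable position
    that differs when the container runs after another container."""
    first = baseline_runs[0]
    return [
        (i, first[i], after_other[i])
        for i in sorted(stable_positions(baseline_runs))
        if i < len(after_other) and after_other[i] != first[i]
    ]


# Hypothetical traces: the container alone (twice) and after another container.
alone_1 = [("getpid", 1), ("clock_gettime", 100), ("read_proc_file", "a")]
alone_2 = [("getpid", 1), ("clock_gettime", 205), ("read_proc_file", "a")]
after = [("getpid", 1), ("clock_gettime", 310), ("read_proc_file", "a b")]
found = interference([alone_1, alone_2], after)
```

The `clock_gettime` position differs between the two baseline runs, so it is treated as non-deterministic and ignored. Only the `read_proc_file` difference is reported as potential interference.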