goblint / bench

The benchmark suite
4 stars 5 forks source link

CPALockator's Linux device driver data race detection benchmarks #3

Open sim642 opened 3 years ago

sim642 commented 3 years ago

I recently read the following paper about CPALockator: Analysis of Correct Synchronization of Operating System Components. Basically CPALockator is a CPAchecker fork/branch/something that also does thread-modular analysis.

In the evaluation section, they mention having constructed a set of 425 data race detection benchmarks from Linux 4.2.6 device drivers. There's no mention of the availability of this benchmark set, but I managed to find this, which seems like it's the same set: https://gitlab.com/sosy-lab/software/ldv-benchmarks/-/tree/master/linux-4.2.6-races.

I wasn't aware of this earlier and I'm not sure why this didn't get added to sv-benchmarks for data-race. Nevertheless, this might be a useful set of benchmarks for us to look at because:

  1. There's 425 of them.
  2. They're from the Linux kernel (so should be sufficiently large).
  3. Since both CPALockator and Goblint are thread-modular, there's hope that we can also analyze them successfully.

The same repository also seems to have some data-race benchmarks for Linux 4.18: https://gitlab.com/sosy-lab/software/ldv-benchmarks/-/tree/master/linux-4.18-races. And a handful of commit-based data-race benchmarks in Linux (although most don't have before vs after like the README claims): https://gitlab.com/sosy-lab/software/ldv-benchmarks/-/tree/main/ldv-commits-races.

Some of the benchmarks are in Unknowns subdirectories but still contain task definitions with expected verdicts (not sure based on what). These should probably be excluded to be on the safe side.

sim642 commented 2 years ago

In https://github.com/goblint/analyzer/pull/618 I realized that these benchmarks have some problems and not immediately usable.

LDV-specific model functions

A README says:

Thread creation is modeled by a function pthread_create with default signature, which creates a single thread, or pthread_create_N, which creates several instances at once. Locking is organized by functions ldv_mutex_model_lock/ldv_mutex_model_unlock and ldv_spin_model_lock/ldv_spin_model_unlock.

The benchmarks don't define these LDV locking/unlocking functions themselves in terms of pthread, but rather assume the analyzer itself supports them. Similarly they use pthread_create_N (and a corresponding pthread_join_N) which are nonstandard and must be modeled as non-unique creations.

CPAchecker

The benchmarks are originally for CPAlockator, which is based on CPAchecker, which has a related configuration file: https://github.com/sosy-lab/cpachecker/blob/trunk/config/includes/lockator/linux.properties. As seen there, it adds the above mentioned locking functions (and a few which weren't mentioned) for special lock handling.

Regarding the special thread creation, that is hard-coded into CPAchecker itself: https://github.com/sosy-lab/cpachecker/blob/2cfbe22b515fe9fedbd7629c62e9b93ddeda0a9a/src/org/sosy_lab/cpachecker/cfa/postprocessing/function/ThreadCreateTransformer.java#L68-L73.

Moreover, it defines a bunch of skippedvariables for other LDV-specific functions that also don't have definitions.

Klever

Somehow Klever has been used to generate these benchmarks from Linux source code. It's very much not described anywhere, how it does the environment modeling etc, but I managed to find this: https://github.com/ldv-klever/klever/blob/master/presets/jobs/specifications/linux/concurrency%20safety/synchronization%20primitives.aspect. That somehow defines the transformations that are applied to original source code to replace them with LDV-specific functions instead.

Since Goblint has been used to analyze non-transformed Linux source code, it actually has a bunch of LibraryFunctions handling for the original names of the functions (such that locking, etc. is recognized), but not for the LDV-transformed names.

Incorrect Frama-C generation

All the benchmarks have the following header comment:

/* Generated by Frama-C */

I'm not sure what that exactly means, but my best guess is that the combining/merging mechanism from Frama-C has been used (a la cilly). Although it might also have involved something additional (simplifications?).

For example, linux-4.2.6-races/Unsafes/u__linux-concurrency_safety__drivers---net---irda---ksdazzle-sir.ko.cil.i contains in usb_endpoint_dir_in the following:

  __retres = (int)epd->bEndpointAddress < 0;

Crucially, bEndpointAddress is of unsigned integer type, so __retres is always false, which in turn makes most of the code dead. And that's not the only instance of such useless expression appearing.

Original Linux source

When looking for the same function in the source of that version of Linux, the code is completely different:

    return ((epd->bEndpointAddress & USB_ENDPOINT_DIR_MASK) == USB_DIR_IN);

These bit operations on an unsigned integer aren't trivially constant, so somewhere along the way the semantics have been changed.