facebook / infer

A static analyzer for Java, C, C++, and Objective-C
http://fbinfer.com/
MIT License
14.9k stars 2.01k forks source link

Upstream bufferoverrun changes #1629

Closed dhauzar closed 1 year ago

dhauzar commented 2 years ago

This pull request contains patches we wrote with the aim to make BO work on the SIL generated by the Infer Ada frontend (developed at AdaCore). Due to the size of the diff, we are interested in upstreaming as much as possible of those patches.

Focus on zero false-positive rate

The main goal was to provide false-positive free analysis (we have different, more heavy-weight analyses that aim at a low or even zero number of false-negatives). Thus, only messages where BO can prove that the problem occurs are reported. The consequence of that is that an imprecision cannot be a cause of a false positive while an unsoundness can. This can cause a clash between our use of BO and the use of BO at Facebook.

An example of this is the commit “[F] Fix join of an unknown set of locations.” fixing an unsoundness intended in the original design of BO. For us, the unsoundness was causing unacceptable false positives. Here is the test containing a comment (codetoanalyze/java/bufferoverrun/ArrayListTest.java:loop_on_unknown_iterator_FN) explaining that the unsoundness was intended in the original design:

  void loop_on_unknown_iterator_FN(MyI x, int j) {
    ArrayList<Integer> a = new ArrayList<>();
    ArrayList<Integer> b;
    if (unknown_bool) {
      b = a;
    } else {
      b = x.mk_unknown();
    }
    // `b` points to an zero-sized array and `Unknown` pointer.  Thus, the size of array list should
    // be evaluated to [0,+oo] in a sound design.  However, this would harm overall analysis
    // precision with introducing a lot of FPs.  To avoie that, we ignore the size of `Unknown`
    // array list here, instead we get some FNs.
    for (Integer i : b) {
      // Since size of `b` is evaluted to [0,0], here is unreachable.
      if (a.size() <= -1) {
        int[] c = new int[5];
        c[5] = 0;
      } else {
        int[] c = new int[10];
        c[10] = 0;
      }
    }
  }

If such changes are not acceptable, we would be interested in adding an option --sound-bufferoverrun. This would allow us to change the strategy used for the analysis on our side, while still reducing the size of our diff. (And hopefully make all this manageable on your side too.)

Notation

Each commit causing diffs in tests output is followed by commits updating test baselines - one commit for each group of tests (c_bufferoverrun, cpp_bufferoverrun, java_bufferoverrun, cpp_performance, and java_performance). Commit messages of commits updating baselines contain an analysis of diffs.

Each commit is marked by one of the following:

Each commit causing diffs in tests output is followed by commits updating test baselines - one commit for each group of tests (c_bufferoverrun, cpp_bufferoverrun, java_bufferoverrun, cpp_performance, and java_performance). Commit messages of commits updating baselines contain an analysis of diffs.

Each commit is marked by one of the following:

Overview of the impact

Here is the overview of the current state of the impact of the patches. The detailed information about individual diffs in the FB's tests is in the commit messages of commits updating test baselines.

Commits causing diffs:

[F] Fix join of unknown set of locations.

[F] Fix strong updates when updating multiple locations.

[F] Fix strong updates when handling subprogram calls.

[Top] Use top as a default for value of a location.

[F] Don't prune locations that can be only weakly updated.

[I] Don't store info about unkn. locations in abstr. state.

[I] Prune also stack variables.

[I] More precise handling of symbolic intervals.

[F] Fix type info for record fields in type environment.

[F] Fix setting the length of new array.

[Precond] Reimplement detection of reachability.

yakobowski commented 2 years ago

Note that we will of course fix the merge conflict(s) if you are interest in the patch.