facebook / infer

A static analyzer for Java, C, C++, and Objective-C
http://fbinfer.com/
MIT License
14.8k stars 2k forks source link

Uninitialized false positive (normalization issue in PulseFormula?) #1750

Open danielmercier opened 1 year ago

danielmercier commented 1 year ago

In the following cpp reproducer:

#include <cassert>

void check_range(bool range_var_valid, int &range_var) {
  if (range_var_valid) {
    range_var = range_var + 1;
  }

  assert(range_var_valid >= 0 && range_var_valid <= 1);
}

void do_stuff(bool range_var_valid, int &range_var) {
  check_range(range_var_valid, range_var);
}

int main() {
  bool range_var_valid = false;
  int range_var;
  do_stuff(range_var_valid, range_var);
  return 0;
}

Pulse reports an uninitialized value when calling do_stuff for range_var. However, in case range_var_valid is false, range_var is not read.

Here is the summary for do_stuff (when range_var_valid is true):

conditions: {0 ≤ [v3]}∧{[a2] ≤ 1}∧{[a2] ≠ 0}
    phi: var_eqs: a2=v3
         && linear_eqs: a2 = -a4 +1 ∧ v5 = v6 -1
         && term_eqs: [v6 -1]=v5∧[-a4 +1]=a2
         && tableau: a2 = -a4 +1
         && intervals: a2=1
         && atoms: {is_int([v5]) = 1}∧{is_int([v6]) = 1}∧{is_int([a2]) = 1}
                   ∧{[a2] ≠ 0}
    { roots={ &range_var_valid=v1, &range_var=v2 };
      mem  ={ v1 -> { * -> v3 }, v2 -> { * -> v4 }, v4 -> { * -> v6 } };
      attrs={ v4 -> { WrittenTo (3, ) } };}
    PRE=[{ roots={ &range_var_valid=v1, &range_var=v2 };
           mem  ={ v1 -> { * -> v3 }, v2 -> { * -> v4 }, v4 -> { * -> v5 } };
           attrs={ v1 -> { MustBeInitialized(, t=0),
                           MustBeValid(, None, t=0) },
                   v2 -> { MustBeInitialized(, t=0),
                           MustBeValid(, None, t=0) },
                   v3 -> { UsedAsBranchCond(check_range, line 4, column 7, ) },
                   v4 -> { MustBeInitialized(, t=3),
                           MustBeValid(, None, t=3) } };}]
    need_specialization=false
    skipped_calls={ }
    Topl={len=0;content=
           [  ]}

Let's first zoom on the conditions part:

conditions: {0 ≤ [v3]}∧{[a2] ≤ 1}∧{[a2] ≠ 0}
var_eqs: a2=v3

I was a bit surprised to see that "conditions" is not normalized according to var_eqs. Unfortunately, changing this does not fix the issue completely as the mem part still contains v3 and not a2:

PRE=[
    { roots={ &range_var_valid=v1 };
      mem  = { v1 -> { * -> v3 }
    }
]

Thus, in addition to the previous change, modifying the canonicalization on pre to take into account var_eqs fixes the issue.

However, I am not sure this is the correct way of fixing this issue. When removing the read to range_var, the contradiction is seen later on, when applying the post state. So here Pulse is correct in the sense that only one state remains (range_var_valid = true is correctly discarded). But since Pulse checks the validity/initialization of abstract values before applying the post, the error is recorded before contradiction is discovered.

I will propose a patch with the two changes I described, but I am also very interested about your though on the second part.