llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.87k stars 11.93k forks source link

[false positive] [clang-analyzer-core.uninitialized.Assign] #54377

Open RalphS69 opened 2 years ago

RalphS69 commented 2 years ago

With clang-tidy (LLVM version 13.0.0). , this code snipped

#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

#define SIZE 2

void func( int16_t len1, int16_t len2 )
{
    int16_t i;
    float t[SIZE];

    assert( len1 > 0 );
    assert( len1 <= SIZE );

    assert( len1 % 2 == false );

    for ( i = 0; i < len1/2; i++ )
    {
        t[i] = 1;
    }

    for (      ; i < len1; i++ )
    {
        t[i] = 1;
    }

    assert( len2 <= len1 );

    for ( i = 0; i < len2; i++ )
    {
        float s = t[i];
        (void) s;
    }
}

causes

warning: Assigned value is garbage or undefined [clang-analyzer-core.uninitialized.Assign]
        float s = t[i];
        ^         ~~~~
note: Assuming 'len1' is > 0
    assert( len1 > 0 );
            ^
note: Taking true branch
    assert( len1 > 0 );
    ^
note: Assuming 'len1' is <= 2
    assert( len1 <= SIZE );
            ^
note: Taking true branch
    assert( len1 <= SIZE );
    ^
note: Assuming the condition is true
    assert( len1 % 2 == false );
            ^
note: Taking true branch
    assert( len1 % 2 == false );
    ^
note: Assuming the condition is true
    for ( i = 0; i < len1/2; i++ )
                 ^~~~~~~~~~
note: Loop condition is true.  Entering loop body
    for ( i = 0; i < len1/2; i++ )
    ^
note: Assuming the condition is true
    for ( i = 0; i < len1/2; i++ )
                 ^~~~~~~~~~
note: Loop condition is true.  Entering loop body
    for ( i = 0; i < len1/2; i++ )
    ^
note: Assuming the condition is false
    for ( i = 0; i < len1/2; i++ )
                 ^~~~~~~~~~
note: Loop condition is false. Execution continues on line 22
    for ( i = 0; i < len1/2; i++ )
    ^
note: 'i' is >= 'len1'
    for (      ; i < len1; i++ )
                 ^
note: Loop condition is false. Execution continues on line 27
    for (      ; i < len1; i++ )
    ^
note: Assuming 'len2' is <= 'len1'
    assert( len2 <= len1 );
            ^
note: Taking true branch
    assert( len2 <= len1 );
    ^
note: Assuming 'i' is < 'len2'
    for ( i = 0; i < len2; i++ )
                 ^~~~~~~~
note: Loop condition is true.  Entering loop body
    for ( i = 0; i < len2; i++ )
    ^
note: Assuming 'i' is < 'len2'
    for ( i = 0; i < len2; i++ )
                 ^~~~~~~~
note: Loop condition is true.  Entering loop body
    for ( i = 0; i < len2; i++ )
    ^
note: The value 2 is assigned to 'i'
    for ( i = 0; i < len2; i++ )
                           ^~~
note: Assuming 'i' is < 'len2'
    for ( i = 0; i < len2; i++ )
                 ^~~~~~~~
note: Loop condition is true.  Entering loop body
    for ( i = 0; i < len2; i++ )
    ^
note: Assigned value is garbage or undefined
        float s = t[i];

If I am not mistaken, the assertions make len1=2 the only valid value (larger 0, smaller or equal 2, even). Why the hell is it then assumed, that the first loop is entered twice, whereas the second loop is not entered at all, if the first loop stops at len1/2 =1? Iianm, both loops should be passed the same number of times, as long as len1 is even, or not?

Not to mention, that len2 must not be larger than len1, making it impossible that the third loop is entered with i = 2 ...

llvmbot commented 2 years ago

@llvm/issue-subscribers-clang-static-analyzer

RalphS69 commented 1 year ago

Any news on this? It's already more than a year now, that I reported this.

I still run into issues with clang-tidy, but hesitate to report those, it they are just collected but not prepossessed ...

danix800 commented 1 year ago

I took a look and found that range constraint could be improved on rem % with specific ranges for this particular issue.

Can we add constraint like the following?

    a % m == 0 && a in (x, x + m]
implies that
    a = (x + m) / m * m, which is a 'ConcreteInt'
where x and m is ConcreteInt

Checked by z3:

(declare-const a Int)
(declare-const m Int)
(declare-const x Int)
(assert (= (mod a m) 0))
(assert (and (>= a (+ x 1)) (<= a (+ x m))))
(assert (not (= a (* (div (+ x m) m) m))))
(check-sat)
; unsat

@haoNoQ @steakhal @DonatNagyE

steakhal commented 1 year ago

@danix800 could you please post a minimal repro? The proof seems solid. This exactly what we usually need to constraint manager patterns. Does it also work with bitvectors?

danix800 commented 1 year ago

@danix800 could you please post a minimal repro?

Did you mean by minimal reproducible testcase to this issue? Or something else?

Does it also work with bitvectors?

I'm not familiar with bitvectors, could you elaborate a bit more on this?

Maybe I can proivde a PR as POC so we can have a better discussion context.

Also when I'm preparing the POC I find the original agument should be updated as following:

    a % m == 0 && a in [x, y] && y - x < m
implies that
    a = y / m * m, which is a 'ConcreteInt'
where x and m is ConcreteInt

Check with z3:

(declare-const a Int)
(declare-const m Int)
(declare-const x Int)
(declare-const y Int)
(assert (= (mod a m) 0))
(assert (and (>= a x) (<= a y)))
(assert (< (- y x) m))
(assert (not (= a (* (div y m) m))))
(check-sat)
; unsat
steakhal commented 1 year ago

@danix800 could you please post a minimal repro?

Did you mean by minimal reproducible testcase to this issue? Or something else?

I assume, you suggest this constraint deduction rule because it would resolve the FPs presented in he issue. If that's the case, we could have an even shorter reproducer which has the same properties: raises a FP without the proposed constraint pattern deduction, and would not raise with it. Its basically what I'd expect from the test of the patch implementing this deduction pattern.

Does it also work with bitvectors?

I'm not familiar with bitvectors, could you elaborate a bit more on this?

Well, me neither. But Int is an integer in the mathematical sense, right? And not a modulo arithmetic ring, like actual int32 etc. are. Consequently, a theorem might hold for math, but not hold for bitvectors of a fixed size with modulo arithmetic (aka. wrapping).

Maybe I can proivde a PR as POC so we can have a better discussion context.

Also when I'm preparing the POC I find the original agument should be updated as following:

Yes, the PoC would help the discussion, but I'm not sure there is consensus that it would worth the time to do it. To me, it feels like a pretty nieche case TBH, by only looking at how its formalized. Also remember that constraint manager is a core infrastructure in CSA. One mistake there will affect everything, thus requires careful review and even more careful evaluation. Given that I think this is a rare case, I don't expect much report diff, thus I could take the time evaluating the effect of the PoC once we are satisfied with it.

Usually, users could put an assert in the code stating their invariant that was missed by the analyzer. This way at the expense of that assertion they might be able to sidestep an analyzer shortcoming. This isn't ideal, but creating a perfect solver is not just impractical, but also impossible, thus we will need to draw the line at some point anyway. This is how I look at this. I'm sorry.

This standpoint should not discourage you or anyone fixing issues, even if the are edge cases. If anything, it will give experience in the relevant component, and we anyway have a shortage of contributors.

danix800 commented 1 year ago

The testcase is simple:

void remainder_within_modulo_positive_range(int x) {
  if (x <= 2 || x > 6)
    return;
  clang_analyzer_dump(x); // expected-warning{{reg_$0<int x>}}
  if (x % 5 != 0)
    return;
  clang_analyzer_dump(x);          // expected-warning{{5 S32b}}
  (void)x; // keep the constraints alive.
}