NASA-SW-VnV / ikos

Static analyzer for C/C++ based on the theory of Abstract Interpretation.
Other
2.04k stars 152 forks source link

Improve the uninitialized variable analysis #99

Open arthaud opened 5 years ago

arthaud commented 5 years ago

The uninitialized variable analysis (uva) is currently disabled because it produces false positives on very simple programs.

For instance, see the following code:

#include <stdio.h>
#define N 10

int main() {
    int tab[N];
    for (size_t i=0; i<N; i++) {
        tab[i] = i;
    }
    printf("%d\n", tab[N-1]);
    return 0;
}

This triggers a false positive:

$ ikos -a=uva test.c
test.c: In function 'main':
test.c:9:5: warning: expression 'tab[9]' might be uninitialized
    printf("%d", tab[N-1]);
    ^

This code is perfectly safe, of course. It follows strict coding rules for critical software, but IKOS is not able to prove that it is safe because of the uninitialized variable analysis.

We tried to solve this but it appears to be hard to do with Abstract Interpretation, the theory used in IKOS. We would need to infer that the loop initializes the elements from 0 to i, so at the end of the loop, all the array is initialized. I found a few research publications about "array smashing" that could help but it is hard to implement.

arthaud commented 5 years ago

A simple idea would be to create two separate analyses:

We could then enable uva by default - it shouldn't trigger many false positives. Then with the implementation of #129 we could maybe enable uma without getting too many false positives.

arthaud commented 5 years ago

The uninitialized variable analysis (uva) now only checks for uninitialized variables, and is enabled by default.

The next step is to implement the uninitialized memory analysis (uma).

practicalswift commented 5 years ago

Wow that is great news! Thanks for all your excellent work on ikos!

peckto commented 1 year ago

Hi @arthaud I played a little bit with the uninitialized variable analysis and I'm wondering what to expect and what not. Does the uva support arrays? Should it be sound? Especially, I experience problems with loops.

In the following example, IKOS assumes, within the loop, the data array is fully initialized, which is not the case. Outside the loop, the uninitialized value is detected correctly. Why is this the case, when for example the buffer-overflow analyze can correctly reason about the index values for this loop? (buffer overflow with the loop is detected without problem).

#include <ikos/analyzer/intrinsic.h>

int main(int argc, char** argv) {
    int data[10];
    data[0] = 0;

    for(int i=0; i<10; i++)
    {
        printf("%d", data[i]);
        __ikos_print_values("data[i]", data[i]);
    }
    __ikos_print_values("data[1]", data[1]);

    return 0;
}

Result:

main3.c:10:9: __ikos_print_values("data[i]", data[(int64_t)i]):
    data[(int64_t)i] is in [-2147483648, 2147483647]
    data[(int64_t)i] is initialized
main3.c:12:5: __ikos_print_values("data[1]", data[1]):
    data[1] is in [-2147483648, 2147483647]
    data[1] is uninitialized

It seems like, that only the first loop iteration is used to decide if the whole array is initialized (i=0). When initializing not data[0] but data[2], IKOS finds the initialized value within the loop.

Additionally, uva findings (also within a loop) are reported as error, which stops any further analysis and e.g. a buffer-overflow would not be detected.

arthaud commented 1 year ago
main3.c:10:9: __ikos_print_values("data[i]", data[(int64_t)i]):
  data[(int64_t)i] is in [-2147483648, 2147483647]
  data[(int64_t)i] is initialized
main3.c:12:5: __ikos_print_values("data[1]", data[1]):
  data[1] is in [-2147483648, 2147483647]
  data[1] is uninitialized

Yeah this seems obviously wrong. I would need to spend more time to figure out why. I think @richardlford introduced some changes in https://github.com/NASA-SW-VnV/ikos/commit/7ad55594f9cc70a03748ba2c2cb45a197536ddbc which might be related to this (random guess).

For now, I would recommend disabling uva.

Additionally, uva findings (also within a loop) are reported as error, which stops any further analysis and e.g. a buffer-overflow would not be detected.

Yes, this is even worse than false positives. If you find an example, please provide it here.

peckto commented 1 year ago

How is it possible to disable the uva completely, so that it has no effect on the analysis? If I disable it via --analyses, it seems like it is only disabled in the report output. Here is an example where uva prevents a buffer-overflow to be detected:

#include <stdio.h>

int main(int argc, char** argv) {
    int data[10];

    for(int i=0; i<11; i++)
    {
        printf("%d", data[i]);
    }

    return 0;
}

When running $ ikos -a 'uva' main.c it detects the uninitialized variable. But when running $ ikos -a 'boa' main.c the buffer-overflow is still not detected:

The program is SAFE

# Results
main4.c: In function 'main':
main4.c:8:9: safe: safe memory access
        printf("%d", data[i]);
        ^
main4.c: In function 'main':
main4.c:8:22: safe: safe memory access
        printf("%d", data[i]);

Only when initializing the buffer with int data[10] = {0}; the buffer-overflow is detected.

arthaud commented 1 year ago

How is it possible to disable the uva completely, so that it has no effect on the analysis? If I disable it via --analyses, it seems like it is only disabled in the report output. Here is an example where uva prevents a buffer-overflow to be detected:

Yes I was thinking about -a. You are right, it only disables the report.

#include <stdio.h>

int main(int argc, char** argv) {
    int data[10];

    for(int i=0; i<11; i++)
    {
        printf("%d", data[i]);
    }

    return 0;
}

Thanks for the example. We should fix this.