facebook / infer

A static analyzer for Java, C, C++, and Objective-C
http://fbinfer.com/
MIT License
14.83k stars 2k forks source link

Bufferoverrun Analysis - small fixes and improvements #1736

Open sjxer723 opened 1 year ago

sjxer723 commented 1 year ago

Hi, this pr contains some improvements and fixes to buffer overrun analysis. The two fixes were originally inspired due to the two false positives/negatives we found during using it.

  1. The buffer overrun checker ignores checking the case when the index may be +oo while the array size is less than +oo. It may cause a false negative. For example,
int a[1];
for(int i=0; a[i]; i++) {}

There will be a buffer overrun issue within the condition judgment of the for statement since i will finally be equal to 1 and the statement will check whether a[1] is nonzero.

  1. The buffer overrun checker makes a rough estimation when adding two upper bounds, i.e., when i1 <= max {1, i1}, i2 <= i2, BO only let i1 + i2 <= 1 + i2, while the true upper bound should be max{1 + i2, i1 + i2}. Hence it will cause a false positive when checking the codes below:
#define MAX_LEN (16)

void foo(const unsigned char *additional, size_t len, size_t nonce_len, size_t entropy_len)
{
        unsigned char seed[MAX_LEN];
        size_t        seedlen = 0;

        if(entropy_len > MAX_LEN) 
            return;
        if(nonce_len > MAX_LEN - entropy_len) 
            return;
        if(len > MAX_LEN - entropy_len - nonce_len) 
            return;

        seedlen +=  entropy_len;
        if(nonce_len != 0)
                seedlen += nonce_len;
        if(additional != NULL && len != 0){
                memcpy(seed + seedlen, additional, len);
                seedlen += len;
        }
}

int main()
{
        unsigned char arr[20] = {0};
        foo(arr, 8, 0, 5);
        return 0;
}

Before executing the fourth if statement, BO will regard the value of seedlen as max{16, entropy_len}. However, when executing the fourth if statement, by adding the two variables seedlen and nonce_len, BO makes a rough estimation as above description, and will regard the upper bound of seedlen as 16 + nonce_len. Hence, it will cause a false positive when executing the call to the memcpy function.

For the two issues above, I have made the following improvements.

  1. Within the checking function of array access, I let BO report a L3-level error when the upper bound of index is +oo and the upper bound of the array size is limited.
  2. I have implemented a more precise plus of bounds. It handles the case when one operator is c1 + max{x1, d1} and another is c2 + d2.
facebook-github-bot commented 1 year ago

@skcho has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.