facebook / infer

A static analyzer for Java, C, C++, and Objective-C
http://fbinfer.com/
MIT License
14.83k stars 2k forks source link

Bufferoverrun analysis - small fixes and improvements #1735

Closed sjxer723 closed 1 year ago

sjxer723 commented 1 year ago

Hi, this pr contains some improvements and fixes to buffer overrun analysis. The two fixes were originally inspired due to the two false positives/negatives we found.

  1. The buffer overrun checker ignores checking the case when the index may be +oo while the array size is less than +oo. It may cause a false negative. For example,
int a[1];
for(int i=0; a[i]; i++) {}

There will be a buffer overrun issue within the condition judgment of the for statement since i will finally be equal to 1 and the statement will check whether a[1] is nonzero.

  1. The buffer overrun checker makes a rough estimation when adding two upper bounds, i.e., when i1 <= max {1, i1}, i2 <= i2, BO only let i1 + i2 <= 1 + i2, while the true upper bound should be max{1 + i2, i1 + i2}. Hence it will cause a false positive when checking the codes below:
#define MAX_LEN (16)

void foo(const unsigned char *additional, size_t len, size_t nonce_len, size_t entropy_len)
{
        unsigned char seed[MAX_LEN];
        size_t        seedlen = 0;

        if(entropy_len > MAX_LEN) 
            return;
        if(nonce_len > MAX_LEN - entropy_len) 
            return;
        if(len > MAX_LEN - entropy_len - nonce_len) 
            return;

        seedlen +=  entropy_len;
        if(nonce_len != 0)
                seedlen += nonce_len;
        if(additional != NULL && len != 0){
                memcpy(seed + seedlen, additional, len);
                seedlen += len;
        }
}

int main()
{
        unsigned char arr[20] = {0};
        foo(arr, 8, 0, 5);
        return 0;
}

Before executing the fourth if statement, BO will regard the value of seedlen as max{16, entropy_len}. However, when executing the fourth if statement, by adding the two variables seedlen and nonce_len, BO makes a rough estimation as above description, and will regard the upper bound of seedlen as 16 + nonce_len. Hence, it will cause a false positive when executing the call to the memcpy function.

For the two issues above, I have made the following improvements.

  1. Within the checking function of array access, I let BO report a L3-level error when the upper bound of index is +oo and the upper bound of the array size is limited.
  2. I have implemented a more precise plus of bounds. It handles the case when one operator is c1 + max{x1, d1} and another is c2 + d2.
facebook-github-bot commented 1 year ago

Hi @sjxer723!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!