nimble-code / Cobra

An interactive (fast) static source code analyzer

Pattern match slightly complicated malloc free example #53

Closed saimukund closed 2 years ago

saimukund commented 2 years ago

Hi, first of all, thanks for sharing this wonderful project with the community, and for the continuous improvements and the rules library. I am using Cobra to find memory leaks in our product. I was able to catch a few leaks using a simple query:

find . -name "*.c" | xargs cobra -pat '{ .* cmsMem_alloc ^cmsMem_free* }'

While this works for basic cases where alloc() and free() are in the same scope, I have other scenarios. Could you please suggest a pattern/command that would work for them?

void fun()
{
    if ((x = cmsMem_alloc()) == NULL) {
        return;
    }

    if (...) {
        .....
        cmsMem_free(x);
    }
    else {
        ....
        // this is a macro which internally calls
        // cmsMem_free() and also assigns NULL to x.
        CMSMEM_FREE_BUF_AND_NULL_PTR(x);
    }
    return;
}

It is possible some functions use either of the free implementations or both. So,

  1. I want to match the malloc and also check if either variant of free exists after it.
  2. It should search this pattern within a function scope and not block scope.
  3. Also, if I could additionally match with identifier x, it would be great. This also reduces false positives, in case there are multiple alloc, free statements within a function.

Do you think this pattern is correct without the identifier?

find . -name "*.c" | xargs cobra -pat '{ .* cmsMem_alloc ^(cmsMem_free|CMSMEM_FREE_BUF_AND_NULL_PTR)* }'

Can we add the identifier here as well? Please let me know if this is feasible.

nimble-code commented 2 years ago

re 1: you can use embedded regular expressions to match on part of a token text, as in: { .* /alloc ^/free* }

re 2: the above should get you any level of scope, including function scope, but if you want to limit it, you could attach a constraint, for instance as in: { <1> .* /alloc ^/free* } @1 (.curly == 1) which restricts the match on the opening curly brace to level 1, which is the start of a function body (or, of course, a structure/union etc., to sometimes spoil things a bit)

re 3: you can use variable binding on the alloc call, as in x:@ident = /alloc and then later refer to the bound variable with :x, for instance as in: /free ( :x ). but it gets trickier if you now want to negate that sequence, since it's no longer one token.

you can do the whole thing also more precisely with a script - but then it'll probably take a bit to get the hang of writing inline programs etc. it's worth figuring these things out though -- the inline progs are very powerful

saimukund commented 2 years ago

Thanks for your quick answer. I will try it out. Is the regular expression case-insensitive here? Does it match both free variants that I mentioned earlier?

  1. cmsMem_free (x)
  2. CMSMEM_FREE_BUF_AND_NULL_PTR (x)

Can we use :/free*( :x ) here so that it matches both?

nimble-code commented 2 years ago

the regular expression is case-sensitive, so I guess you can make it something like /[Ff][Rr][Ee][Ee], or do two searches
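
To illustrate the character-class idea outside of Cobra, here is a minimal sketch that uses plain grep (an assumption made for illustration only; it is not Cobra's matcher, and the list of names is just the three identifiers from this thread). It shows that a case-sensitive "free" only hits the lowercase variant, while one bracket expression per letter hits both.

```shell
# Illustration only: grep stands in for a case-sensitive regex matcher.
# The names are the identifiers mentioned in this thread.
names='cmsMem_free
CMSMEM_FREE_BUF_AND_NULL_PTR
cmsMem_alloc'

# Case-sensitive "free" matches only the lowercase variant.
lower_only=$(printf '%s\n' "$names" | grep -c 'free')

# One bracket expression per letter accepts either case.
both=$(printf '%s\n' "$names" | grep -c '[Ff][Rr][Ee][Ee]')

echo "free: $lower_only match(es), [Ff][Rr][Ee][Ee]: $both match(es)"
```

The alternative mentioned above, running two separate searches (one for the lowercase name and one for the uppercase macro), avoids the bracket expressions at the cost of a second pass over the sources.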