Open RichardHoOoOo opened 1 year ago
The alias analysis is an IFDS analysis as well, i.e., it propagates aliases along the ICFG. It doesn't matter whether the same local is re-used at different locations, because the IFDS problem must decide whether to pass on an alias abstraction or not solely based upon the current statement.
The alias analysis stops when the variable is completely overwritten with a non-aliasing element, e.g., null
. In the general case, we pick up the right side, but since null
has no aliases, we just stop. I remember that we had similar handling for allocation sites, so back then, we stopped in line 2 in your example. The problem with this approach is when parameters are tainted. In that case, you start loosing aliases, because they no longer reach the start of the method where they can be propagated back into the caller.
Your example should not lead to a false positive, though. We have the activation statement in the abstraction. A taint is inactive (and thus cannot lead to a leak) when it has a non-null activation statement. The AS is the point where the alias analysis was started. Only when such an alias-derived taint passes through that statement again in the forward analysis, the AS is removed and the taint becomes active again and can lead to a leak.
Hi @StevenArzt , thanks for your detailed answer.
I just do a quick search in the repository and I found it seems alias analysis only stops when the right hand side is Constant or NewArrayExpr (I'm not sure).
If the right side is a constant, the variable is of a primitive type or a string anyway, so the variable cannot have aliases. Therefore, it's safe to abort.
The array case is a special case for tracking the length of an array, which is done once we have found the allocation site. I would need to look deeper into the code to see why this doesn't affect tracking array contents, though.
Hi @StevenArzt
In the following example, when
b
is tainted at line3, a backward alias analysis is triggered. I register a taint propagation handler and I can see that thea
returned infoo
is also tainted.Why the alias analysis does not stop at line2 (the new expression)? Is it because Jimple reuse the same name for
o
at line1 and theo
at line2, although they actually have different variable names from the source code level?