Open SebastianKonplan opened 2 months ago
At first glance this looks similar to #94526
In trunk: https://godbolt.org/z/bd6nrafKn
@llvm/issue-subscribers-clang-static-analyzer
Author: Sebastian Priebe (SebastianKonplan)
@steakhal
Is it good issue to get familiar with CSA internals? If so, I'd really like to take it.
@steakhal
Is it good issue to get familiar with CSA internals? If so, I'd really like to take it.
Go ahead. I can see a reinterpret cast there, and those can be quite tricky at best. But you could as well have some luck. And somrting easy to fix.
Feel free to have a look, and as far as I'm concerned, I'll review PRs, and first come first served for landing PRs. If womeone conflicts with an issue, they should collaborate. So im not really for assigning issues, but you can work on it. Let me know if you have questions.
Got it, thanks!
That's indeed tricky one. I did some analysis, but don't know what would be the correct solution...
So, the bug happens here:
if (T.isNull()) {
if (const auto *TVR = dyn_cast<TypedValueRegion>(MR))
T = TVR->getValueType();
else if (const auto *TR = dyn_cast<TypedRegion>(MR))
T = TR->getLocationType()->getPointeeType();
else if (const auto *SR = dyn_cast<SymbolicRegion>(MR))
T = SR->getPointeeStaticType();
}
assert(!T.isNull() && "Unable to auto-detect binding type!"); <-- assert
Code comes here from processPointerEscapedOnBind()
on calling fun2()
. MR
is AllocaRegion
,
which has no type.
I found out that if we change cast type to char *
instead of void *
, then problem disappears, since
in that case MR
would be TypedValueRegion
(https://godbolt.org/z/W9j4Ys4oz). This happens because Loc
for DeclRefExpr
points to TypedValueRegion
at the point of ExprEngine::VisitCast
, while in case of void *
it points to AllocaRegion
.
I did not find where that binding happens. For some reason bp on RegionStoreManager::{bind, Bind}
do not work, so
I guess, I am just lost.
One solution I came up with is to pass QualType
from processPointerEscapedOnBind()
, since we know it
at call site. That way QualType
would be char *
and problem disappears. (See https://github.com/pskrgag/llvm-project/tree/csaAllocaCrash). I am not completely sure this is right solution and it won't just hide bigger problem.
Maybe completely right solution would be to make AllocaRegion
a TypedValueRegion
?
AllocaRegion sucks. But the current direction is to get rid of typed memregions, so making allocaregion typed would defeat that goal. The idea behing having that untyped is that at the allocation (at the alloca) we dont know what type will actually live at that storage. It just gives you a blob of memory.
I havent read the full context here, but making allocaregion typed is likely not the way to go. Tomorrow i plan to come back to this to be a bit more constructive :)
Btw @SebastianKonplan, I'm just curious, how did you find this code? Did you reduce it from some real-world codebase or its the outcome of some fuzzing effort? The answer may affect how much effort we decide to invest here. To me, it feels like pretty rare to hit in real code. Once you do, ofc, that feels bad :D
Oh, it's the infamous Case (4)
in processPointerEscapedOnBind()
. We're at fun2()
trying to escape values stored in buf
after an overwrite by the conjured symbol. In order to do that, we see if the bind even succeeds (can be represented by the Store at all), by trying to put it there and load it back. It's super hacky. Ultimately we should find a better way to do this. At a glance, I'm happy to skip this step entirely in case of AllocaRegion, and blindly assume that it escapes. (It's somewhat likely that AllocaRegion is exactly why this step exists in the first place, so it may become dead code if we skip it this way. But I'm ok with that too, unless we quickly find a way to eliminate the source of confusion.)
Side note, I think the reason why the Case (4)
related check was implemented in such a hacky way, was because the MemRegion
hierarchy was considered to be an implementation detail of RegionStore
. It was assumed to be undesirable for the rest of the engine (let alone the checkers) to implement custom behaviors for the specific subclasses of MemRegion
; instead, they intended the high-level interface of getSVal()
to be sufficient for all purposes.
I think this view largely didn't stand the test of time. The class hierarchy is quite useful for that purpose; it's small and self-contained and very expressive as it gives you exactly the answers that you're looking for.
In this case we can try to keep this abstraction from leaking by defining another virtual method, eg. Store::isUntypedStackRegion(MR)
or something of that nature. But I don't think it's necessarily worth the effort. Let's just special-case AllocaRegion
directly.
But, of course, @steakhal has correctly described our desired ultimate solution: get rid of region types entirely so that the crashy guesswork wasn't necessary in the first place.
We already have a way to avoid the guesswork: just pass the type explicitly, as in State->getSVal(MR, QualType)
. In this code, of course, we can't use that because there's no type to provide in the first place... because the entire memory load operation is completely made up, it does not correspond to anything in the program's source code. Normally, if it did correspond to a real memory load, we'd have the loaded type in the AST which we could have easily passed explicitly. But in our case there's simply no source of truth.
We could try to pass LocAndVal.second.getType()
. Which is a different kind of guesswork that may also fail or produce non-sensical results (especially considering that the conjured symbol's type is blindly defined as int
during invalidation) so I really don't think it's a good thing to do even if it dodges this specific crash.
Btw @SebastianKonplan, I'm just curious, how did you find this code? Did you reduce it from some real-world codebase or its the outcome of some fuzzing effort? The answer may affect how much effort we decide to invest here. To me, it feels like pretty rare to hit in real code. Once you do, ofc, that feels bad :D
Yes, this was reduced from real-world code. As noted in the description we hit this bug three times in our project codebase. We know the issues of alloca
and use it with the required care.
We already have a way to avoid the guesswork: just pass the type explicitly, as in State->getSVal(MR, QualType)
This is what I came up with as well (https://github.com/pskrgag/llvm-project/commit/808e3b9cf72314a7c342104638df1316f18ee073). In ExprEngine::evalCall
we know types of arguments and we can pass them to processPointerEscapedOnBind
and then to getSVal
. But we don't know them in ExprEngine::evalBind
, so I guess, my solution is not full.
<offtopic>
AllocaRegion sucks. But the current direction is to get rid of typed memregions, so making allocaregion typed would defeat that goal. The idea behing having that untyped is that at the allocation (at the alloca) we dont know what type will actually live at that storage. It just gives you a blob of memory.
When I was working with AllocaRegion
, I was thinking that perhaps it should be replaced by use of SymbolicRegion
(or turned into a subclass of SymbolicRegion
). Note that e.g. malloc
returns a SymbolicRegion
that's within the heap memory space (as opposed to the "opaque pointer" symbolic regions that are in unknown space) -- it would be natural to say that alloca
returns a SymbolicRegion
that's within the stack frame memory space where it's created.
This could simplify the code (we wouldn't need to have separate case:
s for AllocaRegion
) and ensure that AllocaRegion
s have an identity that can be e.g. marked as interesting.
@steakhal @haoNoQ What would you think about this? Do you know about any fundamental difference between alloca regions and e.g. malloc
memory blocks that would block this?
Side note, I think the reason why the Case (4) related check was implemented in such a hacky way, was because the MemRegion hierarchy was considered to be an implementation detail of RegionStore.
I'm surprised to hear this because I feel that MemRegion
subclasses are frequently mentioned in the checkers that I'm working on (and I definitely didn't try to avoid referring to them).
I think this view largely didn't stand the test of time. The class hierarchy is quite useful for that purpose; it's small and self-contained and very expressive as it gives you exactly the answers that you're looking for.
In this case we can try to keep this abstraction from leaking by defining another virtual method, eg. Store::isUntypedStackRegion(MR) or something of that nature. But I don't think it's necessarily worth the effort. [...]
...and naturally I agree that we shouldn't try to preserve this abstraction. (And perhaps we should refactor old code that's convoluted because it tries to preserve it.)
</offtopic>
Is there any further progress for this issue?
Is there any further progress for this issue?
Not as far as I know it.
Using the following script triggers a segfault:
Issue occurs with clang-15.0.7 in Ubuntu 22.04 and also with clang-16.0.6 (27+b1) in debian-testing. The following output is displayed:
In our project source code we see this type of bug three distinct times. Each time it correlates with the use of
alloca()
.