llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.11k stars 11.61k forks source link

Segfault in clang with "-analyze -analyzer-checker=core" #100712

Open SebastianKonplan opened 2 months ago

SebastianKonplan commented 2 months ago

Using the following script triggers a segfault:

#!/bin/sh

clang -cc1 -analyze -analyzer-checker=core - << __eof 
extern int fun2(unsigned int bufsize, char *buf);

int fun1(unsigned int bufsize)
{
    char *buf;

    if(0 == (*(void **)&buf = __builtin_alloca(bufsize)))
        return 0;

    return fun2(bufsize, buf);
}
__eof

Issue occurs with clang-15.0.7 in Ubuntu 22.04 and also with clang-16.0.6 (27+b1) in debian-testing. The following output is displayed:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang-15 -cc1 -analyze -analyzer-checker=core -
1.      <eof> parser at end of file
2.      While analyzing stack: 
        #0 Calling fun1
3.      <stdin>:10:9: Error evaluating statement
4.      <stdin>:10:9: Error evaluating statement
 #0 0x00007d74df3043b1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm-15/bin/../lib/libLLVM-15.so.1+0xf043b1)
 #1 0x00007d74df3020fe llvm::sys::RunSignalHandlers() (/usr/lib/llvm-15/bin/../lib/libLLVM-15.so.1+0xf020fe)
 #2 0x00007d74df3048d6 (/usr/lib/llvm-15/bin/../lib/libLLVM-15.so.1+0xf048d6)
 #3 0x00007d74ddc42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
 #4 0x00007d74e7dcefdf clang::ento::MemRegionManager::getElementRegion(clang::QualType, clang::ento::NonLoc, clang::ento::SubRegion const*, clang::ASTContext&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29cefdf)
 #5 0x00007d74e7e094ad (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2a094ad)
 #6 0x00007d74e7e0416d (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2a0416d)
 #7 0x00007d74e7d993d1 clang::ento::ExprEngine::processPointerEscapedOnBind(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, llvm::ArrayRef<std::pair<clang::ento::SVal, clang::ento::SVal>>, clang::LocationContext const*, clang::ento::PointerEscapeKind, clang::ento::CallEvent const*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29993d1)
 #8 0x00007d74e7db2e84 clang::ento::ExprEngine::evalCall(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNode*, clang::ento::CallEvent const&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29b2e84)
 #9 0x00007d74e7db26b4 clang::ento::ExprEngine::VisitCallExpr(clang::CallExpr const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29b26b4)
#10 0x00007d74e7d919c9 clang::ento::ExprEngine::Visit(clang::Stmt const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29919c9)
#11 0x00007d74e7d8d7f5 clang::ento::ExprEngine::ProcessStmt(clang::Stmt const*, clang::ento::ExplodedNode*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x298d7f5)
#12 0x00007d74e7d8d4ce clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x298d4ce)
#13 0x00007d74e7d74c08 clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2974c08)
#14 0x00007d74e7d74758 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2974758)
#15 0x00007d74e81d495f (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2dd495f)
#16 0x00007d74e81b20d8 (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2db20d8)
#17 0x00007d74e5e8ce9b clang::ParseAST(clang::Sema&, bool, bool) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0xa8ce9b)
#18 0x00007d74e7ab4ea7 clang::FrontendAction::Execute() (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x26b4ea7)
#19 0x00007d74e7a27fb6 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2627fb6)
#20 0x00007d74e7b32e4a clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2732e4a)
#21 0x00005f546ba589e0 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/lib/llvm-15/bin/clang+0x139e0)
#22 0x00005f546ba56bab (/usr/lib/llvm-15/bin/clang+0x11bab)
#23 0x00005f546ba569fc clang_main(int, char**) (/usr/lib/llvm-15/bin/clang+0x119fc)
#24 0x00007d74ddc29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#25 0x00007d74ddc29e40 call_init ./csu/../csu/libc-start.c:128:20
#26 0x00007d74ddc29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#27 0x00005f546ba533b5 _start (/usr/lib/llvm-15/bin/clang+0xe3b5)
Segmentation fault (core dumped)

In our project source code we see this type of bug three distinct times. Each time it correlates with the use of alloca().

SebastianKonplan commented 2 months ago

At first glance this looks similar to #94526

chrchr-github commented 2 months ago

In trunk: https://godbolt.org/z/bd6nrafKn

llvmbot commented 2 months ago

@llvm/issue-subscribers-clang-static-analyzer

Author: Sebastian Priebe (SebastianKonplan)

Using the following script triggers a segfault: ``` #!/bin/sh clang -cc1 -analyze -analyzer-checker=core - << __eof extern int fun2(unsigned int bufsize, char *buf); int fun1(unsigned int bufsize) { char *buf; if(0 == (*(void **)&buf = __builtin_alloca(bufsize))) return 0; return fun2(bufsize, buf); } __eof ``` Issue occurs with clang-15.0.7 in Ubuntu 22.04 and also with clang-16.0.6 (27+b1) in debian-testing. The following output is displayed: ``` PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: clang-15 -cc1 -analyze -analyzer-checker=core - 1. <eof> parser at end of file 2. While analyzing stack: #0 Calling fun1 3. <stdin>:10:9: Error evaluating statement 4. <stdin>:10:9: Error evaluating statement #0 0x00007d74df3043b1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm-15/bin/../lib/libLLVM-15.so.1+0xf043b1) #1 0x00007d74df3020fe llvm::sys::RunSignalHandlers() (/usr/lib/llvm-15/bin/../lib/libLLVM-15.so.1+0xf020fe) #2 0x00007d74df3048d6 (/usr/lib/llvm-15/bin/../lib/libLLVM-15.so.1+0xf048d6) #3 0x00007d74ddc42520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520) #4 0x00007d74e7dcefdf clang::ento::MemRegionManager::getElementRegion(clang::QualType, clang::ento::NonLoc, clang::ento::SubRegion const*, clang::ASTContext&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29cefdf) #5 0x00007d74e7e094ad (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2a094ad) #6 0x00007d74e7e0416d (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2a0416d) #7 0x00007d74e7d993d1 clang::ento::ExprEngine::processPointerEscapedOnBind(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, llvm::ArrayRef<std::pair<clang::ento::SVal, clang::ento::SVal>>, clang::LocationContext const*, clang::ento::PointerEscapeKind, clang::ento::CallEvent const*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29993d1) #8 0x00007d74e7db2e84 clang::ento::ExprEngine::evalCall(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNode*, clang::ento::CallEvent const&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29b2e84) #9 0x00007d74e7db26b4 clang::ento::ExprEngine::VisitCallExpr(clang::CallExpr const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29b26b4) #10 0x00007d74e7d919c9 clang::ento::ExprEngine::Visit(clang::Stmt const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x29919c9) #11 0x00007d74e7d8d7f5 clang::ento::ExprEngine::ProcessStmt(clang::Stmt const*, clang::ento::ExplodedNode*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x298d7f5) #12 0x00007d74e7d8d4ce clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x298d4ce) #13 0x00007d74e7d74c08 clang::ento::CoreEngine::dispatchWorkItem(clang::ento::ExplodedNode*, clang::ProgramPoint, clang::ento::WorkListUnit const&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2974c08) #14 0x00007d74e7d74758 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2974758) #15 0x00007d74e81d495f (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2dd495f) #16 0x00007d74e81b20d8 (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2db20d8) #17 0x00007d74e5e8ce9b clang::ParseAST(clang::Sema&, bool, bool) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0xa8ce9b) #18 0x00007d74e7ab4ea7 clang::FrontendAction::Execute() (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x26b4ea7) #19 0x00007d74e7a27fb6 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2627fb6) #20 0x00007d74e7b32e4a clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/lib/llvm-15/bin/../lib/libclang-cpp.so.15+0x2732e4a) #21 0x00005f546ba589e0 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/lib/llvm-15/bin/clang+0x139e0) #22 0x00005f546ba56bab (/usr/lib/llvm-15/bin/clang+0x11bab) #23 0x00005f546ba569fc clang_main(int, char**) (/usr/lib/llvm-15/bin/clang+0x119fc) #24 0x00007d74ddc29d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16 #25 0x00007d74ddc29e40 call_init ./csu/../csu/libc-start.c:128:20 #26 0x00007d74ddc29e40 __libc_start_main ./csu/../csu/libc-start.c:379:5 #27 0x00005f546ba533b5 _start (/usr/lib/llvm-15/bin/clang+0xe3b5) Segmentation fault (core dumped) ``` In our project source code we see this type of bug three distinct times. Each time it correlates with the use of `alloca()`.
pskrgag commented 1 month ago

@steakhal

Is it good issue to get familiar with CSA internals? If so, I'd really like to take it.

steakhal commented 1 month ago

@steakhal

Is it good issue to get familiar with CSA internals? If so, I'd really like to take it.

Go ahead. I can see a reinterpret cast there, and those can be quite tricky at best. But you could as well have some luck. And somrting easy to fix.

Feel free to have a look, and as far as I'm concerned, I'll review PRs, and first come first served for landing PRs. If womeone conflicts with an issue, they should collaborate. So im not really for assigning issues, but you can work on it. Let me know if you have questions.

pskrgag commented 1 month ago

Got it, thanks!

pskrgag commented 1 month ago

That's indeed tricky one. I did some analysis, but don't know what would be the correct solution...

So, the bug happens here:

  if (T.isNull()) {
    if (const auto *TVR = dyn_cast<TypedValueRegion>(MR))
      T = TVR->getValueType();
    else if (const auto *TR = dyn_cast<TypedRegion>(MR))
      T = TR->getLocationType()->getPointeeType();
    else if (const auto *SR = dyn_cast<SymbolicRegion>(MR))
      T = SR->getPointeeStaticType();
  }
  assert(!T.isNull() && "Unable to auto-detect binding type!");  <-- assert

Code comes here from processPointerEscapedOnBind() on calling fun2(). MR is AllocaRegion, which has no type.

I found out that if we change cast type to char * instead of void *, then problem disappears, since in that case MR would be TypedValueRegion (https://godbolt.org/z/W9j4Ys4oz). This happens because Loc for DeclRefExpr points to TypedValueRegion at the point of ExprEngine::VisitCast, while in case of void * it points to AllocaRegion.

I did not find where that binding happens. For some reason bp on RegionStoreManager::{bind, Bind} do not work, so I guess, I am just lost.

One solution I came up with is to pass QualType from processPointerEscapedOnBind(), since we know it at call site. That way QualType would be char * and problem disappears. (See https://github.com/pskrgag/llvm-project/tree/csaAllocaCrash). I am not completely sure this is right solution and it won't just hide bigger problem.

Maybe completely right solution would be to make AllocaRegion a TypedValueRegion?

steakhal commented 1 month ago

AllocaRegion sucks. But the current direction is to get rid of typed memregions, so making allocaregion typed would defeat that goal. The idea behing having that untyped is that at the allocation (at the alloca) we dont know what type will actually live at that storage. It just gives you a blob of memory.

I havent read the full context here, but making allocaregion typed is likely not the way to go. Tomorrow i plan to come back to this to be a bit more constructive :)

steakhal commented 1 month ago

Btw @SebastianKonplan, I'm just curious, how did you find this code? Did you reduce it from some real-world codebase or its the outcome of some fuzzing effort? The answer may affect how much effort we decide to invest here. To me, it feels like pretty rare to hit in real code. Once you do, ofc, that feels bad :D

haoNoQ commented 1 month ago

Oh, it's the infamous Case (4) in processPointerEscapedOnBind(). We're at fun2() trying to escape values stored in buf after an overwrite by the conjured symbol. In order to do that, we see if the bind even succeeds (can be represented by the Store at all), by trying to put it there and load it back. It's super hacky. Ultimately we should find a better way to do this. At a glance, I'm happy to skip this step entirely in case of AllocaRegion, and blindly assume that it escapes. (It's somewhat likely that AllocaRegion is exactly why this step exists in the first place, so it may become dead code if we skip it this way. But I'm ok with that too, unless we quickly find a way to eliminate the source of confusion.)

haoNoQ commented 1 month ago

Side note, I think the reason why the Case (4) related check was implemented in such a hacky way, was because the MemRegion hierarchy was considered to be an implementation detail of RegionStore. It was assumed to be undesirable for the rest of the engine (let alone the checkers) to implement custom behaviors for the specific subclasses of MemRegion; instead, they intended the high-level interface of getSVal() to be sufficient for all purposes.

I think this view largely didn't stand the test of time. The class hierarchy is quite useful for that purpose; it's small and self-contained and very expressive as it gives you exactly the answers that you're looking for.

In this case we can try to keep this abstraction from leaking by defining another virtual method, eg. Store::isUntypedStackRegion(MR) or something of that nature. But I don't think it's necessarily worth the effort. Let's just special-case AllocaRegion directly.

haoNoQ commented 1 month ago

But, of course, @steakhal has correctly described our desired ultimate solution: get rid of region types entirely so that the crashy guesswork wasn't necessary in the first place.

We already have a way to avoid the guesswork: just pass the type explicitly, as in State->getSVal(MR, QualType). In this code, of course, we can't use that because there's no type to provide in the first place... because the entire memory load operation is completely made up, it does not correspond to anything in the program's source code. Normally, if it did correspond to a real memory load, we'd have the loaded type in the AST which we could have easily passed explicitly. But in our case there's simply no source of truth.

We could try to pass LocAndVal.second.getType(). Which is a different kind of guesswork that may also fail or produce non-sensical results (especially considering that the conjured symbol's type is blindly defined as int during invalidation) so I really don't think it's a good thing to do even if it dodges this specific crash.

SebastianKonplan commented 1 month ago

Btw @SebastianKonplan, I'm just curious, how did you find this code? Did you reduce it from some real-world codebase or its the outcome of some fuzzing effort? The answer may affect how much effort we decide to invest here. To me, it feels like pretty rare to hit in real code. Once you do, ofc, that feels bad :D

Yes, this was reduced from real-world code. As noted in the description we hit this bug three times in our project codebase. We know the issues of alloca and use it with the required care.

pskrgag commented 1 month ago

We already have a way to avoid the guesswork: just pass the type explicitly, as in State->getSVal(MR, QualType)

This is what I came up with as well (https://github.com/pskrgag/llvm-project/commit/808e3b9cf72314a7c342104638df1316f18ee073). In ExprEngine::evalCall we know types of arguments and we can pass them to processPointerEscapedOnBind and then to getSVal. But we don't know them in ExprEngine::evalBind , so I guess, my solution is not full.

NagyDonat commented 1 month ago

<offtopic>

AllocaRegion sucks. But the current direction is to get rid of typed memregions, so making allocaregion typed would defeat that goal. The idea behing having that untyped is that at the allocation (at the alloca) we dont know what type will actually live at that storage. It just gives you a blob of memory.

When I was working with AllocaRegion, I was thinking that perhaps it should be replaced by use of SymbolicRegion (or turned into a subclass of SymbolicRegion). Note that e.g. malloc returns a SymbolicRegion that's within the heap memory space (as opposed to the "opaque pointer" symbolic regions that are in unknown space) -- it would be natural to say that alloca returns a SymbolicRegion that's within the stack frame memory space where it's created.

This could simplify the code (we wouldn't need to have separate case:s for AllocaRegion) and ensure that AllocaRegions have an identity that can be e.g. marked as interesting.

@steakhal @haoNoQ What would you think about this? Do you know about any fundamental difference between alloca regions and e.g. malloc memory blocks that would block this?


Side note, I think the reason why the Case (4) related check was implemented in such a hacky way, was because the MemRegion hierarchy was considered to be an implementation detail of RegionStore.

I'm surprised to hear this because I feel that MemRegion subclasses are frequently mentioned in the checkers that I'm working on (and I definitely didn't try to avoid referring to them).

I think this view largely didn't stand the test of time. The class hierarchy is quite useful for that purpose; it's small and self-contained and very expressive as it gives you exactly the answers that you're looking for.

In this case we can try to keep this abstraction from leaking by defining another virtual method, eg. Store::isUntypedStackRegion(MR) or something of that nature. But I don't think it's necessarily worth the effort. [...]

...and naturally I agree that we shouldn't try to preserve this abstraction. (And perhaps we should refactor old code that's convoluted because it tries to preserve it.) </offtopic>

SebastianKonplan commented 1 week ago

Is there any further progress for this issue?

steakhal commented 1 week ago

Is there any further progress for this issue?

Not as far as I know it.