llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Clang-18 crash when compiled with -fsanitize=dataflow -fwhole-program-vtables -flto -c #74103

Open iamanonymouscs opened 11 months ago

iamanonymouscs commented 11 months ago

Clang-18 with -fsanitize=dataflow -fwhole-program-vtables -flto -c crashes on the test case. Compiler explorer: https://gcc.godbolt.org/z/vf5rGThx5

$ cat mutant.c
typedef struct FILE { int i; } FILE;
#ifdef __cplusplus
extern "C"
#endif
int fprintf (FILE *, const char *, ...);

int
main ()
{
  ((void (*)()) fprintf) ();
  return 0;
}

$ clang-18 -fsanitize=dataflow -fwhole-program-vtables -flto -c mutant.c
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang-18 -fsanitize=dataflow -fwhole-program-vtables -flto -c mutant.c
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x00007f18cec7e266 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0xd2b266)
 #1 0x00007f18cec7c170 llvm::sys::RunSignalHandlers() (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0xd29170)
 #2 0x00007f18cec7d8c4 llvm::sys::CleanupOnSignal(unsigned long) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0xd2a8c4)
 #3 0x00007f18cebcbbb0 (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0xc78bb0)
 #4 0x00007f18d975f980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
 #5 0x00007f18cf707846 (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0x17b4846)
 #6 0x00007f18cf7071fc (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0x17b41fc)
 #7 0x00007f18cf6e796e (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0x179496e)
 #8 0x00007f18cf6e1942 llvm::BitcodeWriter::writeModule(llvm::Module const&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0x178e942)
 #9 0x00007f18cf6ebe35 llvm::WriteBitcodeToFile(llvm::Module const&, llvm::raw_ostream&, bool, llvm::ModuleSummaryIndex const*, bool, std::array<unsigned int, 5ul>*) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0x1798e35)
#10 0x00007f18cf70b999 llvm::BitcodeWriterPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0x17b8999)
#11 0x00007f18d747326d (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x1af026d)
#12 0x00007f18cedf8864 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0xea5864)
#13 0x00007f18d7467d43 (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x1ae4d43)
#14 0x00007f18d7460d52 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x1addd52)
#15 0x00007f18d77f75fe (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x1e745fe)
#16 0x00007f18d6413866 clang::ParseAST(clang::Sema&, bool, bool) (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0xa90866)
#17 0x00007f18d826b645 clang::FrontendAction::Execute() (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x28e8645)
#18 0x00007f18d81e9cc4 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x2866cc4)
#19 0x00007f18d82e61c0 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x29631c0)
#20 0x000056146e380837 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/lib/llvm-18/bin/clang+0x13837)
#21 0x000056146e37d905 (/usr/lib/llvm-18/bin/clang+0x10905)
#22 0x00007f18d7e81909 (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x24fe909)
#23 0x00007f18cebcb94c llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/usr/lib/llvm-18/bin/../lib/libLLVM-18.so.1+0xc7894c)
#24 0x00007f18d7e812ae clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x24fe2ae)
#25 0x00007f18d7e490f1 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x24c60f1)
#26 0x00007f18d7e4933e clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x24c633e)
#27 0x00007f18d7e6539c clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/usr/lib/llvm-18/bin/../lib/libclang-cpp.so.18+0x24e239c)
#28 0x000056146e37d25c clang_main(int, char**, llvm::ToolContext const&) (/usr/lib/llvm-18/bin/clang+0x1025c)
#29 0x000056146e38af32 main (/usr/lib/llvm-18/bin/clang+0x1df32)
#30 0x00007f18cd162c87 __libc_start_main /build/glibc-CVJwZb/glibc-2.27/csu/../csu/libc-start.c:344:0
#31 0x000056146e37a1ea _start (/usr/lib/llvm-18/bin/clang+0xd1ea)
clang-18: error: clang frontend command failed with exit code 139 (use -v to see invocation)
Ubuntu clang version 18.0.0 (++20231018091808+48a53509e851-1~exp1~20231018091910.1571)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang-18: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-18: note: diagnostic msg: /tmp/mutant-39b2dc.c
clang-18: note: diagnostic msg: /tmp/mutant-39b2dc.sh
clang-18: note: diagnostic msg: 

********************
ilovepi commented 11 months ago

Seems like this triggers an assertion failure in InstrTypes.h.

clang: llvm/include/llvm/IR/InstrTypes.h:1503: void llvm::CallBase::setCalledFunction(FunctionType *, Value *): Assertion `getType() == FTy->getReturnType()' failed.
ilovepi commented 11 months ago

@jyknight I see you added this comment and assert, do you have an idea about what's going on here? I assume from skimming the file and the stack trace that there is probably an invalid pointer assigned here and things bumble along until we try to write the bitcode out. Feels like maybe this should be a fatal error, but I haven't figured out the calling context that triggers the assert yet to know if that even makes sense, or if there is just a missing check at the call site.

ilovepi commented 11 months ago

Ah, so this seems related to the fact that fprintf() is special: it is listed in the system ignore list that is passed to -cc1 through the -fsanitize-system-ignorelist= option. Not sure how we'd like to handle that edge case, though.

jyknight commented 11 months ago

The reason I added the assert is that it didn't seem likely to be correct to update the FTy member, but not the Value type of the call. But I didn't know what behavior would actually change if I fixed that. The answer was apparently: nobody ever did that, until now.

Notably, the only other function in that class which updates FTy, mutateFunctionType, does adjust the Value type too.

I don't know if this assert was catching a bug in your code, or if it would've been correct for this API to update the type in your use-case?

ilovepi commented 11 months ago

I'm not sure what use case @iamanonymouscs had. Judging from the file name, I'd hazard a guess that this was generated by some kind of fuzzing. I was just responding to the bug report, since a reproducer is usually straightforward to run down, and ICEs are one of my pet peeves.

DFSan isn't something I've looked into all that much, at least not on an implementation level, so I was hoping you'd have an idea. I'll try to set aside some time on Monday to dig a little deeper and figure out a good solution.

ilovepi commented 11 months ago

So, near as I can tell this is sort of a weird case, since casting a function pointer and calling it like that is UB.

The crash seems to be caused by the interaction of a few things.

First, the above assert fires because fprintf is listed as uninstrumented in the default AbiList file; changing that or the function name will prevent the crash.

If we ignore the assert, as a release build does, we still eventually run into an issue because CanonicalizeAliasesPass will transform the call from:

  call void (...) @fprintf()

into

  call i32 (ptr, ptr, ...) @fprintf()

This causes an error because the called function requires more parameters than were provided. It seems the verifier is disabled by default in the -cc1 command line; otherwise it would fire.

Even without the verifier, though, an assert build would eventually catch this in InstrTypes.h when the operand index is out of range. A release build has no such check, so it instead returns something out of bounds as a pointer value and eventually faults when dereferencing that pointer.

I guess the only thing to do is to try to detect the situation in the DFSan pass, but I'm not sure yet how to do that. Maybe we need to check if the function has an alias before we call setCalledFunction() in visitWrappedCallBase()?