Quuxplusone / LLVMBugzillaTest

lli -force-interpreter tries to lookup symbols that still have \x01 in their name #5978

Open Quuxplusone opened 14 years ago

Quuxplusone commented 14 years ago
Bugzilla Link PR5480
Status NEW
Importance P normal
Reported by Timo Juhani Lindfors (timo.lindfors@iki.fi)
Reported on 2009-11-13 07:14:01 -0800
Last modified on 2010-03-01 17:06:19 -0800
Version unspecified
Hardware PC Linux
CC daramos@stanford.edu, llvm-bugs@lists.llvm.org, paul@floorball-flamingos.nl
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
Steps to reproduce:
1) cat > testcase.c <<EOF
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    struct stat fileinfo;

    stat("/etc/motd", &fileinfo);
    printf("%llu\n", (unsigned long long)fileinfo.st_ino);
    return 0;
}
EOF
2) llvm-gcc -c -o testcase.bc -emit-llvm -O2 testcase.c
3) lli -force-interpreter testcase.bc

Expected results:
3) lli runs the program and it prints the inode number of /etc/motd

Actual results:
3) LLVM ERROR: Tried to execute an unknown external function: i32 (i32, i8*, { i64, i16, i32, i32, i32, i32, i32, i64, i16, i64, i32, i64, { i32, i32 }, { i32, i32 }, { i32, i32 }, i64 }*)* __xstat64

More info:
1) lli tries to dlsym "\x01__xstat64", but \x01 is a non-printing character, so
it is not visible in the error message on a terminal.
2) distro is Debian stable
3) llvm revision is 86985
4) llvm-gcc revision is 86986
5) A very crude patch to fix the issue is

Index: lib/System/DynamicLibrary.cpp
===================================================================
--- lib/System/DynamicLibrary.cpp       (revision 86985)
+++ lib/System/DynamicLibrary.cpp       (working copy)
@@ -70,6 +70,9 @@
 }

 void* DynamicLibrary::SearchForAddressOfSymbol(const char* symbolName) {
+  if (symbolName && symbolName[0] == '\1') {
+      symbolName++;
+  }
   // First check symbols added via AddSymbol().
   if (ExplicitSymbols) {
     std::map<std::string, void *>::iterator I =

but I don't yet understand enough LLVM to figure out where this really should
be done. In Mangler perhaps?
Quuxplusone commented 14 years ago

This check should go in the client of SearchForAddressOfSymbol, not in the implementation. For example, the JIT does this dance in its JIT::getPointerToNamedFunction function.
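
A minimal sketch of what "in the client" could look like, assuming the client simply drops a leading '\1' before it asks the resolver for an address. The names resolveSymbol, stripVerbatimPrefix, and lookupExternal are hypothetical and are not LLVM APIs; resolveSymbol is a stand-in for a resolver such as SearchForAddressOfSymbol and only knows one symbol.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a symbol resolver such as
// DynamicLibrary::SearchForAddressOfSymbol. It only knows "__xstat64".
static int fakeXstat64Target = 0;
void *resolveSymbol(const std::string &name) {
  return name == "__xstat64" ? static_cast<void *>(&fakeXstat64Target)
                             : nullptr;
}

// Drop the leading '\1' marker if present; otherwise return the name as-is.
std::string stripVerbatimPrefix(const std::string &name) {
  if (!name.empty() && name[0] == '\1')
    return name.substr(1);
  return name;
}

// Client-side lookup: normalize the IR-level name first, then resolve it.
// The resolver itself stays unaware of the '\1' convention.
void *lookupExternal(const std::string &irName) {
  assert(!irName.empty() && "sketch assumes a non-empty function name");
  return resolveSymbol(stripVerbatimPrefix(irName));
}
```

With this shape, lookupExternal("\1__xstat64") and lookupExternal("__xstat64") resolve to the same address, while the resolver alone would fail on the prefixed name.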

Quuxplusone commented 14 years ago
Curious, what's so special about the \x01 character? Isn't this issue caused by
some kind of pointer having the wrong address?

22:54|melis@juggle2:~> valgrind ~/llvm2.6-debug/bin/lli -force-interpreter doh.bc
==24052== Memcheck, a memory error detector
==24052== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==24052== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==24052== Command: /home/melis/llvm2.6-debug/bin/lli -force-interpreter doh.bc
==24052==
==24052== Invalid read of size 4
==24052==    at 0x40173D3: ??? (in /lib/ld-2.9.so)
==24052==    by 0x42918A3: ??? (in /lib/libc-2.9.so)
==24052==    by 0x4291C79: _dl_sym (in /lib/libc-2.9.so)
==24052==    by 0x4068E07: ??? (in /lib/libdl-2.9.so)
==24052==    by 0x400E965: ??? (in /lib/ld-2.9.so)
==24052==    by 0x40690FB: ??? (in /lib/libdl-2.9.so)
==24052==    by 0x4068D92: dlsym (in /lib/libdl-2.9.so)
==24052==    by 0x8B548FC: llvm::sys::DynamicLibrary::SearchForAddressOfSymbol(char const*) (DynamicLibrary.cpp:82)
==24052==    by 0x8816B32: llvm::sys::DynamicLibrary::SearchForAddressOfSymbol(std::string const&) (DynamicLibrary.h:62)
==24052==    by 0x8813F55: lookupFunction(llvm::Function const*) (ExternalFunctions.cpp:107)
==24052==    by 0x881408F: llvm::Interpreter::callExternalFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&) (ExternalFunctions.cpp:257)
==24052==    by 0x880AC9B: llvm::Interpreter::callFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&) (Execution.cpp:1306)
==24052==  Address 0x42ef6ac is 28 bytes inside a block of size 29 alloc'd
==24052==    at 0x40262CE: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==24052==    by 0x410053F: std::string::_Rep::_S_create(unsigned int, unsigned int, std::allocator<char> const&) (in /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libstdc++.so.6.0.10)
==24052==    by 0x4100FE7: std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned int) (in /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libstdc++.so.6.0.10)
==24052==    by 0x4101F9B: std::string::reserve(unsigned int) (in /usr/lib/gcc/i686-pc-linux-gnu/4.3.4/libstdc++.so.6.0.10)
==24052==    by 0x860C425: std::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char const*, std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (basic_string.tcc:675)
==24052==    by 0x8813F47: lookupFunction(llvm::Function const*) (ExternalFunctions.cpp:107)
==24052==    by 0x881408F: llvm::Interpreter::callExternalFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&) (ExternalFunctions.cpp:257)
==24052==    by 0x880AC9B: llvm::Interpreter::callFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&) (Execution.cpp:1306)
==24052==    by 0x880B3D3: llvm::Interpreter::visitCallSite(llvm::CallSite) (Execution.cpp:902)
==24052==    by 0x8812011: llvm::Interpreter::visitCallInst(llvm::CallInst&) (Interpreter.h:166)
==24052==    by 0x881203B: llvm::InstVisitor<llvm::Interpreter, void>::visitCall(llvm::CallInst&) (Instruction.def:162)
==24052==    by 0x8812827: llvm::InstVisitor<llvm::Interpreter, void>::visit(llvm::Instruction&) (Instruction.def:162)
==24052==
LLVM ERROR: Tried to execute an unknown external function: i32 (i32, i8*, { i64, i16, i32, i32, i32, i32, i32, i64, i16, i64, i32, i64, { i32, i32 }, { i32, i32 }, { i32, i32 }, i64 }*)* __xstat64
Quuxplusone commented 14 years ago
The \x01 prefix is already present in the bitcode file:

%0 = call i32 @"\01__xstat64"(i32 3, i8* getelementptr inbounds ([10 x i8]* @.str, i32 0, i32 0), %struct.stat* %fileinfo) nounwind ; <i32> [#uses=0]

so I don't think we can blame the interpreter. Mangler.cpp has some code to
detect this prefix, so I assume that the presence of the prefix is not an error.
However, there really should be a #define for it so that it would be easier
to find in the GCC, Clang, and LLVM source code.
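
As a rough sketch of that suggestion, a single named constant plus a helper that makes the byte visible in diagnostics might look like the following. Every name here (kVerbatimNamePrefix, hasVerbatimPrefix, escapeForDiagnostic) is hypothetical and none of them exists in the GCC, Clang, or LLVM sources as far as this thread shows.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical named constant for the marker discussed in this report.
constexpr char kVerbatimNamePrefix = '\1';

// True if the symbol name starts with the marker byte.
bool hasVerbatimPrefix(const std::string &name) {
  return !name.empty() && name[0] == kVerbatimNamePrefix;
}

// Render non-printable bytes as \xNN so that an error message shows them.
// Printable bytes, including a literal backslash, are copied unchanged.
std::string escapeForDiagnostic(const std::string &name) {
  std::string out;
  for (unsigned char c : name) {
    if (std::isprint(c)) {
      out += static_cast<char>(c);
    } else {
      char buf[5]; // room for '\\', 'x', two hex digits, and the terminator
      int n = std::snprintf(buf, sizeof buf, "\\x%02x",
                            static_cast<unsigned>(c));
      assert(n == 4);
      (void)n;
      out += buf;
    }
  }
  return out;
}
```

Passing the failing name through escapeForDiagnostic before printing would turn the misleading "__xstat64" in the error message into "\x01__xstat64".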
Quuxplusone commented 14 years ago
\01 is added in llvm-gcc as a result of a series of glibc ifdefs surrounding
large file (64-bit offset) support on 32-bit platforms.

In gcc/varasm.c:

set_user_assembler_name (tree decl, const char *name)
{
...
   /* If the name isn't an LLVM intrinsic, add a starting '\1' character to
      indicate that the target assembler shouldn't modify the name.  If it *is*
      an LLVM intrinsic name, just set the name, to support code like this:
          unsigned bswap(unsigned) __asm__("llvm.bswap");  */
...
}

It appears that this character should be stripped out later, but this isn't
occurring. I'm looking into a cause/fix...
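
For readers unfamiliar with the mechanism, here is a small sketch of a GNU asm label, the same kind of declaration as the bswap example in the comment above. It assumes GCC or Clang, since asm labels and __USER_LABEL_PREFIX__ are compiler extensions, and lengthViaAsmLabel is a hypothetical name chosen only for illustration. The declaration gives the function one name in the source and a different, user-chosen symbol name in the object file.

```cpp
#include <cassert>
#include <cstddef>

// Two-step stringification so that __USER_LABEL_PREFIX__ is expanded first.
// The prefix is empty on typical Linux/ELF targets and "_" on some others.
#define ASM_NAME_STR2(x) #x
#define ASM_NAME_STR(x) ASM_NAME_STR2(x)

// Hypothetical function name. The asm label tells the compiler to use the
// symbol "strlen" (plus the platform label prefix) for every reference to it.
extern "C" std::size_t lengthViaAsmLabel(const char *s)
    __asm__(ASM_NAME_STR(__USER_LABEL_PREFIX__) "strlen");

// Calls the C library's strlen through the renamed declaration.
std::size_t motdPathLength() {
  std::size_t n = lengthViaAsmLabel("/etc/motd");
  assert(n == 9);
  return n;
}
```

Because the programmer has chosen the symbol name explicitly, the front end has to tell later stages not to mangle it again, which is the job the '\1' marker does in the bitcode.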
Quuxplusone commented 14 years ago

Alright, I agree with Chris Lattner's assessment now. The '\1' sentinel has to stay in the LLVM bitcode until target code is generated. In the case of the interpreter, it's the interpreter's responsibility to fix the naming at runtime. This needs to be done in the KLEE project as well, which is how I stumbled upon the problem.