Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

RTDyldMemoryManager::getSymbolAddressInProcess not working on Windows #28698

Open Quuxplusone opened 8 years ago

Quuxplusone commented 8 years ago
Bugzilla Link PR28699
Status REOPENED
Importance P normal
Reported by James Holderness (j4_james@hotmail.com)
Reported on 2016-07-25 15:34:51 -0700
Last modified on 2018-05-25 11:36:53 -0700
Version trunk
Hardware PC Windows NT
CC 6yearold@gmail.com, dblaikie@gmail.com, lhames@gmail.com, llvm-bugs@lists.llvm.org, rafael@espindo.la
Fixed by commit(s)
Attachments getSymbolAddressInProcess.patch (1508 bytes, text/plain)
MCJIT.cpp.patch (2618 bytes, text/plain)
RTDyldMemoryManager fix.patch (6164 bytes, text/plain)
Unmangled getSymbolAddressInProcess.patch (13297 bytes, text/plain)
Unmangled getSymbolAddressInProcess 2.patch (12850 bytes, text/plain)
Blocks
Blocked by
See also

Originally the RTDyldMemoryManager::getSymbolAddressInProcess method included code that demangled the symbol name (i.e. removed the underscore prefix) on all platforms, but that behaviour was changed in revision 262657 so it's now only applied on the APPLE platform. This change seems to have caused the method to fail on Windows.

For example, when testing the BrainF example app, I was able to get a sample helloworld script to compile to bitcode, but when I tried to execute that bitcode with the lli interpreter, it produced the error:

LLVM ERROR: Program used external function '_free' which could not be resolved!

Once I updated the code that strips the underscore so that it was applied on both the APPLE and _WIN32 platforms, the interpreter was then able to execute the bitcode successfully.

That said, I'm not sure that's necessarily the correct solution. Is it possible there may be cases where the name genuinely does need to be left with the underscore intact on Windows? The original code tried to lookup the name both with and without the prefix, which may be the safer option.

In case it makes any difference, I'm testing with Visual Studio 2015 Community Edition on Windows 10 and the LLVM suite was built from trunk (currently revision 276679) with cmake -G "Visual Studio 14 2015"

Quuxplusone commented 8 years ago
In case it's not obvious, the exact bit of code I'm referring to is here:
https://reviews.llvm.org/rL262657#C38049635NL269

My solution was to change:

  #ifdef __APPLE__

to:

  #if defined(__APPLE__) || defined(_WIN32)
Quuxplusone commented 8 years ago
Hi James,

My (very limited) understanding of windows name mangling is that it is more
complex than Darwin or Linux. See e.g. https://msdn.microsoft.com/en-
us/library/56h2zst2.aspx

We probably (eventually) want something like:

#if (__APPLE__)
  // strip leading underscore
#elif (_WIN32)
  win32demangle(sym);
#endif

I think your change is a step in the right direction though. Do you have a
patch that you'd like to share?

Do you have commit access?

Cheers,
Lang.
Quuxplusone commented 8 years ago

I've had a chance to investigate this further, and I'm now of the opinion that there's not much point in trying to support further demangling of symbol names on Windows (I mean other than underscore demangling proposed above).

As far as I can tell, the DynamicLibrary::SearchForAddressOfSymbol method will only lookup functions from a fixed set of DLLs - those that are imported by the lli interpreter. The majority of those are Windows APIs, and when called from from C code compiled with clang, those APIs end up as dllimport references. Those dllimport references then translate to indirect calls in the assembly, where the mangled symbol is a reference to the import table entry rather than the function itself. So demangling that symbol to lookup the address of the function is pointless, and in fact will crash, because it's really expecting a memory address.

The end result is that SearchForAddressOfSymbol is only going to work with function calls that are statically linked (i.e. no dllimport), and which are also found in the fixed set of DLLs imported by the lli interpreter. And I think that just leaves you with C runtime library functions which are probably all cdecl anyway (and thus handled by the simple underscore demangling), or possibly C++ runtime library references which aren't meant to be demangled.

That's not to say that SearchForAddressOfSymbol couldn't be extended to support dllimport references, but that seems like a separate issue to me, and may well be considered out of scope for lli.

Quuxplusone commented 8 years ago
Hi James,

Thanks so much for the investigation!

"As far as I can tell, the DynamicLibrary::SearchForAddressOfSymbol method will
only lookup functions from a fixed set of DLLs"

I think it should also search anything loaded dynamically via
DynamicLibrary::LoadLibraryPermanently (assuming that is supported on Windows).

"The end result is that SearchForAddressOfSymbol is only going to work with
function calls that are statically linked (i.e. no dllimport), and which are
also found in the fixed set of DLLs imported by the lli interpreter. And I
think that just leaves you with C runtime library functions which are probably
all cdecl anyway (and thus handled by the simple underscore demangling), or
possibly C++ runtime library references which aren't meant to be demangled."

Perhaps we need to re-think the semantics of DynamicLibrary
SearchForAddressOfSymbol on Windows. On *nix (including Linux and Darwin)
SearchForAddressOfSymbol is meant to be analogous to the dlsym library
function: it searches for symbols based on their C symbol name. If DLLs on
windows don't have a universal mangling rule then "search based on C symbol
name" is difficult (or perhaps impossible) to implement with the existing
interface. I'll ask some of the Windows experts in IRC tomorrow.

For now, would you be happy with me committing your existing fix?
Quuxplusone commented 8 years ago
> I think it should also search anything loaded dynamically via
> DynamicLibrary::LoadLibraryPermanently (assuming that is supported on
> Windows).

That does work on Windows, but the only way I could see that being called from
lli is via the -load command line option, and the documentation for -load says
it "causes lli to load the plugin (shared object) named pluginfilename and use
it for optimization." So while technically that is a way to get an additional
DLL into the search set, that doesn't sound to me like it was intended to be
used that way. Am I misreading that?

In any event, if you are going to have to pass in extra command line parameters
to get lli to execute your input, I would think it might make more sense (on
Windows at least) to link in the import library for the DLL (via the -extra-
archive option) rather than the DLL itself. That would in theory avoid the
whole name demangling issue altogether, since the import library already has
all the information required to map mangled names to their corresponding DLL
and export names.

That said, the MCJIT::findSymbol implementation doesn't appear to support
import libs at the moment, so that's a non-starter for now. But longer term
that seems like it might be a better approach to take.

> Perhaps we need to re-think the semantics of DynamicLibrary
> SearchForAddressOfSymbol on Windows. On *nix (including Linux and Darwin)
> SearchForAddressOfSymbol is meant to be analogous to the dlsym library
> function: it searches for symbols based on their C symbol name.

I think I need to get a better understanding of how things works on the *nix
side before I can voice a useful opinion on this. I'm still not really sure
what exactly is expected of lli on a given set of inputs. But I suspect Windows
is always going to end up being more complicated because of its need for an
import library when linking to a DLL (while I believe *nix just uses the .so
directly).

> For now, would you be happy with me committing your existing fix?

Yep, I'm happy for you to commit if nobody else has objected. It may not be the
perfect long term solution, but it at least gets the basic C libs working.
Quuxplusone commented 8 years ago
Hi James,

> Yep, I'm happy for you to commit if nobody else has objected. It may not be
the
> perfect long term solution, but it at least gets the basic C libs working.

Sounds good to me. Committed in r279016, with minor modifications (I added a
check for !defined(_WIN64), since the leading '_' mangling appears to only
apply to 32-bit apps). Hope that's ok - if that change affects your use-case
let me know and we'll dig deeper.

Thanks again for working on this!

- Lang.
Quuxplusone commented 8 years ago

Re-opening due to http://bb.pgr.jp/builders/ninja-clang-i686-msc19-R/builds/5963

Looks like this breaks calls to chkstk. I'm investigating now, and will revert shortly if I don't see an obvious fix.

Quuxplusone commented 8 years ago
Reverted in r279029: this breaks win32-elf tests (see the triple for the
failing tests at http://bb.pgr.jp/builders/ninja-clang-i686-msc19-
R/builds/5963).

The right solution to this is *probably* to deprecate the win32-elf hack, but
it's not clear how many people are using this. This will require some more
thought...
Quuxplusone commented 8 years ago
First off, I must apologise for giving you a patch that broke the build. I
should have caught that.

Having looked into the cause of the failure, though, I think that area of the
code is still technically broken, even with the patch reverted. The problem is
we're trying to demangle the symbol names based on the architecture of the lli
build, when it should be based on the target triple.

For example, I can produce the same kind of error on Linux by calling lli with
the target triple set like this:

  lli -mtriple=x86_64-unknown-linux-gnu-macho

And I suspect you might get it to fail similarly on a Mac with a target triple
looking something like this:

  lli -mtriple=x86_64-apple-darwin-gnu-elf

It's just a matter of overriding the target object format (coff/elf/macho) in
such a way that the mangling mode is different from the build default. This may
not be a problem in practice, and maybe you want to deprecate that sort of
thing eventually, but it does suggest that something isn't quite right with the
current approach.

Now the simplest solution could just be to go back to the old
getSymbolAddressInProcess implementation which attempted to resolve the name
both with and without the underscore prefix removed. That way it doesn't have
to care what kind of mangling the target architecture is performing.

Obviously that's not ideal though - it's inefficient, and could potentially
cause symbols to be resolved incorrectly (which I believe is why you changed
this code in the first place). However, it's worth mentioning that this is
already a problem in the LinkingSymbolResolver::findSymbol implementation which
uses the same kind of hack.

That brings me to the second possible solution. The LinkingSymbolResolver has
access to the target machine's DataLayout, and from that it can lookup the
GlobalPrefix with which the symbol would have been mangled (if any). In theory
it should then be perfectly capable of demangling the name without the need for
any of this guess work.

We'd still then need to extend the JITSymbolResolver interface to pass in the
mangling prefix so it would also know how to demangle accurately. Otherwise we
could change the semantics of the resolver so we can pass in the symbol name
already demangled. Either way would potentially have backwards compatibility
issues, but this seems a better approach in the long-term.

What do you think?
Quuxplusone commented 8 years ago
> First off, I must apologise for giving you a patch that broke the build. I
should have caught that.

No worries. If you're working on Windows it's likely that this failure wouldn't
have shown up when you tested. That's a common situation for all LLVM
developers. We rely on the buildbots to let us know when we've broken configs
we don't have access to.

> The problem is we're trying to demangle the symbol names based on the
architecture of the lli build, when it should be based on the target triple.

Your analysis of the problem is spot on.

The underlying issue, which is intended behavior, is the use of mangled names
for lookup. This allows JIT symbol resolution rules to match those in the
static linker, which is ideal for guaranteeing that JIT'd code behavior matches
that of ahead-of-time compiled behavior. As you've identified however, this
falls flat in getSymbolAddressInProcess if the JIT'd code's mangling rules
don't match the host platform.

We could thread a DataLayout reference through symbol lookup, but I want to
avoid the overhead if possible. I think we can solve this by redefining
getSymbolAddressInProcess to take a demangled name (It's supposed to be a
wrapper around dlsym anyway, so that's a saner API), and having the user wrap
the symbol name in an explicit demangler call. So instead of:

if (auto Sym = SearchJIT(Name))
  return Sym;
return getSymbolAddressInProcess(Name);

We would have:

if (auto Sym = SearchJIT(Name))
  return Sym;
return getSymbolAddressInProcess(Demangle(DL, Name));

Any interest in writing a patch for this? :)

I'm out on vacation this week, but happy to review patches this or fix it
myself when I get back next week.

Cheers,
Lang.
Quuxplusone commented 8 years ago

Just so you know, I have started looking into this, but it's turned out to be a little more complicated than I initially thought. If you get back next week and I haven't made any progress, I don't mind if you just want to get on with it and fix it yourself. It'll have been a good learning experience for me either way.

But from what I've been able to figure out so far, the places that are calling getSymbolAddressInProcess don't always have access to the DataLayout, so they wouldn't necessarily be in a position to demangle the name at that point either. That has made me think we'll probably need to handle the demangling further up the call stack.

Anyway, once I'm a little more certain of myself, I'd be happy to try putting together a patch. It might just take me a while to get that far.

Quuxplusone commented 8 years ago

Attached MCJIT.cpp.patch (2618 bytes, text/plain): Proposed fix for MCJIT::findSymbol mangling inconsistency

Quuxplusone commented 8 years ago

Hi James,

Sorry for the delay in reviewing this, and thanks for being brave enough to venture into this thicket - I didn't realise how bad it was.

I'll check the patch out and have a review for you by the end of the week.

Quuxplusone commented 8 years ago

Attached getSymbolAddressInProcess.patch (1508 bytes, text/plain): Proposed fix for getSymbolAddressInProcess on Windows

Quuxplusone commented 7 years ago

Attached RTDyldMemoryManager fix.patch (6164 bytes, text/plain): Proposed fix for getSymbolAddressInProcess on Windows

Quuxplusone commented 7 years ago

Hi James,

I'm just running through your previous patch - all tests pass, but I want to make sure I really understand the changes.

Regarding getSymbolAddressInProcess, I think demangling should be the caller's job. That way getSymbolAddressInProcess matches the behavior of dlsym. We'll need to add a demangle function that callers can use, and that could take an optional datalayout (if we want to support 'best-guess' demangling) or a mandatory one (if we determine that all callers can be reasonably expected to have access to a data layout).

Quuxplusone commented 7 years ago

Ok. The MCJIT::findSymbol mangling inconsistency fix looks good to me. I've committed it in r281238. Thanks again for that!

I don't have time to work on the getSymbolAddressInProcess problems just yet, but I'm happy to review patches. I'll try to turn the next ones around a little more quickly - I appreciate your patience.

I'll be in #llvm on irc.oftc.net as 'lhames' too if you want to chat about any of this stuff in real time too.

Thanks again for all the help. :)

Quuxplusone commented 7 years ago

Attached Unmangled getSymbolAddressInProcess.patch (13297 bytes, text/plain): Alternative fix for getSymbolAddressInProcess on Windows

Quuxplusone commented 7 years ago

Attached Unmangled getSymbolAddressInProcess 2.patch (12850 bytes, text/plain): Alternative fix for getSymbolAddressInProcess on Windows

Quuxplusone commented 7 years ago

Another attempt to fix the getSymbolAddressInProcess mangling issue, this time making the caller handle the demangling.

Thanks again! :)

I don't know if you got my message on IRC, but you should be aware that if we take this approach, there's going to be some inconsistency in the semantics of the various RTDyldMemoryManager methods

I'm afraid I missed it. That's ok: We should just document the inconsistency, or perhaps deprecate getSymbolAddressInProcess and create a new free function instead.

I wasn't sure where best to put the demangling code, so for now I've added it to the Mangler class.

I don't have strong feelings here. The Mangler class is the right place to put this if demangling is unambiguous, and would be useful to other people. Otherwise we can introduce a JITDemangler class.

Rafael - I think you did the most recent work on the Mangler, do you have any thoughts?

Also not sure what the best practice is for string types in this situation, but I went with StringRefs for both the input and the return value. This seemed like it would be most efficient and easy to use, but I suspect there could be some situations where it could be unsafe if called with a temporary.

So long as demangling just involves dropping a prefix/suffix (which in practice I think is always true) I think StringRef is fine. I'll just add documentation to the functions explaining the lifetime guarantees.

Also haven't been able to test all the Kaleidoscope examples because the existing code is already a bit broken. Will file a separate bug report for those issues.

Did you file a bug for that? I'd be happy to look into the issue.

Finally, I wasn't sure what to do with the LookupWithMangledAndDemangledSymbol unit test since most of what it was testing no longer applies. For now I've updated it so that the test should pass, but we may just want to get rid of it altogether.

Yes - I think this can go away if/when your changes go in.

Pending Rafael's comments (and me adding some comments) I think this looks great. Thanks James.

Quuxplusone commented 7 years ago
> > Also haven't been able to test all the Kaleidoscope examples
> Did you file a bug for that? I'd be happy to look into the issue.

The main problem I had with Kaleidoscope was that nothing worked on Windows,
but there was already a bug raised for that (25493), and my proposed solution
was the same as the fix you committed and then reverted in January (rev
258665). Do you have a specific plan in mind for fixing that still, or would
you be interested in alternative proposals?

Another issue I had was with the Remote-JIT example not working on Linux (it's
not supported at all on Windows, but that is a known limitation I think). I
raised bug 30417 for the Linux issue along with a proposed patch.

I think there was at least one other problem with the Windows code, but until
bug 25493 is addressed there didn't seem much point in trying to fix anything
else.

I also encountered an issue with the interpreter's address resolution which I
raised as bug 30584 along with a proposed patch. Not sure if that is also your
department.
Quuxplusone commented 6 years ago

I stumbled upon this bug in a simple scenario:

On Windows i386 I compiled a simple C program with -S -emit-llvm and piped it directly into lli.exe. This resulted in errors like

LLVM ERROR: Program used external function '_puts' which could not be resolved!

or

LLVM ERROR: Program used external function '___stdio_common_vsprintf' which could not be resolved!

In other words, all symbols were prefixed with additional underscore. However, applying a patch from this PR fixed it. Hoping this will be committed soon.