Closed GoogleCodeExporter closed 9 years ago
Ah, interesting.
patch_all_modules_lock is held by -- no surprise -- PatchAllModules, which is
called
at program startup and also called during LoadLibraryEx (we patch LoadLibraryEx
to
call PatchAllModules in order to patch the new module). If LoadLibraryEx
somehow
recursively called another LoadLibraryEx, or in fact just a call to
PatchAllModules
somehow resulted in a call to LoadLibraryEx, that would explain the deadlock
you're
seeing.
But that would be 100% reproducible, and your problem isn't. In fact, it looks
like
your problem is two threads are calling LoadLibraryEx at the same time. That
shouldn't cause deadlock though.
Can you report the stack-traces for the two threads during deadlock? You say
the
thread holding the lock is locked in GetProcAddress -- what is there before
that?
And there's nothing after the call into ntdll.dll?
The fact it freezes on user input is intersting. Maybe windows needs to load a
new
DLL then?
I'm suspicious that the thread holding the lock is frozen in a call to
GetProcAddress. I wonder if that is requiring you to load a new library, which
results in deadlock. I'd expect you to see that in the stacktrace, though.
I can rework this code to lessen the likelihood of deadlock, but I can't
promise to
eliminate it until I understand what's going on.
Original comment by csilv...@gmail.com
on 26 May 2009 at 5:28
I attach stack traces of the two threads, four at the point of the deadlock and
two
at the point of the locking.
It appears to me that
1. Thread 0 does some OLE registration, this causes the loading of a library in a
different thread "4".
2. Thread 4 starts loading the library and patching.
3. Thread 0 continues and runs into a LoadLibraryW. Conjecture: This locks some
internal ntdll.dll ressource.
4. Thread 4 calls GetProcAddress. Conjecture: This waits for the same ressource in
ntdll.dll
5. Thread 0 has loaded the library and wants to start patching. --> Deadlock
Possibly, the threads request two ressources (tcmalloc's and ntdll's) in
different order.
What is implausible in this explanation: When Thread 0 reaches patching, it
still
holds the ntdll resource, while when Thread 4 reaches patching it does not, or
at
least does not prevent Thread 0 from taking it.
Ideas for solutions (I don't know how to implement them or if they are valid):
1. request and keep ntdll's resource before/instaed tcmalloc's
2. make patching thread safe such that no locking is necessary
3. make the GetProcAdress-Part thread save and release tcmalloc's lock before
Is this helpful?
Do you have any idea, how to fix it from tcmalloc's side?
Or how to fix it from my side? Trying to pre-loading all DLLs that could be
possibly
loaded? Making a lock in qt that serialises the OLE and Loadlib actions?
Original comment by weidenri...@gmx.de
on 27 May 2009 at 3:10
Attachments:
Just noticed that the last sentence "Making a lock in qt that serialises the
OLE and
Loadlib actions?" is nonsense: The OLE action initiate the other threads and
exists.
Locking cannot help here.
Original comment by weidenri...@gmx.de
on 27 May 2009 at 3:23
Whew! I finally got a free moment to look at this.
This is tricky. The basic problem, it seems to me, is that thread 0 is calling
LoadLibrary recursively:
---
libtcmalloc_minimal.dll!`anonymous namespace'::PatchAllModules() Zeile 561 + 0xd C++
libtcmalloc_minimal.dll!`anonymous
namespace'::WindowsInfo::Perftools_LoadLibraryExW(const unsigned short *
lpFileName=0x7ffdfc00, void * hFile=0x00000000, unsigned long dwFlags=0) Zeile
875 C++
kernel32.dll!LoadLibraryExA() + 0x1f
kernel32.dll!LoadLibraryA() + 0x2d
usp10.dll!ScriptApplyDigitSubstitution() + 0x959d
usp10.dll!ScriptApplyDigitSubstitution() + 0x9772
usp10.dll!UspFreeMem() + 0x1ca2f
ntdll.dll!LdrInitializeThunk() + 0x24
ntdll.dll!LdrFindResourceDirectory_U() + 0x28d
ntdll.dll!RtlValidateUnicodeString() + 0x507
ntdll.dll!LdrLoadDll() + 0x110
kernel32.dll!LoadLibraryExW() + 0xc8
libtcmalloc_minimal.dll!`anonymous
namespace'::WindowsInfo::Perftools_LoadLibraryExW(const unsigned short *
lpFileName=0x00d7e920, void * hFile=0x00000000, unsigned long dwFlags=0) Zeile
873
+ 0x12 C++
kernel32.dll!LoadLibraryW() + 0x11
qt-mt338.dll!QLibraryPrivate::loadLibrary() Zeile 66 + 0x1a C++
---
This seems to be because usp10 is (like us) putting itself in the middle of
LoadLibraryExW. In this case, it wants to load another library in the middle
of the
existing LoadLibraryExW call. If LoadLibraryExW requires the os loader lock,
which a
cursory web search shows it does (cf
http://blogs.msdn.com/cbrumme/archive/2003/08/20/51504.aspx), than this
recursive
call would deadlock.
However, the recursive call itself isn't deadlocking. So maybe nt has special
code
to handle this, or the os loader lock is a recursive lock? In any case,
GetProcAddress also requires the loader lock, and that's where the lock
inversion
comes in.
I'm not totally sure I understand it, which is too bad because I don't know the
best
way to fix it. I guess the best thing to do is to try to pull the
GetProcAddress
call up, and do it outside the patch_all_functions lock. This is annoying, but
should be doable. I don't know if it will solve the problem or not, though. :-(
And I hate to contort the code this way unless I really have to.
As a temporary workaround, getting rid of patch_all_modules_lock is probably
fine.
You're not loading any modules containing libc, probably, so all of this
LoadLibraryExW stuff is a noop in practice anyway.
Original comment by csilv...@gmail.com
on 29 May 2009 at 7:51
Ok, I've reorganized patch_functions.cc in a way that could possibly fix this
program. Can you try applying the patch I've attached here, and see how it
works?
(It should apply cleanly, I hope.) I've barely tested it, but it seems to at
least
not blow up.
Original comment by csilv...@gmail.com
on 29 May 2009 at 10:02
Attachments:
The patch was applied with some "fuzz", but solved the problem. Do you have
plans for
a next release of tcmalloc?
A note about your interpretation: Isn't LoadLibraryW (the unicode version) just
calling usp10 to do the unicode->ascii conversion. And this calls some
LoadLibraryA
again (possibly for internal purposes). So, an explanation for the
implausibility
mentioned above ("When Thread 0 reaches patching, it still
holds the ntdll resource, while when Thread 4 reaches patching it does not.")
could
be: Thread 0 reaches patching from LoadLibraryA, but the LoadLibraryW is
holding the
lock.
Anyway, thanks for the fix!
Original comment by weidenri...@gmx.de
on 2 Jun 2009 at 8:42
Great, I'm glad the patch works! I'm looking to get a new release out this
month
sometime, but I don't know more details than that. This change will be in it.
Original comment by csilv...@gmail.com
on 2 Jun 2009 at 5:42
This should be fixed in perftools 1.3, just released.
Original comment by csilv...@gmail.com
on 10 Jun 2009 at 2:06
Unfortunately, I got a deadlock in the same situation again. :-(
It's with v1.4, but I cannot say that it has to do with the version. Currently,
I
cannot reproduce it in a way that I get a stack trace, but I keep trying.
Original comment by weidenri...@gmx.de
on 6 Oct 2009 at 12:29
Drat. I'll reopen the bug. If you can get a stacktrace, let me know, and I'll
see
if I can figure out what's going on.
Original comment by csilv...@gmail.com
on 13 Oct 2009 at 9:25
Finally I got a stack trace, it was the same situation, but probably a different
problem. So, please close this issue, I will open a new one.
Original comment by weidenri...@gmx.de
on 11 Nov 2009 at 6:21
Closing this issue, as requested. :-) Only 4 months late...
Original comment by csilv...@gmail.com
on 10 Mar 2010 at 6:07
Original issue reported on code.google.com by
weidenri...@gmx.de
on 26 May 2009 at 1:10