zhangshipei / xuggle

Automatically exported from code.google.com/p/xuggle

Xuggler causes deadlock on native libraries loading #244

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
From the helpful JVM devs:

"
I just realized this is the issue you posted back in April - that was the one I 
was remembering, that and Xuggle. Based on this:

http://lists.apple.com/archives/java-dev/2009/Jun/msg00395.html

I'd say Xuggle is the prime suspect here.
"

More info:

"
Yes (the crash was the VM detecting the deadlock itself). Here's the stack as 
reported above:

1 libclient.dylib 4716923 SafepointSynchronize::block (JavaThread*) + 619
2 libclient.dylib 6027802 jni_NewStringUTF + 394
3 libxuggle-ferry.3.dylib 0x1aceb8c3 com::xuggle::ferry::Logger::getLogger(char const*) + 147
4 libxuggle-ferry.3.dylib 0x1acebedd com::xuggle::ferry::Logger::getStaticLogger(char const*) + 29
5 libxuggle-xuggler-io.3.dylib 0x171e6fbc __static_initialization_and_destruction_0(int, int) + 44
6 dyld 0x8fe12f36 ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 246
7 dyld 0x8fe0e7e3 ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 307
8 dyld 0x8fe0e8c9 ImageLoader::runInitializers(ImageLoader::LinkContext const&) + 57
9 dyld 0x8fe02202 dyld::runInitializers (ImageLoader*) + 34
10 dyld 0x8fe0bbdd dlopen + 605
11 libSystem.B.dylib 0x916ac2c2 dlopen + 66
12 libclient.dylib 0x0060d6f1 JVM_LoadLibrary + 193
13 libjava.jnilib 0x00061d74 Java_java_lang_ClassLoader_00024NativeLibrary_load + 87

We start in Java, try to load a native library, and enter the dynamic linker.
That in turn causes native code in libxuggle-xuggler-io.3 to run, which tries to
invoke Java code: com::xuggle::ferry::Logger::getStaticLogger

If this is done while holding a lock in the dynamic linker, then we will
deadlock if any other thread tries to acquire that lock and a safepoint is
triggered, which in the above case is done by the thread holding that lock.

Your example is not so obvious as we can't see the attempted call back into the 
VM from the linker.

Even if there were no lock involved, the VM is not reentrant and can't be used 
as depicted. All calls out to dlopen etc would have to be handled as JNI calls 
(I'm not even sure that would be enough), but as dlopen might be called through 
some other native library call, all calls to the OS or libc would have to be 
treated as JNI calls - and that would be a huge overhead on the VM.
"

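The stack above shows a static initializer in the IO library running inside dlopen and reaching jni_NewStringUTF. One common way to avoid that pattern is to defer the logger lookup until first use. The following is only a minimal sketch under stated assumptions: `Logger`, `make_logger`, `io_logger`, and `g_vm_calls` are hypothetical names, the "VM call" is simulated by a counter, and this is not the actual Xuggler code or the fix that was committed.

```cpp
#include <cassert>
#include <string>

// Hypothetical counter standing in for "native code entered the VM"
// (for example through a JNI call such as NewStringUTF).
static int g_vm_calls = 0;

// Hypothetical logger type; not the real com::xuggle::ferry::Logger.
struct Logger {
  std::string name;
};

// Stand-in for a logger lookup that has to call into the JVM.
static Logger make_logger(const char* name) {
  assert(name != nullptr);
  ++g_vm_calls;  // real code would make its JNI call here
  return Logger{name};
}

// Risky pattern, shown only as a comment: a namespace-scope object is
// constructed by the dynamic linker while dlopen() is still running.
//   static Logger g_logger = make_logger("io");

// Deferred pattern: a function-local static is constructed the first time
// the function is called, not when the library is loaded.
Logger& io_logger() {
  static Logger instance = make_logger("io");
  return instance;
}
```

With this shape, loading the library performs no simulated VM call; the first `io_logger()` call performs exactly one, and later calls reuse the same object. It only helps if nothing calls `io_logger()` from another static initializer or from the load handler itself.
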
An update from Art:

"
Looking at the thread in more detail it's asking can the following happen with 
Xuggler:

java -> call into Xuggler -> call into Java

during the first library load.  I thought I had fixed this bug, but alas it
looks like not for the IO library (the other Xuggler libraries have it fixed).
The basic problem here is calling logging methods from native code during the
OnLoad handler.  The fix is non-trivial, but I'll try to do it.
.
.
.
And in a fit of madness, I went ahead and created a fix for the issue I saw.  I 
don't know if it fixes your issue, but revision r1041 might help:
http://code.google.com/p/xuggle/source/detail?r=1041

Bonus cookie points to anyone who can figure out why that change mattered.
"

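Another way to avoid logging into Java during the load handler is to buffer messages until it is safe to deliver them. The sketch below is an assumption for illustration only: `DeferredLog`, `mark_ready`, and `deliver` are hypothetical names, delivery is simulated with an in-memory vector instead of a JNI call, the class is not thread-safe, and nothing here is claimed to match what r1041 changed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch; none of these names come from Xuggler.
class DeferredLog {
 public:
  // May be called at any time. Until mark_ready() has run, the message is
  // only stored, so no call into Java is attempted.
  void log(const std::string& msg) {
    if (ready_) {
      deliver(msg);
    } else {
      pending_.push_back(msg);
    }
  }

  // Intended to be called once, after library loading has completed.
  // Flushes everything that was buffered in the meantime.
  void mark_ready() {
    assert(!ready_ && "mark_ready() is expected to be called once");
    ready_ = true;
    for (const std::string& msg : pending_) deliver(msg);
    pending_.clear();
  }

  const std::vector<std::string>& delivered() const { return delivered_; }
  std::size_t pending_count() const { return pending_.size(); }

 private:
  // Stand-in for the real call into the Java logging layer.
  void deliver(const std::string& msg) { delivered_.push_back(msg); }

  bool ready_ = false;
  std::vector<std::string> pending_;
  std::vector<std::string> delivered_;
};
```

A message logged before `mark_ready()` stays in the pending list and is delivered in order once `mark_ready()` runs; messages logged afterwards are delivered immediately.
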
Original issue reported on code.google.com by Stas.Os...@gmail.com on 13 Jun 2010 at 7:23

GoogleCodeExporter commented 9 years ago
Sorry for the bad formatting; it might be better split into comments.

Please move this issue to the critical level.

Original comment by Stas.Os...@gmail.com on 13 Jun 2010 at 7:24

GoogleCodeExporter commented 9 years ago
Stas, can you provide a new stack trace for the issue you're now seeing, and let
me know if you're using the tip of tree?

Original comment by art.cla...@gmail.com on 13 Jun 2010 at 4:37

GoogleCodeExporter commented 9 years ago
I've just opened it for tracking following your request.

Will update it if/when I get the deadlock again.

Original comment by Stas.Os...@gmail.com on 13 Jun 2010 at 6:41

GoogleCodeExporter commented 9 years ago
Well then, I believe I fixed it :)

Original comment by art.cla...@gmail.com on 14 Jun 2010 at 1:53

GoogleCodeExporter commented 9 years ago
I wish it were so, but:

1) Please take a look at what the JVM dev had to say.
2) It usually happens once per week, so it's a bit early to say :).

Regards.

Original comment by Stas.Os...@gmail.com on 14 Jun 2010 at 2:09

GoogleCodeExporter commented 9 years ago
I saw that, but until you repro I'm leaving it closed.  :)

Original comment by art.cla...@gmail.com on 14 Jun 2010 at 2:36

GoogleCodeExporter commented 9 years ago
Issue reproduced again; GDB log attached.
It seems to have hung in around the same area.

Original comment by Stas.Os...@gmail.com on 18 Jun 2010 at 10:33

Attachments: