infinity0 / mozilla-gnome-keyring-legacy

A firefox extension that enables Gnome Keyring integration (legacy version)
https://bugzilla.mozilla.org/show_bug.cgi?id=309807

xpcom_abi segfaults with Firefox >= 13 #12

Closed fat-lobyte closed 11 years ago

fat-lobyte commented 12 years ago

Hi, I was trying to build the extension on Ubuntu 12.04 with Firefox 13 from the firefox-next repository. The exact version is 13.0+build1-0ubuntu0.12.04.1

When the xpcom_abi executable is run, there is a segmentation fault and naturally the platform can't be determined:

$ ./xpcom_abi
Segmentation fault (core dumped)

This is the stacktrace from GDB:

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff6300c17 in PL_DHashTableOperate (table=0x62a3d8, 
    key=0x7ffff694fc80, op=PL_DHASH_LOOKUP)
    at /build/buildd/firefox-13.0+build1/build-tree/mozilla/obj-x86_64-linux-gnu/xpcom/build/pldhash.cpp:612
#2  0x00007ffff63260c6 in GetEntry (aKey=..., this=<optimized out>)
    at ../../dist/include/nsTHashtable.h:170
#3  nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Get (
    this=<optimized out>, aKey=...) at ../../dist/include/nsBaseHashtable.h:148
#4  0x00007ffff632631b in nsComponentManagerImpl::RegisterCIDEntry (
    this=0x62a370, aEntry=0x7ffff72f4f10, aModule=0x64d530)
    at /build/buildd/firefox-13.0+build1/build-tree/mozilla/xpcom/components/nsComponentManager.cpp:456
#5  0x00007ffff6327455 in nsComponentManagerImpl::RegisterModule (
    this=0x62a370, aModule=0x7ffff7122ed0, aFile=<optimized out>)
    at /build/buildd/firefox-13.0+build1/build-tree/mozilla/xpcom/components/nsComponentManager.cpp:430
#6  0x00007ffff6327f61 in nsComponentManagerImpl::Init (this=0x62a370)
    at /build/buildd/firefox-13.0+build1/build-tree/mozilla/xpcom/components/nsComponentManager.cpp:380
#7  0x00007ffff6303dd1 in NS_InitXPCOM2_P (result=0x7fffffffe2e0, 
    binDirectory=<optimized out>, appFileLocationProvider=0x0)
    at /build/buildd/firefox-13.0+build1/build-tree/mozilla/xpcom/build/nsXPComInit.cpp:490
#8  0x0000000000401161 in main (argc=1, argv=0x7fffffffe408)
    at xpcom_abi.cpp:24

I am running this in a virtual machine, but that shouldn't make any difference, should it? Also note that this Firefox version is not yet in the official Ubuntu repositories, but it will probably migrate soon.

Do you have any ideas about this?

infinity0 commented 12 years ago

are you sure you both built and ran against firefox 13? also does the extension itself work, when you override PLATFORM=Linux_x86_64-gcc3 manually?

fat-lobyte commented 12 years ago

are you sure you both built and ran against firefox 13?

Yes, both dpkg and the firefox about window show firefox 13.

also does the extension itself work, when you override PLATFORM=Linux_x86_64-gcc3 manually?

Yes, it works

infinity0 commented 12 years ago

OK, not too important then. :P I will take a look at some point but I have other things to do atm :p

BTW, when https://bugzilla.mozilla.org/728600 is fixed we won't need this hack at all - I suggest you CC yourself to it to maybe give it more visibility among devs.

infinity0 commented 12 years ago

alternatively, we might be able to do something stupid like this:

$ strings /usr/lib/xulrunner-10.0/libxul.so | grep gcc3
Linux_x86_64-gcc3
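
For readers who want to see what that pipeline does, here is a minimal C++ sketch of the same idea: collect runs of printable characters (roughly what `strings` does) and keep the ones containing a search term (the `grep` part). It assumes the platform token is stored in the library as plain ASCII. The function name `find_printable_matches` and the minimum run length of 4 are illustrative choices, not anything taken from the extension.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collect runs of printable ASCII at least min_len long and keep only
// those that contain `needle`.
std::vector<std::string> find_printable_matches(const std::string& data,
                                                const std::string& needle,
                                                std::size_t min_len = 4) {
    std::vector<std::string> hits;
    std::string run;
    // Ends the current run: store it if it qualifies, then start over.
    auto flush = [&]() {
        if (run.size() >= min_len && run.find(needle) != std::string::npos)
            hits.push_back(run);
        run.clear();
    };
    for (unsigned char c : data) {
        if (std::isprint(c)) run.push_back(static_cast<char>(c));
        else flush();
    }
    flush();  // the buffer may end in the middle of a run
    return hits;
}
```

Called on the bytes of libxul.so (read in binary mode) with the needle "gcc3", this should yield lines comparable to the shell output above, although that has not been checked here against a real library.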

fat-lobyte commented 12 years ago

alternatively, we might be able to do something stupid like this:

That's not very sexy, and we run into troubles the moment the location of libxul changes, which is every 6 weeks and between every distribution.

BTW, when https://bugzilla.mozilla.org/728600 is fixed we won't need this hack at all - I suggest you CC yourself to it to maybe give it more visibility among devs.

Yeah, Mozilla guys are _known_ for fixing user-invisible bugs in the codebase real fast. Well, let's hope for the best. Are you absolutely sure there's nothing wrong with the xpcom_abi code?

infinity0 commented 12 years ago

we run into troubles the moment the location of libxul changes,

we have already solved this problem

I haven't tested the xpcom_abi code against xulrunner 13 yet. But no, I don't know what's wrong with it, I put it together from a bunch of sources and don't fully understand how it works.

fat-lobyte commented 12 years ago

Fedora just got the update to Firefox 13, and I get the very same segfault.

Here's the stack trace, but it's pretty much the same.

(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff72df833 in PL_DHashTableOperate (table=0x62a298, key=0x7ffff763f630, op=PL_DHASH_LOOKUP) at /usr/src/debug/xulrunner-13.0/mozilla-release/objdir/xpcom/build/pldhash.cpp:612
#2  0x00007ffff7303eda in GetEntry (aKey=..., this=<optimized out>) at ../../dist/include/nsTHashtable.h:170
#3  nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Get (this=this@entry=0x62a298, aKey=...) at ../../dist/include/nsBaseHashtable.h:148
#4  0x00007ffff730421f in nsComponentManagerImpl::RegisterCIDEntry (this=this@entry=0x62a230, aEntry=aEntry@entry=0x7ffff7f57bb0, aModule=0x64d3f0)
    at /usr/src/debug/xulrunner-13.0/mozilla-release/xpcom/components/nsComponentManager.cpp:456
#5  0x00007ffff73051b8 in nsComponentManagerImpl::RegisterModule (this=this@entry=0x62a230, aModule=0x7ffff7d85a70, aFile=aFile@entry=0x0)
    at /usr/src/debug/xulrunner-13.0/mozilla-release/xpcom/components/nsComponentManager.cpp:430
#6  0x00007ffff7305c4d in nsComponentManagerImpl::Init (this=0x62a230) at /usr/src/debug/xulrunner-13.0/mozilla-release/xpcom/components/nsComponentManager.cpp:380
#7  0x00007ffff72e2c69 in NS_InitXPCOM2_P (result=0x7fffffffdf60, binDirectory=0x0, appFileLocationProvider=0x0) at /usr/src/debug/xulrunner-13.0/mozilla-release/xpcom/build/nsXPComInit.cpp:490
#8  0x0000000000401229 in main (argc=1, argv=0x7fffffffe088) at xpcom_abi.cpp:24

infinity0 commented 12 years ago

13.0-1 just hit debian experimental too. I will take a look when I get some time, maybe tomorrow.

fat-lobyte commented 12 years ago

FYI, the problem is that NS_InitXPCOM2 segfaults instead of just failing. This is a minimal example that reproduces the error:

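#include "nsXPCOM.h"        // declares NS_InitXPCOM2 and NS_ShutdownXPCOM
#include "nsIXULRuntime.h"

int main(int argc, char **argv) {
    nsresult rv;

    rv = NS_InitXPCOM2(nsnull, nsnull, nsnull); // <---- Segfault here
    if (NS_FAILED(rv)) return 1; // an nsresult does not fit in an exit status

    rv = NS_ShutdownXPCOM(nsnull);
    return NS_FAILED(rv) ? 1 : 0;
}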

Documentation says that the second parameter nsIFile* aBinDirectory is:

[in] The directory containing the component registry and runtime libraries. Pass null to specify that the current working directory should be used.

What exactly are "component registry and runtime libraries"? Maybe we don't have any such files and the treatment of this case has changed.

Could this be a xulrunner bug after all? Do you know by any chance: is this function still used? Or could it be deprecated?

fat-lobyte commented 12 years ago

is this function still used? Or could it be deprecated?

OK, with 74 occurrences in the Firefox release tarball, that's probably not the case.

infinity0 commented 12 years ago

I have got the source code and will have a play around with it, but I really hate debugging crap like this. The "strings grep gcc3" option is sounding better and better... I will ask around on mozilla IRC first though.

infinity0 commented 12 years ago

This issue appears to be caused by mozilla::HashBytes which is new in xulrunner 13.

$ LD_LIBRARY_PATH=/usr/lib/xulrunner-13.0 gdb ./xpcom_abi
GNU gdb (GDB) 7.4.1-debian
[...]
Reading symbols from /home/infinity0/org/package/mozilla-gnome-keyring/xpcom_abi...done.
(gdb) b nsComponentManagerImpl::RegisterCIDEntry
Function "nsComponentManagerImpl::RegisterCIDEntry" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (nsComponentManagerImpl::RegisterCIDEntry) pending.
(gdb) run
Starting program: /home/infinity0/org/package/mozilla-gnome-keyring/xpcom_abi 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffebc09700 (LWP 7137)]
[New Thread 0x7fffeb032700 (LWP 7138)]

Breakpoint 1, nsComponentManagerImpl::RegisterCIDEntry (this=this@entry=0x631cf0, aEntry=aEntry@entry=0x7ffff7d3d620, aModule=0x655290)
    at /tmp/buildd/iceweasel-13.0/xpcom/components/nsComponentManager.cpp:453
453 /tmp/buildd/iceweasel-13.0/xpcom/components/nsComponentManager.cpp: No such file or directory.
(gdb) p *aEntry->cid
$1 = {m0 = 2440519008, m1 = 54748, m2 = 4562, m3 = "\222\373\000\340\230\005W\017"}
(gdb) b mozilla::HashBytes
Breakpoint 2 at 0x7ffff68248d0
(gdb) continue
Continuing.

Breakpoint 2, 0x00007ffff68248d0 in mozilla::HashBytes () from /usr/lib/xulrunner-13.0/libxul.so

symbols not available for some reason, have to look at registers:

(gdb) info registers
rax            0x7ffff7d9c650   140737351632464
rbx            0x631d58 6495576
rcx            0x1  1
rdx            0x0  0
rsi            0x10 16
rdi            0x7ffff74558e0   140737341905120
rbp            0x0  0x0
rsp            0x7fffffffdfa8   0x7fffffffdfa8
r8             0x7ffff4c4cec8   140737299926728
r9             0x7ffff494c86a   140737296779370
r10            0x7ffff494c86a   140737296779370
r11            0x7ffff494c86a   140737296779370
r12            0x655290 6640272
r13            0x7ffff74558e0   140737341905120
r14            0x0  0
r15            0x0  0
rip            0x7ffff68248d0   0x7ffff68248d0 <_ZN7mozilla9HashBytesEPKvm@plt>
eflags         0x246    [ PF ZF IF ]
cs             0x33 51
ss             0x2b 43
ds             0x0  0
es             0x0  0
fs             0x0  0
gs             0x0  0

HashFunctions.h:346: HashBytes(const void* bytes, size_t length);

rdi is bytes, rsi is length

(gdb) x/16xb 0x7ffff74558e0
0x7ffff74558e0 <_ZL20kComponentManagerCID>: 0x60    0x5d    0x77    0x91    0xdc    0xd5    0xd2    0x11
0x7ffff74558e8 <_ZL20kComponentManagerCID+8>:   0x92    0xfb    0x00    0xe0    0x98    0x05    0x57    0x0f

This is the same as the CID from the first part.
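
The match can be checked by hand. The sketch below serializes the fields gdb printed for `*aEntry->cid` in little-endian order, as an x86-64 machine stores them, and compares the result with the sixteen bytes from the `x/16xb` dump. The struct name `Cid` and the helper names are hypothetical stand-ins for illustration; only the field names m0..m3 and the values come from the gdb output above.

```cpp
#include <array>
#include <cstdint>

// Field layout as printed by gdb for *aEntry->cid.
struct Cid {
    std::uint32_t m0;
    std::uint16_t m1;
    std::uint16_t m2;
    std::uint8_t  m3[8];
};

// Serialize the fields the way a little-endian CPU stores them in memory.
std::array<std::uint8_t, 16> little_endian_bytes(const Cid& c) {
    std::array<std::uint8_t, 16> out{};
    for (int i = 0; i < 4; ++i) out[i] = static_cast<std::uint8_t>(c.m0 >> (8 * i));
    for (int i = 0; i < 2; ++i) out[4 + i] = static_cast<std::uint8_t>(c.m1 >> (8 * i));
    for (int i = 0; i < 2; ++i) out[6 + i] = static_cast<std::uint8_t>(c.m2 >> (8 * i));
    for (int i = 0; i < 8; ++i) out[8 + i] = c.m3[i];
    return out;
}

// Values from `p *aEntry->cid`; m3 keeps gdb's octal escapes.
const Cid kPrintedCid = {2440519008u, 54748, 4562,
                         {0222, 0373, 0000, 0340, 0230, 0005, 'W', 0017}};

// Bytes from `x/16xb 0x7ffff74558e0`.
const std::array<std::uint8_t, 16> kDumpedBytes = {
    0x60, 0x5d, 0x77, 0x91, 0xdc, 0xd5, 0xd2, 0x11,
    0x92, 0xfb, 0x00, 0xe0, 0x98, 0x05, 0x57, 0x0f};

// True when the printed struct and the memory dump describe the same CID.
bool cid_matches_dump() {
    return little_endian_bytes(kPrintedCid) == kDumpedBytes;
}
```

For example, m0 = 2440519008 is 0x91775d60, which is stored as the bytes 60 5d 77 91, the first four bytes of the dump.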

Let's test out this theory by making a call to HashBytes before anything else:

(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) b main
Breakpoint 3 at 0x401287: file xpcom_abi.cpp, line 23.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/infinity0/org/package/mozilla-gnome-keyring/xpcom_abi 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 3, main (argc=1, argv=0x7fffffffe308) at xpcom_abi.cpp:23
23      nsCOMPtr<nsIServiceManager> servMan;
(gdb) p 'mozilla::HashBytes'("\x60\x5d\x77\x91\xdc\xd5\xd2\x11\x92\xfb\x00\xe0\x98\x05\x57\x0f", 16)

Breakpoint 2, 0x00007ffff68248d0 in mozilla::HashBytes () from /usr/lib/xulrunner-13.0/libxul.so
The program being debugged stopped while in a function called from GDB.
Evaluation of the expression containing the function
(_ZN7mozilla9HashBytesEPKvm@plt) will be abandoned.
When the function is done executing, GDB will silently stop.
(gdb) continue
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()

I'll go file a bug report later, going out now.

infinity0 commented 12 years ago

in the meantime feel free to try to debug http://hg.mozilla.org/mozilla-central/annotate/61447dccb529/mfbt/HashFunctions.cpp to see why it's causing a segfault with that input.

infinity0 commented 12 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=763327

fat-lobyte commented 12 years ago

Awesome! Thanks for your work. Copying HashFunctions.cpp to the source tree and linking it to xpcom_abi makes stuff work.

This is a good enough workaround for me, I think I'll ship it in the Ubuntu build until they fix the bug in xulrunner.
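
One plausible reading of why linking HashFunctions.cpp helps, offered as an assumption rather than something confirmed in this thread: the traces show a jump to address 0 right after the HashBytes PLT stub, which is what a call through a symbol that nobody defined can look like. The sketch below shows that effect with a weak declaration. The symbol `hypothetical_hash_bytes` and both helper functions are made up for illustration, and `__attribute__((weak))` is a GCC/Clang extension.

```cpp
#include <cstddef>

// Hypothetical symbol standing in for a function that some other library
// is expected to provide. Nothing in this program defines it.
extern "C" unsigned hypothetical_hash_bytes(const void* bytes, std::size_t length)
    __attribute__((weak));

// True only if a linked object or library actually defines the symbol.
bool hash_symbol_resolved() {
    return hypothetical_hash_bytes != nullptr;
}

// Calls the function only when it resolved. An unguarded call through an
// unresolved weak symbol would jump to address 0 and crash.
unsigned guarded_hash(const void* bytes, std::size_t length, unsigned fallback) {
    return hash_symbol_resolved() ? hypothetical_hash_bytes(bytes, length) : fallback;
}
```

Linking in an object file that defines the symbol makes `hash_symbol_resolved()` return true. Under the assumption above, adding HashFunctions.cpp to the xpcom_abi build has the analogous effect for mozilla::HashBytes.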

fat-lobyte commented 12 years ago

Bug still present in Firefox 14.

fat-lobyte commented 12 years ago

Although this is a Firefox bug, maybe we should add HashFunctions.cpp to the repository. What do you think? Would that be a good idea?

infinity0 commented 12 years ago

I'm undecided... it would be simple, but this is clearly mozilla's fault, we shouldn't have to do anything. :/

It might be simple to fix the bug in mozilla itself actually, I could take a look some time.

infinity0 commented 11 years ago

Fixed in d6c9ee941c988f0962e9af7150b6037a1e03aec1 and d87f35427349808c8a2f03a2005e2bc9838894e6.

megaloni commented 11 years ago

Just in case this is still useful to anyone, adding the following to my linker command fixed the issue for me:

-Wl,--whole-archive -lmozglue -lmemory

fat-lobyte commented 11 years ago

Hi! Yes, this is useful, thank you. Although I had to modify your flags to:

-Wl,-whole-archive -lmozglue -Wl,-no-whole-archive