keeleysam / tenfourfox

Automatically exported from code.google.com/p/tenfourfox
0 stars 0 forks source link

Tracejit runs out of stack space #37

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
!!!!STOP!!!!  !!!!READ THIS FIRST!!!!

First, make sure you are using the most current version: visit
www.tenfourfox.com to verify.

During the public testing phase, only LIMITED bug reports will be accepted.
This is to reduce the worklist down to the most important items. IF YOU
DON'T READ AND FOLLOW THE BELOW INSTRUCTIONS (and remove this blurb from
your report), YOUR BUG REPORT WILL BE CLOSED!

While 10.4Fx is in source-parity stage (see the Wiki page on
SupportLevels), we build almost directly from Mozilla hg except for those
parts required for compatibility. That means it is imperative that you
verify THE BUG IS IN 10.4Fx SPECIFICALLY. It is quite possible that the bug
is in Firefox, and we don't track those bugs -- Mozilla does.

Except for STARTUP FAILURE, you MUST TEST your bug in Firefox 4 on one
of the tier 1 platforms (Windows XP+, OS X 10.5+ i386 or Linux). If your
bug occurs there too, DO NOT SUBMIT IT TO US: submit it to Mozilla. Bonus
points for testing on Mac specifically, although comparisons with non-Mac
tier 1 builds will be accepted as bug reports.

If your bug occurs ONLY on 10.4Fx, continue with this report.

STARTUP CRASHES are always accepted, especially until we get our own crash
report infrastructure running. Make sure you fill out all the fields below.

-- 8< CUT EVERYTHING UP TO AND INCLUDING THIS LINE FROM YOUR REPORT 8< --

* I read everything above and have demonstrated this bug only occurs on
10.4Fx by testing against this official version of Firefox 4 (not
applicable for startup failure) - specify:

* This is a startup crash or failure to start (Y/N): N

* What steps are necessary to reproduce the bug? These must be reasonably
reliable.

Go to http://www.macintoshgarden.org/apps/c
type in "German" into the search field, press "Enter"
a result appears, the beachball appears, the app crashes

* Describe your processor, computer, operating system and any special
things about your environment.

Powerbook G4 1,67 GHz, 1,5 GB RAM, 10.5.8,

* If this is a startup crash or failure to start, have you tried restarting
with a clean profile? Does this fix the problem?

* For crashes or startup failure (optional): paste in any information from
Crash Reporter, or if you are able to run 10.4Fx in gdb, paste in the
backtrace. You can often do it like this from within Terminal.app:

cd appname.app/Contents/MacOS
gdb firefox-bin
run
*crash the app
bt full

* Include any useful additional information, steps you have tried, etc.:

Thanks for your help!

Original issue reported on code.google.com by b.mar...@gmail.com on 13 Mar 2011 at 9:12

Attachments:

GoogleCodeExporter commented 9 years ago
*Please* follow the directions on bug reports -- it takes me longer because 
then I have to dust off the Intel Mac and try Firefox 4 on it. It is helpful to 
have the user compare them because you know your exact STRs and setup.

The dump indicates that this bug is possibly separate from the tracer (it 
crashes on G5 as well). It is masked with an add-blocker, so the Google ad 
seems at fault. Marking severity high as there is a functional workaround. 
Firefox 4b12 and 4rc1 on the C2D mini do not crash.

Original comment by classi...@floodgap.com on 13 Mar 2011 at 10:02

GoogleCodeExporter commented 9 years ago
Oddly, the crash signature in gdb does not match the reporter's -- it looks 
like a tracer crash. Debug build is in progress.

Original comment by classi...@floodgap.com on 13 Mar 2011 at 10:03

GoogleCodeExporter commented 9 years ago
The problem turned out to be a subtle one and may have been the real cause of 
issue 31. Adobe, for some reason, wrote the prologue patch entry to be 
somewhere actually within the prologue, which meant the registers got 
clobbered. Since most traces are non-reentrant, this didn't make much 
difference most of the time, but this trace gets called from another trace. 
When I moved the patch entry down past the actual prologue to the meat of the 
function (since this is a branch, not a function call), the provided example no 
longer crashes, and the patch for issue 31 is no longer needed either.

Original comment by classi...@floodgap.com on 14 Mar 2011 at 3:00

GoogleCodeExporter commented 9 years ago
changing summary for the record ...

Original comment by classi...@floodgap.com on 14 Mar 2011 at 3:01

GoogleCodeExporter commented 9 years ago
Issue 41 has been merged into this issue.

Original comment by classi...@floodgap.com on 22 Mar 2011 at 12:50

GoogleCodeExporter commented 9 years ago
This was not the solution, but it helped to demonstrate the problem. The 
problem is that we can't do JSOP_CALL correctly with the nanojit. This is also 
the same problem as issue 31.

Working patch is to back out the change with the patch entry and blacklist 
JSOP_CALL in the tracer. This works. I'm trying to find a better way.

Original comment by classi...@floodgap.com on 22 Mar 2011 at 12:52

GoogleCodeExporter commented 9 years ago
Issue 40 has been merged into this issue.

Original comment by classi...@floodgap.com on 22 Mar 2011 at 12:53

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 22 Mar 2011 at 12:53

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 22 Mar 2011 at 12:53

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 22 Mar 2011 at 12:54

GoogleCodeExporter commented 9 years ago
After some analysis of Tony's example in issue 41, the issue is more insidious: 
this actually represents a problem with the prologue setting up ANY call. Some 
native calls are not problematic, but some (such as getPropertyByIndex, etc.), 
are. These calls are now blacklisted in the tracer. We'll just have to root 
them out in 4.0 pending a proper fix in 5.0, which as the fallout from this bug 
shows, is too risky in a stable branch. Edwin suggested the problem might be 
insufficient dynamic stack space and that seems most logical given what the 
original repair fixed.

Original comment by classi...@floodgap.com on 22 Mar 2011 at 3:22

GoogleCodeExporter commented 9 years ago
The JSOP_CALL brake needs to be revised. Too many people are complaining it 
hits their numbers significantly worse than my internal measurements. I'm not 
sure how much is hyperbole, but some do have plausible stats, so we want this. 
Moving up to Critical.

Original comment by classi...@floodgap.com on 23 Mar 2011 at 2:01

GoogleCodeExporter commented 9 years ago
JSOP_CALL break moved down to the slow patch in functionCall. Much better! 
Shipped in 4.0s.

Original comment by classi...@floodgap.com on 25 Mar 2011 at 4:01

GoogleCodeExporter commented 9 years ago
Reduced to high after the fix to issue 23. We'll deal with this again in 5.

Original comment by classi...@floodgap.com on 13 Apr 2011 at 1:18

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 16 May 2011 at 12:59

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 16 May 2011 at 1:29

GoogleCodeExporter commented 9 years ago
Issue 62 has been merged into this issue.

Original comment by classi...@floodgap.com on 17 May 2011 at 4:38

GoogleCodeExporter commented 9 years ago
chtrusch provided the reliable test case in issue 62 that allowed us to stamp 
out this error once and for all. The problem is ulimit. You can fix the problem 
yourself by, in a shell (it has to be bash),

$ ulimit -s 65536
$ TenFourFoxXXX.app/Contents/MacOS/firefox-bin

Now chtrusch's Amazon test case no longer bombs. In fact, when I take out all 
of the trace blocks, we still don't crash.

We run out of stack with the default allocation of 8192. I want to build a new 
test release with ulimit hard-set with ld and that will be a new 4.0.2pre. In 
the meantime, see if this fixes it for people.

Original comment by classi...@floodgap.com on 17 May 2011 at 4:41

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 17 May 2011 at 4:41

GoogleCodeExporter commented 9 years ago
32MB stack (ulimit -s 32768) also seems okay. 16384 is not enough to run 
GameBoy (issue 31), so we will build with 32MB stack.

Original comment by classi...@floodgap.com on 17 May 2011 at 4:53

GoogleCodeExporter commented 9 years ago
Congratulations! I can't modify this myself because I'm completely illiterate 
in the terminal (I get "invalid argument" and "operation not permitted"), and I 
really don't want to mess around with sudo (not knowing what I'm actually 
doing), so I'll just wait for the next 4.0.2pre build. 

Original comment by chtru...@web.de on 17 May 2011 at 6:11

GoogleCodeExporter commented 9 years ago
Actually, if at all possible, please try it and see if it works. Step by step:

Start Terminal
In the terminal window, type

bash
ulimit -s 32768
/Applications/TenFourFox7450.app/Contents/MacOS/firefox-bin

The app will start in the background. Pull it foreground and try to crash it.

Original comment by classi...@floodgap.com on 17 May 2011 at 6:33

GoogleCodeExporter commented 9 years ago
And you should not need to sudo for any of this, btw. I didn't on my 10.4.11 
systems.

Original comment by classi...@floodgap.com on 17 May 2011 at 6:34

GoogleCodeExporter commented 9 years ago
I can confirm this fix on G3 and G4 7450. Thanks for teaching me something 
about the Terminal and for making me spend more money for useless things on 
Amazon.

Original comment by chtru...@web.de on 17 May 2011 at 7:05

GoogleCodeExporter commented 9 years ago
What are the implications for all of this? More stack space ->fewer blacklisted 
calls/blocked traces ->better JS performance?

Original comment by chtru...@web.de on 17 May 2011 at 7:31

GoogleCodeExporter commented 9 years ago
Yup. The SunSpider and Dromaeo numbers are unlikely to move much because this 
bug doesn't really affect them, but it will help edge cases and some complex 
code will improve. And I just noticed some stuff Amazon has on sale ...

I'm having a little trouble getting this to stick in the build system; libtool 
doesn't like the linker flags. It's just a matter of finding the magic words in 
mozconfig though.

Original comment by classi...@floodgap.com on 17 May 2011 at 12:38

GoogleCodeExporter commented 9 years ago
Looks like Mozilla had a similar issue with official binaries; their fix is 
descended from https://bugzilla.mozilla.org/show_bug.cgi?id=549143 . Let's try 
that.

Original comment by classi...@floodgap.com on 17 May 2011 at 5:36

GoogleCodeExporter commented 9 years ago
Finally got it to work for the js shell. And v8 really, really like this. On 
the G5 quad 2.5GHz in Highest performance,

% ../../../obj-ff-dbg/dist/TenFourFox.app/Contents/MacOS/js -j run.js # with 
this fix
Richards: 1748
DeltaBlue: 1293
Crypto: 915
RayTrace: 350
EarleyBoyer: 519
----
Score: 822
% /Applications/TenFourFoxG5.app/Contents/MacOS/js -j run.js # 4.0.2pre 
"original"
Richards: 1671
DeltaBlue: 487
Crypto: 909
RayTrace: 350
EarleyBoyer: 468
----
Score: 656

That's quite a jump. I ran it twice just to make sure. DeltaBlue in particular 
has a massive improvement. Now to get the main browser working.

Original comment by classi...@floodgap.com on 17 May 2011 at 8:31

GoogleCodeExporter commented 9 years ago
Browser is now working (both chrome and content even though I don't support 
chrome acceleration in 4). As one would expect, since V8 really benefits from 
this and V8 is part of Dromaeo, Dromaeo also shoots up -- to 110 runs/sec from 
93 runs/sec.

I'll run off revised builds tonight and tomorrow and hopefully have a new 
4.0.2pre out shortly.

I think we can consider this bug closed!!

Original comment by classi...@floodgap.com on 18 May 2011 at 12:55

GoogleCodeExporter commented 9 years ago
Excellent. I worked with ulimit -s 32768 yesterday for several hours and didn't 
encounter any problems.

Original comment by chtru...@web.de on 18 May 2011 at 10:45