keeleysam / tenfourfox

Automatically exported from code.google.com/p/tenfourfox

Compiler performance changes for 13 #153

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The attached patch builds (nearly) all sources as code suitable for an 
application, including XUL, which is a dynamic library (but is linked 
position-dependent anyway). Since XUL is used only by firefox, and firefox 
doesn't allow multiple instances of itself to run, that shouldn't do any harm.

So that should speed up things quite a bit.

I already had firefox 14.0 (unoptimized build) running well with it.

Original issue reported on code.google.com by Tobias.N...@gmail.com on 19 May 2012 at 11:10

GoogleCodeExporter commented 9 years ago
Looks straightforward. I'll work it into 13 final.

Original comment by classi...@floodgap.com on 19 May 2012 at 4:52

GoogleCodeExporter commented 9 years ago
Aurora 14.0a2 is up and running, built with "-mdynamic-no-pic". At optimization 
level 3 and without any CPU-model-specific tuning (meaning G3), it runs the 
sunspider test on my G4 7450 at least as fast as TFF 12.
My impression is that it launches a bit faster; session restoring, at least, 
is faster.

Original comment by Tobias.N...@gmail.com on 19 May 2012 at 8:46

GoogleCodeExporter commented 9 years ago
I'm also going to add -minsert-sched-nops=regroup_exact to the G5 mozconfig 
because it looks like -mtune=G5 doesn't enable this and G5 really needs it. 
I'll probably port this particular change to stable if it is successful since 
it is pretty benign and doesn't change anything other than code size, but we 
need to watch the linker.

Original comment by classi...@floodgap.com on 19 May 2012 at 10:18

GoogleCodeExporter commented 9 years ago
(ref.: http://gcc.gnu.org/ml/gcc/2003-11/msg01302.html )

Original comment by classi...@floodgap.com on 19 May 2012 at 10:22

GoogleCodeExporter commented 9 years ago
Tobias, your initial results are no fluke. I got a 305 in Peacekeeper (PK) 
when I wasn't expecting it to pass. Can you confirm whether it's gcc 4.6 or 
4.7? O3 seems to make a substantial difference. Bravo!

Original comment by spm...@gmail.com on 19 May 2012 at 10:37

GoogleCodeExporter commented 9 years ago
Also, how can it be G3-optimized on a non-G3 platform (10.5)?

Original comment by spm...@gmail.com on 20 May 2012 at 12:23

GoogleCodeExporter commented 9 years ago
-O3 != -G3

Original comment by classi...@floodgap.com on 20 May 2012 at 3:32

GoogleCodeExporter commented 9 years ago
Although this isn't the topic of this issue:
I built with the standard 10.5 toolchain, using gcc 4.2 (TFF is built using 
4.0). O3 is the same optimization level used in TFF. "dynamic-no-pic" should 
bring some performance improvement. Not specifying a CPU to optimize for makes 
gcc generate code optimized for the G3 (which is also nearly identical to the 
G4 7400, because they share a very similar core).

Original comment by Tobias.N...@gmail.com on 20 May 2012 at 7:23

GoogleCodeExporter commented 9 years ago
-minsert-sched-nops=regroup_exact worsened benchmarks. I did some more digging 
and it looks like -mtune does set it *and* uses a different value. So we're not 
going to use that. I'll test -mdynamic-no-pic next, but I'm going to put 
-mdynamic-no-pic -read_only_relocs suppress into the mozconfigs instead.
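
For illustration only, a mozconfig fragment along the following lines would apply the two flags named above. This is a sketch, not the change that was actually committed: the variable name `TFF_PERF_FLAGS` is hypothetical, and passing the flags through exported `CFLAGS`/`CXXFLAGS`/`LDFLAGS` is an assumption about how the mozconfigs are wired.

```shell
# Hypothetical mozconfig fragment (a mozconfig is sourced as a shell script).
# Assumption: flags are passed via exported CFLAGS/CXXFLAGS/LDFLAGS; the
# actual TenFourFox mozconfigs may set them through a different mechanism.

# Darwin GCC option: emit non-relocatable (position-dependent) code while
# keeping external references relocatable.
TFF_PERF_FLAGS="-mdynamic-no-pic"

# Append to any flags already set, keeping existing values intact.
export CFLAGS="${CFLAGS:+$CFLAGS }$TFF_PERF_FLAGS"
export CXXFLAGS="${CXXFLAGS:+$CXXFLAGS }$TFF_PERF_FLAGS"

# Darwin linker option: do not report relocation entries found in read-only
# sections, which can appear when dynamic libraries are built this way.
export LDFLAGS="${LDFLAGS:+$LDFLAGS }-read_only_relocs suppress"
```

Setting the flags once in the mozconfig keeps the change out of configure, at the cost of applying it to every target, dynamic libraries included.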

Original comment by classi...@floodgap.com on 20 May 2012 at 1:59

GoogleCodeExporter commented 9 years ago
(ref.: http://gcc.gnu.org/ml/gcc/2003-11/msg01308.html )

Original comment by classi...@floodgap.com on 20 May 2012 at 1:59

GoogleCodeExporter commented 9 years ago
I don't know if adding dynamic-no-pic to the mozconfig will work for all of the 
dynamic libraries that are built. If you could build all of firefox as one 
static application (which isn't possible anymore) it would certainly work.

Original comment by Tobias.N...@gmail.com on 20 May 2012 at 5:16

GoogleCodeExporter commented 9 years ago
The reason I did it that way is that -mdynamic-no-pic will automatically 
reverse any -fPIC (in fact, you get a warning that -mdynamic-no-pic suppresses 
-fPIC if Mozilla specifies it), and it reduces the number of changes we have to 
make to configure.

I'm using such a build now and the G5 definitely has better branch prediction 
since everything is absolute. It is noticeably snappier and showed a marginal 
improvement on Peacekeeper, so we'll ship this for 13 final.

Original comment by classi...@floodgap.com on 21 May 2012 at 1:19

GoogleCodeExporter commented 9 years ago
Looking forward to benchmarking 13 with the new patch. On the subject of 
compiler changes, perhaps in Q1 all three channels could make a unified push to 
4.6 or 4.7?

Original comment by spm...@gmail.com on 25 May 2012 at 8:55

GoogleCodeExporter commented 9 years ago
You are unlikely to notice big benchmark changes because the benchmarks are 
disproportionately based on JavaScript, and our JavaScript is run by the JIT, 
not by compiler-generated code. This change only affects what the compiler 
generates. The change is roughly 3-4%, which is below the noise threshold for 
Peacekeeper.

The difference will show up in overall browser operation, because there is 
less call indirection.
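
The point about a 3-4% change sitting below benchmark noise can be checked with a small helper like the sketch below. The function names, the sample scores, and the 5% noise figure are all hypothetical, chosen only to show the arithmetic; the thread does not state Peacekeeper's actual noise threshold.

```shell
# Percent change from an old benchmark score to a new one, one decimal place.
percent_change() {
  awk -v old="$1" -v new="$2" \
    'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# Succeeds (exit status 0) only if the absolute percent change is larger
# than the given noise threshold, itself expressed in percent.
exceeds_noise() {
  awk -v old="$1" -v new="$2" -v noise="$3" \
    'BEGIN { d = (new - old) / old * 100; if (d < 0) d = -d; exit !(d > noise) }'
}

# Hypothetical scores: 200 before, 207 after, with an assumed 5% noise band.
percent_change 200 207
if exceeds_noise 200 207 5; then
  echo "change is larger than run-to-run noise"
else
  echo "change is within run-to-run noise"
fi
```

With a change that small, repeated runs would be needed before attributing any difference in score to the compiler flags.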

Marking Verified since we are shipping in 13; file followup issues or 
enhancements in new issues.

Original comment by classi...@floodgap.com on 30 May 2012 at 11:54