Closed GoogleCodeExporter closed 9 years ago
Actually, critical; we will ship the test beta with these.
Original comment by classi...@floodgap.com
on 5 Apr 2011 at 10:45
You also may want to try GCC 4.6 - it has significant improvements in
compatibility with Mac OS frameworks. Though it does not support Tiger.
GCC 4.2 could also be replaced with llvm-gcc 2.9 - it outperforms GCC 4.2 in
benchmarks.
Original comment by annu...@gmail.com
on 11 Apr 2011 at 10:06
4.6 is out precisely because it doesn't support Tiger. I use Tiger myself, so
that's pretty much the end of that.
Original comment by classi...@floodgap.com
on 11 Apr 2011 at 12:30
Between issue 23, issue 28 and issue 50, I think we have a little too much to
test in the next beta and this could introduce subtle bugs, so I'm going to
defer this to the one after and mark High.
Original comment by classi...@floodgap.com
on 11 Apr 2011 at 1:23
implemented for 4.0.2 in 7400, 7450 and G5
Original comment by classi...@floodgap.com
on 12 May 2011 at 5:14
First build with G5 crashed immediately. Dropped -ftree-vectorize, replaced
with -fivopts.
Original comment by classi...@floodgap.com
on 12 May 2011 at 12:59
Absolutely no change in benchmarks. Because this is a potentially risky change
with little gain, we're deferring until we have to use 4.2 and then we can see
if -ftree-vectorize works again.
Original comment by classi...@floodgap.com
on 12 May 2011 at 1:23
I researched a little bit more about the altivec related compiler flags and
found out the following:
Specifying an Altivec-capable CPU type does not (any longer?) automatically
enable "-faltivec" (which is more or less equivalent to "-mpim-altivec") but it
does just enable "-maltivec". "-faltivec" allows for access to Altivec
functions without the need of a special header file while with "-maltivec" an
Altivec header file is needed. "-faltivec" seems to be the old-fashioned
version and Carbon (which comes from classic Mac OS) needs that type. So one
always needs to pass "-faltivec" to gcc when using Carbon API calls. In any
other case access to Altivec should be possible without specifying "-faltivec".
But there are some other implications: when specifying a CPU type of ppc7400
together with "-faltivec" the CPU type is reverted to just ppc. That is not the
case for cpu types of ppc7450 or ppc970.
That behavior seems to have changed from Tiger gcc-4.0 (Apple build 5370) to
the more recent versions that come with Xcode 3.
Original comment by Tobias.N...@gmail.com
on 1 Jun 2011 at 10:09
> when specifying a CPU type of ppc7400 together with "-faltivec" the CPU type
is reverted to just ppc.
How did you notice this?
This sounds like a bit of a minefield, so maybe we should (while gcc-4.0 is the
officially supported compiler) leave the mozconfigs alone except for
--enable-*strip, and make a note in the build instructions.
Original comment by classi...@floodgap.com
on 1 Jun 2011 at 12:49
I changed the gcc parameters and looked at the architecture of the output file;
actually I should have said "architecture" instead of "cpu type". From the
manpage of gcc I get the impression that for gcc the only difference between
ppc750 and ppc7400 is the switch "-maltivec". Code compiled with that switch
enabled obviously can't run on G3 CPUs anymore. Hence it gets the architecture
"ppc7400". That's different with "-faltivec" where gcc isolates all functions
that access the vector unit so that the code itself can decide whether to use
an Altivec-optimized code path or not. Hence that code gets the architecture
"ppc" (but the code in that case can try to access the vector unit on a G3 CPU
which will lead to an exception). For ppc7450 or ppc970 that's different because
there are other optimizations apart from "-maltivec" switched on that only work
with these respective CPU types - while a G5 CPU should always be able to
execute code of the architecture ppc7450 as well. So switching on "-faltivec"
together with cpu type of ppc7450 or ppc970 doesn't make the code runnable on
G3 CPUs as well. Hence the compiled code still gets the architecture ppc7450 or
ppc970.
About supported compilers:
Is there really still any "support" for gcc-4.0 build 5370?
For me support would be providing later gcc versions built for 10.4 that
include bug fixes.
Apple doesn't provide any support any more for gcc-4.0 build 5370 in Tiger.
It's the same way unsupported in Tiger as later versions of gcc-4.0, gcc-4.2,
llvm-gcc-4.2 or clang.
Support for any compilers in Tiger is in fact up to us. There's almost no one
else any more - except for Gentoo Prefix. They in fact still do support
building of all those compilers, ld64 and cctools in Tiger.
Original comment by Tobias.N...@gmail.com
on 1 Jun 2011 at 6:21
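One way to repeat the check described above is to ask lipo for the
architecture of a compiled object file. The following is only a sketch: it
assumes Apple's lipo is on the PATH, the helper names parse_lipo_info and
arch_of are made up for illustration, and the sample line merely imitates the
format lipo prints for a thin (non-fat) file.

```shell
#!/bin/sh
# Extract the architecture name(s) from one line of `lipo -info` output.
# (parse_lipo_info is a hypothetical helper, not part of any toolchain.)
parse_lipo_info() {
    # Keep only what follows the last ": " on the line.
    printf '%s\n' "${1##*: }"
}

# Report the architecture of an object file. Needs Apple's lipo, so it is
# only defined here and not called.
arch_of() {
    parse_lipo_info "$(lipo -info "$1")"
}

# Illustrative input in the shape lipo uses for a thin file:
parse_lipo_info "Non-fat file: test.o is architecture: ppc7400"   # prints: ppc7400
```

On a Tiger machine one would then compare arch_of for the same source built
with and without "-faltivec". The compiler invocations are left out because the
exact CPU-type switches used in the thread aren't shown.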
Not supported by Apple -- supported by us. Since gcc-4.0(5370) is the compiler
that comes with Xcode 2.5, that's the compiler I want to maintain support for
out of the box. I'd like to support the linker, too, but since that won't work
we'll support "yours" -- but that means we also have to distribute it, etc. I'd
like to keep as many tools "standard" as possible until they don't work or
can't be made to work. There are some compiler bugs in 5370, but the patches
work around them.
For that reason I'm going to keep the mozconfigs otherwise the same (I did make
a tweak to DEBUG which will come with the beta changesets) except for the strip
options. I'm using one such strip build now on the G5 and 50.9MB is quite
acceptable for size. When gcc-4.2 becomes our minimum required compiler, and I
imagine it will sooner or later, then we'll revise our command line options.
Obviously gcc-4.2 will remain an option until then, but it will be experimental
for now, and the required changes will be in the build notes rather than in the
configs.
Original comment by classi...@floodgap.com
on 2 Jun 2011 at 6:00
(changing status back to accepted since we're not moving on this yet)
Original comment by classi...@floodgap.com
on 2 Jun 2011 at 6:05
Original comment by classi...@floodgap.com
on 3 Jun 2011 at 10:27
Did a build with llvm-gcc-4.2 based on llvm 2.9.
The build mostly worked as well as with gcc, but a few files couldn't be
compiled with debug symbol generation ("-g") turned on. I compiled those files
manually after deleting "-g" from the command line.
Why do we build even the optimized builds with debug symbols?
Shouldn't it only be activated when using the method described on the following
web page? https://developer.mozilla.org/en/Building_Firefox_with_Debug_Symbols
With the resulting Aurora.app I ran dromaeo and it gave 39 runs/s on the G4 1,5
GHz vs. 53 runs/s with the same application compiled with gcc.
Did you notice that libffi isn't compiled with the compiler specified by
"CC"/"CXX" but with just "/usr/bin/gcc"?
Original comment by Tobias.N...@gmail.com
on 5 Jun 2011 at 8:38
We build with debug symbols on even in opt builds because we don't have a crash
reporter system, and this gives me a chance to get debug information off other
people's systems.
Nice spot on the libffi, although I imagine that's a Mozilla bug.
That's rather disappointing that llvm does significantly worse, but I bet
people really aren't working on PPC optimization in it. IBM does and did donate
quite a bit of code to gcc.
Original comment by classi...@floodgap.com
on 5 Jun 2011 at 9:18
A question: on the topic of GCC 4.2. For those who have a Power Mac G5 with the
dual-core PowerPC 970FX (G5) cpu, are you:
1) auto-parallelizing (-ftree-parallelize-loops)?
2) using all cores (-fopenmp or -openmp)?
Just curious.
Original comment by jorgequi...@yahoo.com
on 11 Jun 2011 at 12:08
Those are certainly ones to consider when we have to make the jump, but I have
not tried them personally. Perhaps Tobias has.
Original comment by classi...@floodgap.com
on 11 Jun 2011 at 3:04
As I don't have any multi-CPU Mac I can't test a build with those options.
But I could very easily provide the tarball containing the gcc-4.2 installation
I built.
As I built it the Apple way it doesn't overwrite any files of the already
installed compilers; Apple modifies the install locations of gcc components to
make coexistence of various compiler versions possible.
So there shouldn't be any risk in installing that tarball.
Original comment by Tobias.N...@gmail.com
on 11 Jun 2011 at 8:00
I recently built gcc-4.6 on my 10.4 installation. There weren't any problems
during compilation and linking!
Building pure C applications works without problems.
In order to get C++ applications to build against the (quite old) version of
libstdc++ I needed to modify some system headers, but gcc provides an easy
method to do so without touching the original headers (fixincludes).
Compiling Objective-C/C++ also seems to work, which is needed to interface with
the Mac OS system frameworks. Compatibility with Mac OS frameworks is a new feature
in gcc-4.6 and is officially supported only for 10.5 and 10.6.
So finally I could get XUL compiled - but linking failed on the first attempt.
There is a linker (ld64 97.17) warning for which there is already a proposed
gcc fix available that I expect to appear in a future release of gcc. I applied
that patch to my gcc sources and built it again, but haven't yet had the time
to rebuild the whole of XUL. Just recompiling the files the linker warns about
didn't help.
I also noticed that in gcc-4.6 "-ftree-vectorize" is automatically switched on
by "-O3" so that's expected to work in that version.
Original comment by Tobias.N...@gmail.com
on 11 Jun 2011 at 8:16
Managed to build XUL with gcc-4.6 - and it seems to work!
Found out that "ftree-vectorize" has to be turned off at least for (some?)
ObjC++ files (maybe also ObjC files). Turning on "ftree-vectorize" for those
files results in linker errors. So for gcc-4.6 one has to downgrade
optimization level to 2. One can still manually turn on the other options O3
turns on.
For my first working XUL I compiled all ObjC++ files with O2 and all options
that O3 turns on except for ftree-vectorize. All other files were built with
O3. A first test of JavaScript performance gave no improvements over a gcc-4.2
compiled XUL on my 500 MHz G4 7400.
Original comment by Tobias.N...@gmail.com
on 11 Jun 2011 at 7:38
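The "O2 plus everything O3 adds except ftree-vectorize" setup described above
can be sketched as a small flag builder. The list of flags is what the GCC 4.6
manual gives for -O3 as far as I recall it, so treat it as an assumption and
check it against your compiler's documentation; o3_without and the variable
names are invented for this example.

```shell
#!/bin/sh
# Flags that -O3 adds on top of -O2 in GCC 4.6 (assumed from the GCC manual;
# verify for the compiler actually in use).
O3_EXTRAS="-finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone"

# Print "-O2" followed by every -O3 extra except the one given as $1.
o3_without() {
    excluded=$1
    out="-O2"
    for flag in $O3_EXTRAS; do
        [ "$flag" = "$excluded" ] || out="$out $flag"
    done
    printf '%s\n' "$out"
}

# ObjC++ files get the reduced set; everything else stays at plain -O3.
OBJCXX_OPT=$(o3_without -ftree-vectorize)
OTHER_OPT="-O3"
echo "ObjC++ files: $OBJCXX_OPT"
echo "other files:  $OTHER_OPT"
```

How these two flag sets get attached to the right files depends on the build
system and isn't shown here.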
Nice! I'm not too worried about there being no performance improvement; I just
didn't want there to be a significant *drop*. So it's good to know that the
option is there if Mozilla stops supporting 4.0/4.2.
Original comment by classi...@floodgap.com
on 11 Jun 2011 at 9:25
Thanks for the feedback. When you finally decide to use GCC 4.2 or 4.6, it
would be interesting to see if taking advantage of parallel-processing compiler
options makes any difference in performance. I've read that devices that
have dual-core A9 ARM chips (like Tegra 2) and use Android OS (2.3 or higher)
can do this: they use one core for the browser and the other core for Flash,
for example. I wonder if this can be achieved on TenFourFox. It probably is
more complicated in reality. Autovectorization is really only part of the
equation: is the compiler you currently use making use of all VMX/Altivec cores
on the multicore/multiprocessor G5 Macs without the use of OpenMP or the
autoparallelization flags? This would allow you to squeeze out the very last
drop of performance.
Original comment by jorgequi...@yahoo.com
on 11 Jun 2011 at 11:07
Well THAT's impressive!
I compiled the whole of TenFourFox 5 (based on that first preliminary patchset)
with gcc-4.6. For JavaScript and NSPR I set the optimization level to O3 (that
means "tree-vectorize" on) and the rest was built with O2 and every option O3
includes except for "tree-vectorize". I chose JavaScript (including libffi) and
NSPR for autovectorization because they are configured independently of the
rest.
And JavaScript performance (dromaeo) on the G4 1,5 GHz improved to 59,5 runs/s.
That's 10% faster! And the build was optimized for G4 7400 and not 7450, which
could result in additional improvements.
Maybe autovectorization for just NSPR and JavaScript also works with gcc-4.0 ?!
Additional improvements could be achieved by using LTO, but I read that
linking XUL with LTO switched on needs 8GB of RAM and my best machine has just
1.25GB.
Original comment by Tobias.N...@gmail.com
on 12 Jun 2011 at 10:46
Nice work. It might indeed. I'll schedule that for 6, though, since we're too
far into 5 (I want to do only one more beta cycle, and issue 68 is higher
priority).
Original comment by classi...@floodgap.com
on 12 Jun 2011 at 11:00
Did one final try adding link time optimization to the gcc flags for NSPR and
JavaScript.
JavaScript performance in dromaeo however wasn't affected.
So now after all my tests of different toolchain versions I ended up having
tarballs of gcc-4.0.1, gcc-4.2.1, gcc-4.6.0, llvm-gcc-4.2 (LLVM 2.9),
clang-2.9, ld64-97.17, cctools-800, cctools-782 (needed for compatibility with
LTO in gcc-4.6) ready for installation on stock Mac OS X 10.4.
I wonder whether I could/should provide those packages for download in some
place...
Original comment by to...@jesus.de
on 13 Jun 2011 at 7:51
Oh that's me, Tobias, being logged in with another account.
Original comment by to...@jesus.de
on 13 Jun 2011 at 7:52
LTO is pretty expensive anyway. The autovectorization should be the biggest
win. As I say, I'm dealing with some subtleties in 5's JS I don't want to
overcomplicate, but this seems a no-brainer for 6. I'll spin that off as a
separate issue when 5 goes out.
Original comment by classi...@floodgap.com
on 13 Jun 2011 at 7:58
All compilers and binary tool packages I built for 10.4 are now available as
tarballs from the Google code project "tiger-toolchains".
Original comment by Tobias.N...@gmail.com
on 17 Jun 2011 at 5:45
Nice! However, Google might want you to put some token source up, even if it's
just your patches ...
Original comment by classi...@floodgap.com
on 17 Jun 2011 at 10:57
Well, I'll wait until they ask me to do so ;-).
I tried autovectorization with TFF 5.0 and gcc-4.0.1 and found out some things.
The GCC manpage says that in the Apple versions tree-vectorize forces
strict-aliasing.
There is however mozilla bug 414641 concerning strict-aliasing which isn't
closed yet.
Because of that behaviour we cannot compile NSPR with tree-vectorize: NSPR has
issues with strict-aliasing - at least turning on tree-vectorize produces
warnings saying that the strict-aliasing rules are being broken. I got the
same messages for the cairo, angle and growl modules.
But for JavaScript it seems to be fine - at least there are no warnings.
However the build isn't finished yet; I'll report if it's basically working.
Original comment by Tobias.N...@gmail.com
on 18 Jun 2011 at 7:02
I built TFF 5.0 with auto vectorization on for just JavaScript, built with
gcc-4.0.1.
The results in dromaeo recommended benchmarks:
TFF 5.0 7450 (gcc-4.0.1): 55,66 runs/s
TFF 5.0 7450 (gcc-4.0.1, auto vectorization for JS): 56,92 runs/s
Aurora 5.0 7400 (gcc-4.6, auto vectorization for JS and NSPR): 59,50 runs/s
Not sure if those results are reproducible. But to me it seems for gcc-4.0.1
turning on auto vectorization isn't worth the trouble.
Original comment by Tobias.N...@gmail.com
on 18 Jun 2011 at 5:52
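To put the dromaeo figures above in proportion, the relative gain over the
plain gcc-4.0.1 build can be worked out as below. This is just arithmetic on
the numbers quoted in the comment (decimal commas written as points); pct_gain
is a helper name made up for this sketch. The Aurora build targets a different
CPU type and codebase, so that comparison is only indicative.

```shell
#!/bin/sh
# Percentage gain of a new runs/s figure over a baseline, to one decimal.
# (pct_gain is a hypothetical helper for illustration.)
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new / base - 1) * 100 }'
}

# gcc-4.0.1 with auto vectorization for JS vs. plain gcc-4.0.1
pct_gain 55.66 56.92   # prints: 2.3

# gcc-4.6 with auto vectorization for JS and NSPR vs. plain gcc-4.0.1
pct_gain 55.66 59.50   # prints: 6.9
```

A gain of about 2% is within the range where run-to-run variation could matter,
which fits the doubt expressed above about reproducibility.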
No, I agree, that doesn't seem worth the risk. I am planning to make some
headway on methodjit for Fx7 anyway. After that, it might be worth switching
compilers to get this bonus.
However, is it truly the autovectorization that makes the run faster, or is it
that 4.6 makes better code?
Original comment by classi...@floodgap.com
on 18 Jun 2011 at 10:01
Without autovectorization I got 55-56 runs/s on the same codebase compiled with
gcc-4.2 as well as with 4.6 and now with 4.0.1.
Switching on the autovectorization gave the improvement, see Comments 20 and
23. However, there seem to have been substantial changes regarding optimization
in the more recent versions of gcc. And the Mac OS X support in gcc-4.6 seems
to be nearly mature already (although it's the first version to support it) -
at least I could build a running TenFourFox/Aurora 5 with it and it performed
quite well. The main difficulty was that auto vectorization causes linker (!)
failures for Objective-C/C++-code.
Original comment by Tobias.N...@gmail.com
on 18 Jun 2011 at 10:55
Some speed testing of TFF7 on a PowerBook G4 1,5 GHz running 10.4.11 set to
highest performance:
TFF7.0b1 7450 (official build) : 50,3 runs/s
TFF7.0b 7400 (based on Fx7.0b4): 57,3 runs/s
The custom one was compiled with gcc-4.6.1 whose "-O3" (which is what is used
for most of the code) includes auto vectorization.
I attached a patch that contains the necessary changes to build with gcc 4.6.
It removes the unneeded "-fpascal-strings" flag, replaces the Apple-specific
"-Oz" with "-Os", replaces the Apple-specific "-Wmost" with
"-Wall -Wno-parentheses" and adds "-mno-altivec -mabi=no-altivec" to the
compiler flags for Objective-C++ sources to work around a problem related to
auto-vectorization.
Additionally one needs to add "-flax-vector-conversions" to both compiler
commands and "-fpermissive" to the CXX command in the mozconfig being used, as
well as delete the "-arch ..." option, which doesn't exist in non-Apple gcc. Of
course the compiler commands have to point to the gcc 4.6 and g++ 4.6
executables respectively.
Original comment by Tobias.N...@gmail.com
on 8 Sep 2011 at 8:37
Attachments:
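The flag substitutions listed in the comment above can be illustrated with a
small filter. This is not the attached patch (which edits the build files
directly); it is only a sketch of the same mapping applied to a command line,
and translate_flags is a name invented here. It covers the removals and
replacements only, not the extra flags that have to be added.

```shell
#!/bin/sh
# Rewrite Apple-gcc-only flags for a non-Apple gcc 4.6, following the mapping
# described in the comment above. (translate_flags is a hypothetical helper.)
translate_flags() {
    out=""
    skip_next=0
    for flag in "$@"; do
        if [ "$skip_next" -eq 1 ]; then
            # This is the argument of a preceding "-arch": drop it too.
            skip_next=0
            continue
        fi
        case "$flag" in
            -fpascal-strings) continue ;;                  # unneeded, removed
            -arch)            skip_next=1; continue ;;     # not in non-Apple gcc
            -Oz)              flag="-Os" ;;
            -Wmost)           flag="-Wall -Wno-parentheses" ;;
        esac
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
}

translate_flags -arch ppc -Oz -Wmost -fpascal-strings -g
# prints: -Os -Wall -Wno-parentheses -g
```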
In the above speed test there was some Spotlight activity so I repeated the
measurement.
Another speed test of TFF7 on a PowerBook G4 1,5 GHz running 10.5.8 set to
highest performance:
TFF7.0b1 7450 (official build) : 55,6 runs/s
TFF7.0b 7400 (based on Fx7.0b4): 59,7 runs/s
Original comment by Tobias.N...@gmail.com
on 9 Sep 2011 at 8:57
I'm investigating moving to gcc-4.2.1 to see how far we get.
So far there have been two issues. One is that the 4.2.1 include files don't
get seen by the Mozilla build system unless we forcibly add the directory,
i.e., in .mozconfig,
CC="gcc-4.2 -arch ppc -I/usr/lib/gcc/powerpc-apple-darwin8/4.2.1/include"
CXX="g++-4.2 -arch ppc -I/usr/lib/gcc/powerpc-apple-darwin8/4.2.1/include"
This appears to work. The other problem is that the assembler, "as", crashes
out complaining about --gdwarf2 as an illegal option while building NSPR.
Tobias' gcc-4.2.1 simply symlinks
/usr/libexec/gcc/powerpc-apple-darwin8/4.2.1/as to /usr/bin/as.
I might write a shim "as" that twiddles options for it. Tobias, I'm not sure if
you will want that shim in your distro.
Original comment by classi...@floodgap.com
on 12 Nov 2011 at 6:03
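A shim of the kind mentioned above might look roughly like the following. This
is not the actual shim posted later in the thread, whose contents aren't shown
here; it is a guess that simply drops "--gdwarf2" and passes everything else
through to the real assembler. The names run_as and REAL_AS are invented for
the sketch, and REAL_AS is pointed at echo so the example can be run without an
assembler.

```shell
#!/bin/sh
# Call the real assembler with "--gdwarf2" removed from the argument list.
# (run_as and REAL_AS are hypothetical names; this is an assumed behaviour,
# not the shim from the thread.)
run_as() {
    n=$#
    # Rotate through the original arguments once, re-appending all but
    # "--gdwarf2" so that quoting of the remaining arguments is preserved.
    while [ "$n" -gt 0 ]; do
        arg=$1
        shift
        [ "$arg" = "--gdwarf2" ] || set -- "$@" "$arg"
        n=$((n - 1))
    done
    "${REAL_AS:-/usr/bin/as}" "$@"
}

# Demonstration with echo standing in for the assembler:
REAL_AS=echo
run_as -arch ppc --gdwarf2 -o foo.o foo.s
# prints: -arch ppc -o foo.o foo.s
```

Installed in place of the symlink, the script would leave REAL_AS unset and
end with run_as "$@". Note that dropping the option means the assembler emits
no DWARF debug info for those files, which may or may not be acceptable for a
DEBUG build.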
Also, this is just to build a DEBUG build -- I haven't tried opt builds with it
at all yet.
Original comment by classi...@floodgap.com
on 12 Nov 2011 at 6:04
Here's the shim. This makes libffi and nsprpub happy.
Original comment by classi...@floodgap.com
on 12 Nov 2011 at 6:15
Attachments:
And the working DEBUG4.2.mozconfig (tested only with js so far):
Original comment by classi...@floodgap.com
on 14 Nov 2011 at 12:32
Attachments:
Well, built a full debug version of the browser in 4.2.1 and it can't stand up
-- one of the scoped_ptr templates in Chromium generates incorrect code,
apparently. It asserts very early in startup.
JS works great, though. Maybe we can merge the two. Or, clang.
Original comment by classi...@floodgap.com
on 19 Nov 2011 at 4:04
Hmm... I built a lot of stuff using that gcc-4.2.1, including gcc-4.6 and many
other things.
I don't remember if I ever built TFF with it.
Did you untar the tarball to / in order to install it?
But with gcc-4.6 it works very well - you already tested a version of TFF built
with it.
Original comment by Tobias.N...@gmail.com
on 23 Nov 2011 at 9:35
Reading my previous comments I see that I did indeed build a full release
version of TFF with gcc-4.2.1 and I did not have any issues with it. Only for
4.6.1 I had to tweak it a little.
There's something wrong - maybe you need to use more recent cctools as well,
since this will update the assembler. I'd recommend using cctools-800. That
might solve some of the issues you encountered.
Original comment by Tobias.N...@gmail.com
on 23 Nov 2011 at 9:42
Original comment by Tobias.N...@gmail.com
on 23 Nov 2011 at 9:53
Could be that, but I would expect that if it were cctools, it might not build
at all (rather than improperly), and some things do work. I'll try that when I
get around to trying 4.2 again. For now 4.0.1 is still working, so we will
still use that as the supported compiler. It looks like we've successfully
worked around the JIT issues that Apple had with 4.0 on x86 (or maybe it was a
bug in the x86 backend).
Explain the modifications you had to make for 4.6.1 (comment 19).
Original comment by classi...@floodgap.com
on 23 Nov 2011 at 9:55
The later cctools will support "--gdwarf2". And I never had to add the gcc
header path manually for any project.
Building with gcc-4.2.1 worked out of the box for me - but I had recent cctools
and ld64 installed in place of the ones shipped with Xcode-2.5.
The same goes for gcc-4.6.1; I only ever used (and also built) it using recent
cctools and ld64 in place of the "original" ones.
I guess you have to upgrade the whole toolchain and not just a part of it.
That's definitely the way to go. Say goodbye to the Xcode-2.5 toolchain and say
hello to the Xcode-3.2 toolchain - as a whole.
The modifications to gcc-4.6 header are in
"/usr/lib/gcc/powerpc-apple-darwin8/4.6.1/include" (maybe also something in
"/usr/lib/gcc/powerpc-apple-darwin8/4.6.1/include"). The header files that
Apple shipped with gcc/g++-4.2.1 aren't compatible with gcc/g++-4.6 so there
was one I had to remove completely and a couple of others I had to add or
modify. Some template stuff in there which I didn't understand completely was
broken as well and I was able to fix it but it still needs passing
"-fpermissive" to g++ in order to compile some stuff (some mozilla files as
well).
Original comment by Tobias.N...@gmail.com
on 23 Nov 2011 at 10:22
That's probably the issue then. I think when I try this again, I'll just jump
straight to 4.6.
Original comment by classi...@floodgap.com
on 23 Nov 2011 at 10:30
I'd like to try a build of TFF 9.0 beta using the Xcode 3.2 toolchain. That
would take me some days to complete. Would you mind uploading the current
patchset?
Original comment by Tobias.N...@gmail.com
on 28 Nov 2011 at 6:51
Give me a week for that -- I'm trying to shake out random crashes from 9. I
have it narrowed down to one of the VMX patches (building G5 opt with
--enable-tenfourfox-g5 but NOT --enable-tenfourfox-vmx yields a stable browser,
and DEBUG is also stable, so it must be a _VMX codepath that is causing
trouble), so I am bisecting to narrow down the culprit(s).
Original comment by classi...@floodgap.com
on 28 Nov 2011 at 3:28
Hmmm... none of those patches in their latest versions crashed 7 for me and
they even worked fine in Firefox 8 on Linux. All in all I ran those browsers
for just a couple of hours.
The only thing I didn't have is FirstNon8Bit().
Original comment by Tobias.N...@gmail.com
on 28 Nov 2011 at 6:24
It's probably something about 9. I have a partially VMXed build working now,
with it narrowed down to one of
FirstNon8Bit()
BrowserStreamParent/Child (not likely, but it appears in the backtraces)
transform-vmx
nsUTF8ToUnicode
jpeg_consume_input
nsUTF8UtilsVMX.cpp
I'm going to enable each of them one by one and see if I can ID the culprit.
Original comment by classi...@floodgap.com
on 29 Nov 2011 at 1:26
Original issue reported on code.google.com by
classi...@floodgap.com
on 5 Apr 2011 at 10:44