Closed kcrisman closed 13 years ago
Can you try the package for PolyBoRi 0.7.1 from #11261? Unfortunately, I do not have access to that platform.
Nice idea! But the error is the same.
I'm not even sure this is a PolyBoRi problem per se. The activity monitor shows that there is still a fair amount of RAM left during this process, and while the CPU is being intensely used, the highest level of use is the unzipping, and otherwise it's not really different from other things.
My very rough guess at this point is that we may want to try setting the optimization level down a bit here, but I don't know how to do that.
Another thing is that if you can give me a VERY easy way to try to build this from scratch, without Sage (though perhaps using Sage Python etc.) on this machine, I could see if that works.
Replying to @kcrisman:
Another thing is that if you can give me a VERY easy way to try to build this from scratch, without Sage (though perhaps using Sage Python etc.) on this machine, I could see if that works.
First of all, enter the Sage environment (this does not start Sage; it just opens a shell with all environment variables and paths defined):
./sage -sh
Then extract the package:
tar -xvjf polybori-0.7.0.p6.spkg
Enter the newly generated directory and start the package build manually:
cd polybori-0.7.0.p6; ./spkg-install
Alternatively, use PolyBoRi's build system directly:
cd polybori-0.7.0.p6/src/polybori-0.7; scons .
(The trailing . is mandatory to build everything.)
Maybe the problem is caused by the nested header inclusions. Perhaps the compiler is running out of stack.
You may try out the following: at the beginning of the headers in the polybori-0.7.0.p6/src/polybori-0.7/*/include directories you'll find something like this:
#ifndef FileName_h_
#define FileName_h_
Sometimes there are #include <header> statements above these lines. It could save stack if you move the #include statements below the #define statements.
Perhaps you can just try this on the headers which are mentioned in the error message above.
Best regards, Alexander
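Since it is not obvious which headers are affected, a small helper can list the candidates. The following is a minimal sketch and not part of PolyBoRi or Sage: the function name list_includes_above_guard is hypothetical, and it only recognises the plain #ifndef guard style shown above (not #if !defined(...) or #pragma once).

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of PolyBoRi or Sage):
# print every *.h file under the directory "$1" whose first #include
# line appears before its first #ifndef line.
list_includes_above_guard() {
    find "$1" -name '*.h' -type f | sort | while IFS= read -r header; do
        awk '
            # An #ifndef reached first: nothing to report for this file.
            /^[ \t]*#[ \t]*ifndef/  { exit }
            # An #include reached first: remember it and stop reading.
            /^[ \t]*#[ \t]*include/ { found = 1; exit }
            # Exit status 0 only when an #include preceded the guard.
            END { exit !found }
        ' "$header" && printf '%s\n' "$header"
    done
}
```

From inside the ./sage -sh shell, one possible call would be: list_includes_above_guard polybori-0.7.0.p6/src/polybori-0.7/polybori/include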
Since I am not sure which of those header files are the 'right' ones, I am reluctant to take a lot of time with that yet. Using the spkg-install does the same thing. Interestingly, using PolyBoRi's system does not start building with BoolePolyRing.cc, as it does with Sage, but rather cuddAPI.c. Is that significant?
But yes, on these older machines, perhaps the specific combination of old gcc, old processor, makes for the problem. Certainly a segmentation fault sounds like a compiler running out of something; I assume this is where the concept of stack overflow comes from (I just know the website...).
Okay, tried it from the build system in PolyBoRi itself. Went very well for quite some time, but then failed on exactly the same file as before (BoolePolyRing.cc), though of course it had done all kinds of other stuff first. Interestingly, it created BoolePolyRing.os fine, it was in making BoolePolyRing.o that the troubles came (as with the Sage build).
I will try to check what my other computer with this setup failed at; I don't think it was the same file.
Here is where a nearly identical machine (also XCode 2.5, but 1.25 GHz instead of 770 MHz, and 1 GB of memory instead of 512 MB) fails, rather further into the compilation.
g++ -o groebner/src/groebner_alg.os -c -O3 -Wno-long-long -Wreturn-type -g -fPIC -ftemplate-depth-100 -O3 -Wno-long-long -Wreturn-type -g -fPIC -fPIC -fvisibility=hidden -DNDEBUG -DHAVE_GD -DHAVE_HASH_MAP -DPACKED -DHAVE_M4RI -DHAVE_GD -DHAVE_IEEE_754 -I/Users/crisman/sage-4.7.rc2/local/include -I/Users/crisman/sage-4.7.rc2/local/include/python2.6 -Ipolybori/include -ICudd/obj -ICudd/util -ICudd/cudd -ICudd/mtr -ICudd/st -ICudd/epd groebner/src/groebner_alg.cc
[address=019fffff pc=003f17d8]
In file included from polybori/include/pbori_traits.h:20,
from polybori/include/pbori_func.h:19,
from polybori/include/BooleSet.h:22,
from polybori/include/CTermStack.h:39,
from polybori/include/CStackSelector.h:23,
from polybori/include/COrderedIter.h:27,
from polybori/include/COrderingFacade.h:25,
from polybori/include/LexOrder.h:20,
from polybori/include/pbori_order.h:25,
from polybori/include/polybori.h:33,
from groebner/src/groebner_alg.h:12,
from groebner/src/groebner_alg.cc:10:
polybori/include/pbori_defs.h:1: internal compiler error: Segmentation Fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://developer.apple.com/bugreporter> for instructions.
scons: *** [groebner/src/groebner_alg.os] Error 1
scons: building terminated because of errors.
Error building PolyBoRi.
real 11m53.429s
user 9m43.204s
sys 1m24.262s
What is the difference between the .o and .os files, anyway? Are both needed for Sage?
As usual: the .os files are prepared for shared libs (compiled with -fPIC).
Did you try out the changes in the src directory I suggested above?
Replying to @alexanderdreyer:
As usual: the .os files are prepared for shared libs (compiled with -fPIC).
Did you try out the changes in the src directory I suggested above?
Well, given that I don't know what .os files are, you might imagine it would take me quite a while to try something like that :)
I have tried moving the #include statements below the #ifndef in two files that seemed likely - COrderedIter.h and pbori_order.h. They are the only two which show up in both traces. I did this 100% by hand, and I hope that running spkg-install from ./sage -sh in that directory will be enough to use the modified files. It already made it past groebner_alg.os, which is a good sign... but this has been bad enough that I will wait to see.
While I'm waiting, I'll ask the dumb question: are all these include statements supposed to be inside the #ifndef usually? If so, why aren't they? If not, why would this make a difference?
Done installing PolyBoRi.
SAGE_ROOT=/Users/crisman/sage-4.7.rc2
(sage subshell) new-host:polybori-0.7.0.p2 crisman$
Well, that seemed to work! Again, I only did it with these two files.
Now I am going to revert the change to COrderedIter.h and keep only the one to pbori_order.h, and try it again.
Another update - the change to pbori_order.h did not fix the problem on the slower machine (the one that failed at g++ -o polybori/src/BoolePolyRing.o) but adding the change to COrderedIter.h did allow it to proceed past that very first file (hasn't finished yet).
And the attempt with only pbori_order.h changed on the newer machine led to the exact same failure as before. So it seems that the #include placement in COrderedIter.h is the problem.
So why is it the problem? I do not want to have to chase this all down again with every PolyBoRi upgrade, as you can imagine. If there is a good reference for why this would make the difference online, that would be great.
And if doing this on all files would help solve it in the future anyway, that would be fine too :)
Yup, changing COrderedIter.h to move the #include statements did the trick on both machines.
If you include something about this - and an explanation!!! since this all seems quite magical to me - on #11261, then that would close this ticket as well. Or a separate spkg update could be made for this. Naturally, I would be willing to test them.
Replying to @kcrisman:
Yup, changing COrderedIter.h to move the #include statements did the trick on both machines.
If you include something about this - and an explanation!!! since this all seems quite magical to me - on #11261, then that would close this ticket as well. Or a separate spkg update could be made for this.
I will update and rebundle #11261 later (I need to rebase it on the new 0.7.0 spkg, when accepted).
The explanation is as follows: Without the patch, the compiler has to open and stack a lot of headers which are immediately closed again, because the current compilation procedure has already entered them. But anyway: they were stacked. With the patch these files are never opened.
Since this never caused problems before, I did not take care of the order of the #include and #ifndef/#define statements. (This will also be fixed upstream soon.)
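The reordering described above can be sketched as a small filter. This is an illustrative sketch only, not the patch that went into the updated spkg: move_includes_below_guard is a hypothetical helper that prints a header to standard output with any #include lines found above the guard moved to just after the guard's #define. It assumes the simple #ifndef/#define guard style and prints files without an #ifndef unchanged.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not the actual PolyBoRi patch):
# print the header "$1" with the #include lines that precede the include
# guard moved to directly below the guard's #define line.
move_includes_below_guard() {
    # Files without any #ifndef are printed unchanged.
    if ! grep -Eq '^[[:space:]]*#[[:space:]]*ifndef' "$1"; then
        cat "$1"
        return
    fi
    awk '
        # Above the completed guard: hold back #include lines.
        !moved && /^[ \t]*#[ \t]*include/ { held[++n] = $0; next }
        # The first #ifndef opens the guard (the line itself is printed below).
        !moved && /^[ \t]*#[ \t]*ifndef/  { in_guard = 1 }
        # The #define after it completes the guard: print it, then the held lines.
        !moved && in_guard && /^[ \t]*#[ \t]*define/ {
            print
            for (i = 1; i <= n; i++) print held[i]
            moved = 1
            next
        }
        { print }
        # Guard never completed: append the held lines so nothing is lost.
        END { if (!moved) for (i = 1; i <= n; i++) print held[i] }
    ' "$1"
}
```

The output goes to standard output so that the result can be compared with the original header before anything is overwritten.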
Naturally, I would be willing to test them.
Nice, please have a look here: http://boxen.math.washington.edu/home/dreyer/spkg/polybori-0.7.0.p3.spkg
Attachment: polybori-0.7.0.p3-vs-p2.patch.gz
spkg-patch (just to simplify reviewing)
Upstream: Fixed upstream, in a later stable release.
Author: Alexander Dreyer
Hi,
what a nice "ready-to-go" new spkg and "p3-vs-p2" patch, presented on "a silver tablet" --- I just couldn't resist!
Both my MacIntel and my MacPPC systems use OS X 10.4.11 with XCode 2.5 (the MacIntel has a Core2Duo CPU with 2 GHz and 2 GB RAM, the MacPPC has an older G4 CPU with 550 MHz and 768 MB RAM). These are the findings from my side (all one-time tries only):
On my MacIntel with Sage-4.7.alpha4 (and polybori-0.7.0.p2), building from scratch went fine, but with Sage-4.7.rc2 (and the very same polybori-0.7.0.p2), building from scratch broke with the described internal compiler error.
On my MacPPC with Sage-4.7.rc2 (and still polybori-0.7.0.p2), building from scratch went fine(!!).
So the issue of this ticket firstly does not seem to hit in 100% of the cases, and secondly seems to affect both the MacIntel and MacPPC platforms ...
I updated on both systems the Sage-4.7.rc2 install with the new polybori-0.7.0.p3 spkg of this ticket, and on both systems they did build fine. On the MacIntel, "make testlong" finished in the meantime, and passed fine (except for the known old "maxima.py" issue, which is unrelated).
The SPKG.txt is updated correctly, even the mercurial repository looks good, excellent! The only downside might be that the problem still lurks, since only more testing could really give confidence. But the best way to achieve the latter is to drop this spkg into the mainline code base, which is justified, because regressions are hardly to be expected from looking at the tiny (and very local) changes.
All in all: positive review.
Reviewer: Georg S. Weber
Wow, great review, Georg. How bizarre about the Intel/PPC thing. But is your PPC a G5 or G4? We only saw it on G4 machines until this report.
David Kirkby would surely warn us about more compiler madness waiting in the wings if we don't fix the headers, but I know nothing of such things, so this all looks great. I also just independently checked this worked by dropping in this spkg in a freshly untarred rc2 on one of the failing machines, so considering it worked (though not from scratch) on the only other machine I saw it on, this should be very positive review indeed. I'm adding myself in to the reviewers based on the previous work, if that's okay.
Changed reviewer from Georg S. Weber to Georg S. Weber, Karl-Dieter Crisman
Replying to @kcrisman:
But is your PPC a G5 or G4?
It's a 550 MHz G4 PowerPC, the "older one" of the G4's used ("7400" for e.g. TenFourFox)
I'm adding myself in to the reviewers based on the previous work, if that's okay.
Sure! Having slept over it, I myself felt that I should not have put only my name in that reviewer field, but yours, too. But you were even faster ...
Description changed:
---
+++
@@ -4,7 +4,7 @@
Depending on the machine and the version, the error seems to hit at different places, but in the end it's always failing. See for example [this sage-release thread](http://groups.google.com/group/sage-release/browse_thread/thread/b41ef4f3dd2c1be0/127b263a05c1cbab?show_docid=127b263a05c1cbab).
-It appears to be a compiler bug, perhaps a memory issue.
+It is a bug in gcc.
This is with sage-4.7.rc0 through rc2. alpha5 is unaffected, which is truly bizarre. Since Singular's latest (rc2, p9) builds fine on these machines, it might be the bzip2 package upgrade that is the issue, unlikely though this may seem.
Description changed:
---
+++
@@ -241,3 +241,5 @@
Error building Sage.
make: *** [build] Error 1
+
+New spkg: http://boxen.math.washington.edu/home/dreyer/spkg/polybori-0.7.0.p3.spkg
AlexanderDryer: could you please upload the spkg without the added hg tag? My merge script gets confused with the already-added tag.
Replying to @jdemeyer:
AlexanderDryer: could you please upload the spkg without the added hg tag? My merge script gets confused with the already-added tag.
No problem! The new pkg is at the same location: http://boxen.math.washington.edu/home/dreyer/spkg/polybori-0.7.0.p3.spkg
Hmm, the only difference I can see is that "hg tags" outputs one line less than before (the line with "polybori-0.7.0.p3" is now missing), i.e. the hidden file ".hgtags" has one line less. I didn't know that was important, sorry. But I guess this is what was meant and needed, so positive review renewed.
Replying to @sagetrac-GeorgSWeber:
Hmm, the only difference I can see is that "hg tags" outputs one line less than before (the line with "polybori-0.7.0.p3" is now missing), i.e. the hidden file ".hgtags" has one line less. I didn't know that was important, sorry.
No need to apologize. I am rewriting the spkg handling of my merger script. One change is that it now automatically adds a hg tag, and an already-existing tag will confuse the scripts. Anyway, thanks for the review.
Replying to @kcrisman: Why changing the milestone so late? I already assumed it was going into the sage-4.7.1 cycle.
Why changing the milestone so late? I already assumed it was going into the sage-4.7.1 cycle.
I think this was probably opened after the default milestone was 4.7.1; similarly with the priority. I never realized it wouldn't make it in. But this does prevent building of Sage, so I guess I kind of assumed it should be in an 'official' release. Especially if we want to produce binaries for 4.7, since several of the machines people use to do this would be affected.
Merged: sage-4.7.rc4
With both XCode 2.4.1 and 2.5, builds consistently fail at PolyBoRi on Mac OS X 10.4 on PowerPC G4 chips.
No other platforms seem to be affected, including G5 chips and 10.5 on a G4.
Depending on the machine and the version, the error seems to hit at different places, but in the end it's always failing. See for example this sage-release thread.
It is a bug in gcc.
This is with sage-4.7.rc0 through rc2. alpha5 is unaffected, which is truly bizarre. Since Singular's latest (rc2, p9) builds fine on these machines, it might be the bzip2 package upgrade that is the issue, unlikely though this may seem.
E.g.,
New spkg: http://boxen.math.washington.edu/home/dreyer/spkg/polybori-0.7.0.p3.spkg
Upstream: Fixed upstream, in a later stable release.
CC: @jdemeyer @sagetrac-GeorgSWeber @alexanderdreyer @sagetrac-PolyBoRi
Component: build
Author: Alexander Dreyer
Reviewer: Georg S. Weber, Karl-Dieter Crisman
Merged: sage-4.7.rc4
Issue created by migration from https://trac.sagemath.org/ticket/11331