Closed malb closed 14 years ago
Hi Simon,
These messages ("File not found") are more or less typical when the doctests time out. I think people are working on improving that part of doctest reporting. Since t2 can be pretty slow, some timeouts are not unusual. I wouldn't worry about the ones you saw. If you really want to try again, try `export SAGE_TIMEOUT_LONG=6000` to change the timeout for long tests from 1800 seconds to 6000 seconds; then you shouldn't see these. (6000 seconds is a lot more than you should need, even on t2.)
Replying to @jhpalmieri:
These messages ("File not found") are more or less typical when the doctests time out. ... If you really want to try again, try
export SAGE_TIMEOUT_LONG=6000
OK, doing it now.
Thanks, Simon
I've just edited `/usr/local/gcc-4.4.1-sun-linker/gcc441sun`, the file I recommend people source, and added to it:
SAGE_TIMEOUT=1000
export SAGE_TIMEOUT
SAGE_TIMEOUT_LONG=6000
export SAGE_TIMEOUT_LONG
I've found on my own 900 MHz SPARC that the default SAGE_TIMEOUT_LONG (1800 s) is just about long enough, but the default SAGE_TIMEOUT (360 s) is far too short. I'm not even sure if 1000 seconds for SAGE_TIMEOUT is enough on t2, but it probably is.
Obviously one can unset those variables if one wants, but it would seem sensible to have the defaults so tests should pass.
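A minimal sketch of how a sourced file could provide these defaults while still respecting values the user has already set. It assumes a POSIX-style shell; the numbers are the ones quoted above, not tuned recommendations.

```bash
# Assign only when the variable is unset or empty, so a user's own
# setting is left alone.
: "${SAGE_TIMEOUT:=1000}"
: "${SAGE_TIMEOUT_LONG:=6000}"
export SAGE_TIMEOUT SAGE_TIMEOUT_LONG
echo "SAGE_TIMEOUT=${SAGE_TIMEOUT} SAGE_TIMEOUT_LONG=${SAGE_TIMEOUT_LONG}"
```

With this form, exporting a different value before sourcing the file overrides the default without editing the file.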
Dave
Replying to @simon-king-jena:
Replying to @jhpalmieri:
These messages ("File not found") are more or less typical when the doctests time out. ... If you really want to try again, try
export SAGE_TIMEOUT_LONG=6000
OK, doing it now.
Again, the setting is: Sage 4.5.1 plus singular-3-1-1-4.patch plus singular-3-1-1-4.spkg. `sage -ptestlong` on t2 with `export SAGE_TIMEOUT_LONG=6000` results in exactly one doctest failure:
sage -t -long devel/sage/sage/parallel/decorate.py
**********************************************************************
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/devel/sage-main/sag
e/parallel/decorate.py", line 152:
sage: v = list(f([1,2,4])); v.sort(); v
Exception raised:
Traceback (most recent call last):
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/bin/nca
doctest.py", line 1231, in run_one_test
self.run_one_example(test, example, filename, compileflags)
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/bin/sag
edoctest.py", line 38, in run_one_example
OrigDocTestRunner.run_one_example(self, test, example, filename, compile
flags)
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/bin/nca
doctest.py", line 1172, in run_one_example
compileflags, 1) in test.globs
File "<doctest __main__.example_4[9]>", line 1, in <module>
v = list(f([Integer(1),Integer(2),Integer(4)])); v.sort(); v###line 152:
sage: v = list(f([1,2,4])); v.sort(); v
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/lib/pyt
hon/site-packages/sage/parallel/multiprocessing_sage.py", line 64, in parallel_i
ter
p = Pool(processes)
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/lib/pyt
hon2.6/multiprocessing/__init__.py", line 227, in Pool
return Pool(processes, initializer, initargs)
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/lib/pyt
hon2.6/multiprocessing/pool.py", line 104, in __init__
w.start()
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/lib/pyt
hon2.6/multiprocessing/process.py", line 104, in start
self._popen = Popen(self)
File "/scratch/sking/sage-4.5.1-Solaris_10_SPARC-sun4u-SunOS/local/lib/pyt
hon2.6/multiprocessing/forking.py", line 94, in __init__
self.pid = os.fork()
OSError: [Errno 12] Not enough space
**********************************************************************
1 items had failures:
1 of 12 in __main__.example_4
***Test Failed*** 1 failures.
For whitespace errors, see the file /home/SimonKing/.sage//tmp/.doctest_decorate
.py
[46.5 s]
So, not enough space. Does that mean the hard disc (or at least `/scratch` or my home directory) is too full?
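For context: errno 12 is ENOMEM, which Solaris prints as "Not enough space", so a failing `os.fork()` more likely points to a shortage of memory or swap than to a full disk. A quick way to look at both, as a sketch only (the function name `check_space` and the path passed to it are placeholders; `swap -s` is Solaris-specific and is skipped where the command does not exist):

```bash
#!/bin/bash
# Report disk usage for the given directories and, where available,
# the Solaris virtual-memory summary.
check_space() {
    df -k "$@"
    if command -v swap >/dev/null 2>&1; then
        swap -s
    fi
}

check_space "$HOME"
```

A test that passes when rerun on its own, as it does below, is consistent with a temporary memory shortage while many doctests were running in parallel.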
Anyway, I repeated the failing test separately, and then it succeeded:
sage subshell$ sage -t -long devel/sage/sage/parallel/decorate.py
sage -t -long "devel/sage/sage/parallel/decorate.py"
[48.8 s]
----------------------------------------------------------------------
All tests passed!
Total time for all tests: 48.8 seconds
So, can one say that all tests pass on t2?
Looking at the previous posts, it seems that the status is:
Do you agree that this is a positive review once the patch is rebased, and that William, John and myself should be added to the list of referees?
Cheers,
Simon
Changed reviewer from David Kirkby to David Kirkby, Simon King, William Stein, John Palmieri
Work Issues: rebase patch
Do you agree that this is a positive review once the patch is rebased, and that William, John and myself should be added to the list of referees?
That sounds okay to me. The rebasing should perhaps be coordinated with #9599 (the re-merging of #1396).
Replying to @jhpalmieri:
Do you agree that this is a positive review once the patch is rebased, and that William, John and myself should be added to the list of referees?
That sounds okay to me. The rebasing should perhaps be coordinated with #9599 (the re-merging of #1396).
I don't think that much co-ordination is needed here. Apparently this ticket is more important than #9599, and #9599 will certainly not happen without the patch from here.
So, I'll simply wait until a rebased version of singular-3-1-1-4.patch is available, and will then base on it a new version of my patch from #1396 for re-merging -- which then needs to be tested for segfault on t2, of course.
With #1396 backed out, this applies, so it doesn't need to be rebased. (One hunk applies "with fuzz".) The commit message doesn't include the trac number, but that's the only problem.
So should it get a positive review now?
Replying to @jhpalmieri:
With #1396 backed out, this applies, so it doesn't need to be rebased. (One hunk applies "with fuzz".)
Good!
The commit message doesn't include the trac number, but that's the only problem.
Well, I recently got a "needs work" for the same reason...
So should it get a positive review now?
I wouldn't oppose. Any veto?
Replying to @simon-king-jena:
Replying to @jhpalmieri:
With #1396 backed out, this applies, so it doesn't need to be rebased. (One hunk applies "with fuzz".)
Good!
The commit message doesn't include the trac number, but that's the only problem.
Well, I recently got a "needs work" for the same reason...
So should it get a positive review now?
I wouldn't oppose. Any veto?
Me neither. I'm going to set it to positive review.
Dave
Changed work issues from rebase patch to none
I'm working on 4.5.3.alpha0, which will contain a mix of spkg updates and repository patches from report {32}. Should I attempt to merge this ticket into 4.5.3 or wait for a dedicated Sage release? Also, can someone indicate exactly which spkg and patch to apply? Thanks!
Also also: Please ensure that any patches to merge include the ticket number in the first lines of their commit strings.
Replying to @qed777:
I'm working on 4.5.3.alpha0, which will contain a mix of spkg updates and repository patches from report {32}. Should I attempt to merge this ticket into 4.5.3 or wait for a dedicated Sage release? Also, can someone indicate exactly which spkg and patch to apply? Thanks!
Does this have to be called 4.5.3? I'd feel a lot happier calling it 4.6.0 if there's a major upgrade like Singular. But I think you should try to upgrade Singular.
I don't see the need of a dedicated release - there are lots of updates that have almost no chance of conflicting with this.
I guess I'm one of the people who believe increments in the last digit are just minor changes (bug fixes) and not major new components. It's just the release number I don't like - I think it's right to merge this, and some .spkg updates. Pari seems to have stalled again, so I'd go for Singular.
That said, I seem to be in a minority who feel version numbers should reflect the sort of updates that take place. Almost everyone seems happy with random numbers!
Dave
Hi!
Replying to @qed777:
Also, can someone indicate exactly which spkg and patch to apply? Thanks!
I think it is singular-3-1-1-4.patch (but this needs to be rebased) and http://sage.math.washington.edu/home/malb/spkgs/singular-3-1-1-4.spkg
Best regards, Simon
Replying to @simon-king-jena:
Hi!
Replying to @qed777:
Also, can someone indicate exactly which spkg and patch to apply? Thanks!
I think it is singular-3-1-1-4.patch (but this needs to be rebased) and http://sage.math.washington.edu/home/malb/spkgs/singular-3-1-1-4.spkg
I think that's right, and I don't think it needs to be rebased. At least, I've built on a few machines using this combination, and the patch applied cleanly (albeit with some fuzz), and tests passed.
Attachment: singular-3-1-1-4.2.patch.gz
Updated commit string. Use with `singular-3-1-1-4.spkg`.
Thanks. I'll use
There might still be a problem with parallel builds. With 4.5.2 on sage.math, I applied attachment: singular-3-1-1-4.2.patch, copied `singular-3-1-1-4.spkg` to `SAGE_ROOT`, and ran
#!/bin/bash
set -o pipefail
JOBS=20
RUNS=50
for I in $(seq $RUNS); do
    LOG="singular-3-1-1-4-j$JOBS.log.$I"
    if [ ! -f "$LOG" ]; then
        env MAKE="make -j$JOBS" ./sage -f singular-3-1-1-4.spkg 2>&1 | tee "$LOG"
        CODE=$?
        echo $0 run $I of $RUNS: code= $CODE
    fi
done
All runs ended with exit code 0 (maybe I didn't retrieve the code correctly?), but
grep "An error occurred" singular-3-1-1-4*log* | sort -n
shows that 21 of the 50 runs failed.
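With `set -o pipefail`, `$?` should already reflect a failing command in the pipeline, so an exit code of 0 suggests that the install command itself returned 0 despite the error. A hedged sketch of a wrapper that checks both the command's own status (via bash's `PIPESTATUS`) and the log text; the function name `run_logged` and the stand-in command are made up for illustration:

```bash
#!/bin/bash
# Run a command, copy its output to a log, and return a failing status if
# either the command itself failed or the log contains the error marker.
run_logged() {
    local log="$1" code
    shift
    "$@" 2>&1 | tee "$log"
    code=${PIPESTATUS[0]}          # status of the command, not of tee
    if [ "$code" -eq 0 ] && grep -q "An error occurred" "$log"; then
        code=1                     # exit status said success, log says otherwise
    fi
    return "$code"
}

# Stand-in for `./sage -f singular-3-1-1-4.spkg`: prints the marker, exits 0.
LOG=$(mktemp)
if run_logged "$LOG" sh -c 'echo "An error occurred"; exit 0'; then
    echo "run reported success"
else
    echo "run reported failure"
fi
```

In the loop above, the call would take the place of the `env MAKE=... | tee "$LOG"` line, with `CODE=$?` read directly afterwards.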
I've put the logs here.
According to
$ grep "No such file or dir" sing*log* | grep -v "cannot remove" | cut -d ':' -f 2- | sort | uniq -c
1 abs_fac.cc:2:21: error: factory.h: No such file or directory
1 bifac.cc:1:21: error: factory.h: No such file or directory
1 cntrlc.o: No such file or directory
11 g++: cntrlc.o: No such file or directory
28 g++: extra.o: No such file or directory
4 g++: feOpt.o: No such file or directory
1 g++: g++: cntrlc.o: No such file or directory
4 g++: g++: extra.o: No such file or directoryextra.o: No such file or directory
2 g++: misc_ip.o: No such file or directory
1 lgs.h:13:21: error: factory.h: No such file or directory
and some digging, the `gcc -o gentable` and `gcc -o gentable2` commands, at least, sometimes don't have all of their dependencies already built. For example, `singular-3-1-1-4-j20.log.14` contains
g++ -O3 -g -fPIC -pipe -I. -I../kernel -I/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2-singular/local/include -I/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2-singular/local/include -I/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2
-singular/local/include -fno-implicit-templates -DNDEBUG -DOM_NDEBUG -Dx86_64_Linux -DHAVE_CONFIG_H -DGENTABLE \
-o gentable1 claptmpl.o iparith.cc tesths.cc mpsr_Tok.cc \
grammar.o scanner.o attrib.o eigenval_ip.o extra.o fehelp.o feOpt.o ipassign.o ipconv.o ipid.o iplib.o ipprint.o ipshell.o lists.o sdb.o fglm.o interpolation.o silink.o subexpr.o janet.o wrapper.o
libparse.o sing_win.o gms.o pcv.o maps_ip.o walk.o walk_ip.o cntrlc.o misc_ip.o calcSVD.o Minor.o MinorProcessor.o MinorInterface.o slInit_Dynamic.o -ldl -rdynamic -L../kernel -lkernel -L/mnt/usb1/scratch/mpat
el/tmp/sage-4.5.2-singular/local/lib -L/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2-singular/local/lib -lm -lsingfac -lsingcf -lntl -lgmp -lreadline -lncurses -lm -lomalloc ../kernel/mmalloc.o
g++ -O3 -g -fPIC -pipe -I. -I../kernel -I/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2-singular/local/include -I/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2-singular/local/include -I/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2
-singular/local/include -fno-implicit-templates -DNDEBUG -DOM_NDEBUG -Dx86_64_Linux -DHAVE_CONFIG_H -DGENTABLE \
-o gentable2 claptmpl.o iparith.cc tesths.cc mpsr_Tok.cc \
grammar.o scanner.o attrib.o eigenval_ip.o extra.o fehelp.o feOpt.o ipassign.o ipconv.o ipid.o iplib.o ipprint.o ipshell.o lists.o sdb.o fglm.o interpolation.o silink.o subexpr.o janet.o wrapper.o
libparse.o sing_win.o gms.o pcv.o maps_ip.o walk.o walk_ip.o cntrlc.o misc_ip.o calcSVD.o Minor.o MinorProcessor.o MinorInterface.o slInit_Dynamic.o -ldl -rdynamic -L../kernel -lkernel -L/mnt/usb1/scratch/mpat
el/tmp/sage-4.5.2-singular/local/lib -L/mnt/usb1/scratch/mpatel/tmp/sage-4.5.2-singular/local/lib -lm -lsingfac -lsingcf -lntl -lgmp -lreadline -lncurses -lm -lomalloc ../kernel/mmalloc.o
g++: extra.o: No such file or directory
g++: extra.o: No such file or directory
Please see `singular-3-1-1-4-j20.log.47` for the `factory.h` errors.
If I didn't make a mistake: What if we return to serial builds for this ticket but open a new one for building Singular in parallel?
I hope this ticket does finally get merged. It solves all my Solaris issues, and has fewer patches.
However, I would appreciate if someone could review #9397, which does not update Singular, but has a couple of changes needed to get 64-bit builds of Singular on Solaris.
All the changes are in this ticket, so if this ticket gets merged, then there's no need for #9397. But clearly there is a possibility this ticket will cause problems and not get merged, so it would be nice if #9397 could be merged in that case. But it needs review.
Dave
I would like to mention at this stage that we have adopted this patch in sage-on-gentoo and use it with singular-3.1.1.4 from the system (we had to fix the path for dlopen in singular.pyx - the only place in the whole sage spkg where there is a dlopen call). We are quite happy with it on x86 and amd64. I should fully test on ppc as well, probably over the next weekend (it takes that much time).
Attachment: trac_8059_spkg-restore_serial_build.patch.gz
Restore serial builds. SPKG repo patch.
I've made
http://sage.math.washington.edu/home/mpatel/trac/8059/singular-3-1-1-4.p0.spkg
which restores building the package in serial, for now. The changes are in attachment: trac_8059_spkg-restore_serial_build.patch. The p0 package installs consistently for me on sage.math (47 good runs out of 47, so far).
Thoughts?
I think that changing to serial is fine for now. I just re-ran some of your tests, but with `JOBS=4` instead of `JOBS=20`. It worked fine. I've seen this before with some other packages, especially on t2: if you build with too many threads, it doesn't work, but it's fine with not as many. (For example, mpir pretty consistently doesn't seem to work for me on t2 with MAKE='make -j12', but MAKE='make -j4' is fine.)
Is this ready for review?
For what it's worth, with 4.5.3.alpha0 + singular-3-1-1-4.p0.spkg + attachment: singular-3-1-1-4.2.patch, the long doctests pass for me on bsd and sage.math.
Since the only change is to disable parallel building, I think we can restore the positive review.
I think this issue occurs only if you have a lot of CPU cores but slow hard disks. Anyway, I think the patch at http://www.singular.uni-kl.de:8002/trac/ticket/250 should cure it.
Replying to @alexanderdreyer:
I think this issue occurs only if you have a lot of CPU cores but slow hard disks. Anyway, I think the patch at http://www.singular.uni-kl.de:8002/trac/ticket/250 should cure it.
That looks a very trivial change. If I understand correctly, just one line
$(basefactorysrc:.cc=.o): factory.h
is added. But it might be better to put it on another ticket, otherwise this could drag on for ages - the ticket is already 7-months old.
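To illustrate what such a line does, here is a toy reproduction rather than the Singular makefile: a generated header is declared as an explicit prerequisite of every object file, so `make -j` cannot start a "compile" before the header exists. It assumes GNU make; the file names, the `SRC` variable, and the `cp`/`cat` stand-ins for the generator and compiler are all invented for the example.

```bash
#!/bin/bash
set -e
dir=$(mktemp -d)
touch "$dir/a.cc" "$dir/b.cc" "$dir/factory.template"

# One-line recipes (after ';') avoid the need for literal tab characters.
cat > "$dir/Makefile" <<'EOF'
SRC = a.cc b.cc
all: $(SRC:.cc=.o)
# Generated header (cp stands in for the real generator).
factory.h: factory.template ; cp factory.template factory.h
# Stand-in "compile" step that reads the header.
%.o: %.cc ; cat factory.h $< > $@
# The explicit prerequisite: every object waits for factory.h, even under -j.
$(SRC:.cc=.o): factory.h
EOF

make -C "$dir" -j4
```

Without the last rule, nothing tells make that the objects need `factory.h`, so a parallel build may run the compile step first and fail with "No such file or directory", as in the logs above.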
Singular is one of the slowest packages to build for me on my Ultra 27, taking about 8 minutes if I recall correctly. But I'm concerned that delaying this much longer will cause more problems than a few minutes of wall time will.
BTW, I've run with 1000 threads on systems with only 4 cores. It's not optimal, but does allow a reasonable simulation of larger parallel builds to be made.
A big problem in my opinion, is that the server disk.math, which serves all the home directories has been mis-configured to increase the speed of NFS. So failures on home directories might not be code errors, but simply mis-configuration of the server.
Dave
Yeah, this should be another ticket. BTW, I also had to add another trivial fix for the gentable issue during parallel builds.
The parallel build issue is now #9733.
Description changed:
---
+++
@@ -1,3 +1,6 @@
The Singular team accepted most of our patches upstream. They are in the 3-1-0-9 release, which also is a first step to make things easier for library developers.
+How to apply the patches to [sage-4.5.3.alpha0](http://sage.math.washington.edu/home/release/sage-4.5.3.alpha0/sage-4.5.3.alpha0.tar):
+* Install the new Singular spkg: [http://sage.math.washington.edu/home/mpatel/trac/8059/singular-3-1-1-4.p0.spkg](http://sage.math.washington.edu/home/mpatel/trac/8059/singular-3-1-1-4.p0.spkg)
+* Apply [singular-3-1-1-4.2.patch](https://github.com/sagemath/sage-prod/files/10647769/singular-3-1-1-4.2.patch.gz) to `devel/sage`
Merged: sage-4.5.3.alpha0
Changed merged from sage-4.5.3.alpha0 to sage-4.5.3.alpha1
Almost all generated files (`configure`, `grammar.cc` etc.) again carry the same time stamp as their sources... :(
I wonder if we should touch all of them in `spkg-install`, not just two.
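A rough sketch of what touching the pre-generated files could look like. The helper name `touch_generated` and the demo file names are illustrative only; the actual list of generated files in the Singular sources would have to be filled in.

```bash
#!/bin/bash
# Refresh the timestamps of pre-generated files so that make treats them
# as newer than their sources and does not try to regenerate them.
touch_generated() {
    local f
    for f in "$@"; do
        if [ -f "$f" ]; then
            touch "$f"
        fi
    done
}

# Demonstration in a scratch directory with made-up file names.
demo=$(mktemp -d)
touch -t 201001010000 "$demo/configure.in" "$demo/configure"   # same old time stamp
touch_generated "$demo/configure" "$demo/grammar.cc"           # grammar.cc absent: skipped
if [ "$demo/configure" -nt "$demo/configure.in" ]; then
    echo "configure is now newer than configure.in"
fi
```

Skipping missing files keeps the helper harmless if upstream renames or drops one of the generated files.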
Replying to @nexttime:
Almost all generated files (`configure`, `grammar.cc` etc.) again carry the same time stamp as their sources... :(
I don't know what method they use to create the tarball, but it's hard to see how this can happen. Unless they touch all files before releasing, it's hard to understand how they can create a configure script with the same timestamp as the file it is generated from.
Perhaps they copy them with `cp` at one point, and don't preserve modification times (`cp -p`). Whatever it is they are doing, it is wrong.
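The difference between the two copy modes can be seen in a small experiment (a sketch using temporary files; nothing here is taken from the Singular release process). A plain `cp` gives each copy a fresh modification time, so files copied in one go end up with practically identical time stamps, while `cp -p` carries the original time over.

```bash
#!/bin/bash
src=$(mktemp)
touch -t 201001010000 "$src"      # give the source an old modification time
cp "$src" "$src.plain"            # mtime becomes the time of the copy
cp -p "$src" "$src.preserved"     # mtime copied from the source
ls -l "$src" "$src.plain" "$src.preserved"
```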
I wonder if we should touch all of them in `spkg-install`, not just two.
Anything that should be touched should be.
I thought the upstream developers had learned at Sage Days 23.5, but perhaps 3.1.1.4 was just prepared too early...
No comment.
BTW, is this really 3-1-1-4, or is it a snapshot? If the latter, then it's quite possible someone in the Sage community created the configure script and did not preserve modification times. In which case, the upstream developers are not to blame.
Dave
Almost all generated files (`configure`, `grammar.cc` etc.) again carry the same time stamp as their sources... :(
These files can be generated, but there is no need to do so (if you are not a Singular-kernel developer). In fact they are pre-generated in the repo: http://www.singular.uni-kl.de/svn/trunk/ (That's why the time stamps are equal.)
The reason for this is that the generators for these files (e.g. specific version of autotools and lex) are not available on all supported platforms.
Of course, there are attempts to change this, but this cannot be done within 2.5 days. If (and only if) the tools of the Sage distribution are capable of generating these files, then the correct solution would be to remove them completely during patching.
I thought the upstream developers had learned at Sage Days 23.5, but perhaps 3.1.1.4 was just prepared too early...
This wasn't a topic at SD23.5.
Replying to @alexanderdreyer:
Almost all generated files (`configure`, `grammar.cc` etc.) again carry the same time stamp as their sources... :(
These files can be generated, but there is no need to do so (if you are not a Singular-kernel developer). In fact they are pre-generated in the repo: http://www.singular.uni-kl.de/svn/trunk/ (That's why the time stamps are equal.)
The reason for this is that the generators for these files (e.g. specific version of autotools and lex) are not available on all supported platforms.
Of course, there are attempts to change this, but this cannot be done within 2.5 days. If (and only if) the tools of the Sage distribution are capable of generating these files, then the correct solution would be to remove them completely during patching.
IIRC, I got a message that it would try to generate the files using my autoconf, which is an older version, but it might not work. That's very dangerous.
Also, this means that if the tools exist, the files get generated needlessly, slowing the build.
But it will not take 2.5 days to touch them in the spkg-install file.
Dave
Replying to @alexanderdreyer:
Almost all generated files (`configure`, `grammar.cc` etc.) again carry the same time stamp as their sources... :(
These files can be generated, but there is no need to do so (if you are not a Singular-kernel developer). In fact they are pre-generated in the repo: http://www.singular.uni-kl.de/svn/trunk/ (That's why the time stamps are equal.)
And exactly that can (and did) cause problems; the distributed generated files should be
See #9160 comment:22 (and #9160 comment:20 for the latter issue, or the whole ticket...) and also this thread on sage-devel.
Unfortunately we hadn't had the time to add more comments on this to `SPKG.txt`, but I expected the people further working on the Singular spkg to be aware of these issues, or to take a look at previous tickets related to the spkg. (And I didn't edit the SD 23.5 wiki page, because at that time it didn't seem appropriate, but asked other people to propagate these things.)
The reason for this is that the generators for these files (e.g. specific version of autotools and lex) are not available on all supported platforms.
Of course, there are attempts to change this, but this cannot be done within 2.5 days. If (and only if) the tools of the Sage distribution are capable of generating these files, then the correct solution would be to remove them completely during patching.
I thought the upstream developers had learned at Sage Days 23.5, but perhaps 3.1.1.4 was just prepared too early...
This wasn't a topic at SD23.5.
Obviously unfortunately not, see above. I expected it to be one.
Obviously unfortunately not, see above. I expected it to be one.
#9160 was neither reported upstream, nor were the spkg maintainers CCed.
Replying to @alexanderdreyer:
Obviously unfortunately not, see above. I expected it to be one.
#9160 was neither reported upstream, nor were the spkg maintainers CCed.
Well, the spkg maintainer (and some of the people involved in this ticket here) actually participated in the discussion on sage-devel (which referred to #9160).
(And I remember the spkg maintainer telling me "You don't need to cc anybody, we'll look through the tickets anyway..." ;-) )
That's why I had been a bit annoyed or disappointed, and the reason for the flame.
(And I remember the spkg maintainer telling me "You don't need to cc anybody, we'll look through the tickets anyway..." ;-) )
I guess it was not clear that this is to be fixed upstream. (It's merely a packaging problem.)
That's why I had been a bit annoyed or disappointed, and the reason for the flame.
No reason to flame the upstream developers, in particular since the ticket stated that it had not been reported to them.
The Singular team accepted most of our patches upstream. They are in the 3-1-0-9 release, which also is a first step to make things easier for library developers.
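How to apply the patches to sage-4.5.3.alpha0:
* Install the new Singular spkg: http://sage.math.washington.edu/home/mpatel/trac/8059/singular-3-1-1-4.p0.spkg
* Apply singular-3-1-1-4.2.patch to `devel/sage`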
Upstream: Reported upstream. Little or no feedback.
CC: @sagetrac-PolyBoRi @sagetrac-drkirkby @aghitza @jaapspies @kiwifb @alexanderdreyer
Component: packages: standard
Author: Martin Albrecht
Reviewer: David Kirkby, Simon King, William Stein, John Palmieri
Merged: sage-4.5.3.alpha1
Issue created by migration from https://trac.sagemath.org/ticket/8059