Closed strogdon closed 8 years ago
I have seen that SIGCHLD problem with ecl before; where was it?
Version of ecls and useflags please.
ledaig ~ # eix -I ecls
[I] dev-lisp/ecls
Available versions: 9.12.3 ~10.4.1 ~11.1.1-r1 ~12.2.1 ~12.7.1 ~12.12.1 ~12.12.1-r4^m[1] ~12.12.1-r5(0/12.12.1)^m (~)12.12.1-r5(0/12.12.1)^m[2] ~13.5.1(0/13.5.1) 13.5.1-r1(0/13.5.1)^t ~15.3.7(0/15.3.7)^t {X debug doc emacs gengc +libatomic precisegc sse (+)threads +unicode CPU_FLAGS_X86="sse"}
Installed versions: 13.5.1-r1^t(12:06:03 PM 06/02/2015)(X debug threads unicode -emacs -gengc -precisegc CPU_FLAGS_X86="sse")
Homepage: http://ecls.sourceforge.net/
Description: ECL is an embeddable Common Lisp implementation
[1] "lisp" /var/lib/layman/lisp
[2] "local-overlay" /usr/local/portage
I don't think the specific example
sage: g(x) = x^2-2*x-2
sage: plot(1/g(x), (x, -3, 4), exclude = g(x) == 0, ymin = -5, ymax = 5) # long time
Graphics object consisting of 3 graphics primitives
is the problem, since the above produces a .png plot when issued from the Sage prompt.
Hum, I have been on 15.3.7 for a while now. I suspect your prefix is on 15.3.7. I think the threads useflag is responsible for that behavior, but I am not sure.
This is what I have in prefix
strogdon@blitzen ~ $ eix -I ecls
[I] dev-lisp/ecls
Available versions: *9.12.3 ~*10.4.1 ~*11.1.1-r1 ~*12.2.1 12.7.1 12.12.1 ~*12.12.1-r5(0/12.12.1)^m [m](~*)12.12.1-r6(0/12.12.1)^m[1] ~*13.5.1(0/13.5.1) (*)13.5.1-r1(0/13.5.1)^t [m](*)13.5.1-r1(0/13.5.1)^t[1] ~*15.3.7(0/15.3.7)^t {X debug doc emacs gengc +libatomic precisegc sse (+)threads +unicode CPU_FLAGS_X86="sse"}
Installed versions: 13.5.1-r1^t(07:05:23 PM 02/04/2015)(threads unicode -X -debug -emacs -gengc -precisegc CPU_FLAGS_X86="sse")
Homepage: http://ecls.sourceforge.net/
Description: ECL is an embeddable Common Lisp implementation
[1] "local-overlay" /storage/strogdon/local-overlay
I guess the debug useflag may deserve attention.
debug is now gone. Rebuilt sage. Issue still present. Looks like vanilla disables threads. Will try that next, but 6.8.beta2 tested fine with the ecls above.
Is the doctest sage/libs/ecl.pyx failing too? I think it should fail with threads enabled.
Yes it fails because with threads there is
ECL_OPT_THREAD_INTERRUPT_SIGNAL = 36
where the expected is
ECL_OPT_THREAD_INTERRUPT_SIGNAL = 0
Well, building ecls without threads did not fix things. The sage/libs/ecl.pyx doctest does now pass. Additionally, I had to manually rebuild maxima after removing threads from ecls in order to get the sage-docs to build. It was clear from the error message that it was either ecls or maxima, and since ecls had just been rebuilt I rebuilt maxima. The problem was with accessing a nonexistent or corrupt memory address. But I still have the AssertionError. And of course sage was rebuilt.
Hum, run out of things to test. Was ecls recently rebuilt?
Aside from today, it looks as though the last time ecls was built was with issue https://github.com/cschwan/sage-on-gentoo/issues/356, about a month and a half ago. At that time, for whatever reason, boehm-gc was also rebuilt. But that's when sage-clib-9999 was around.
This may not be the issue (probably isn't), but I'm unable to build ecls without some linking with threads:
eix -I ecls
[I] dev-lisp/ecls
Available versions: 9.12.3 ~10.4.1 ~11.1.1-r1 ~12.2.1 ~12.7.1 ~12.12.1 ~12.12.1-r4^m[1] ~12.12.1-r5(0/12.12.1)^m (~)12.12.1-r5(0/12.12.1)^m[2] ~13.5.1(0/13.5.1) 13.5.1-r1(0/13.5.1)^t ~15.3.7(0/15.3.7)^t {X debug doc emacs gengc +libatomic precisegc sse (+)threads +unicode CPU_FLAGS_X86="sse"}
Installed versions: 13.5.1-r1^t(08:31:32 PM 07/23/2015)(unicode -X -debug -emacs -gengc -precisegc -threads CPU_FLAGS_X86="sse")
Homepage: http://ecls.sourceforge.net/
Description: ECL is an embeddable Common Lisp implementation
[1] "lisp" /var/lib/layman/lisp
[2] "local-overlay" /usr/local/portage
and
ldd -r /usr/lib64/libecl.so
linux-vdso.so.1 (0x00007fff98ad3000)
libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007f8751561000)
libgc.so.1 => /usr/lib64/libgc.so.1 (0x00007f87512fa000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f87510b7000)
libm.so.6 => /lib64/libm.so.6 (0x00007f8750db7000)
libc.so.6 => /lib64/libc.so.6 (0x00007f8750a1c000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f8750800000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8751cd5000)
This appears to be because of boehm-gc, which even with USE=-threads is built as
eix -I boehm-gc
[I] dev-libs/boehm-gc
Available versions: 6.8 7.1-r1 ~7.1-r99[1] 7.2d 7.2d-r1 7.2e ~7.4.0 7.4.2 {cxx nocxx static-libs threads}
Installed versions: 7.4.2(11:04:01 AM 07/28/2015)(cxx threads -static-libs)
Homepage: http://www.hboehm.info/gc/
Description: The Boehm-Demers-Weiser conservative garbage collector
[1] "local-overlay" /usr/local/portage
Yes, threads is probably not to blame. Possibly something in maxima. That call to ecl is probably to load maxima.
Which version of maxima? I know it should be pinned, but we never know.
[I] sci-mathematics/maxima
Available versions: 5.18.1 ~5.20.1-r99[2] 5.34.1 (~)5.35.1-r2[1] ~5.36.1 {X clisp clozurecl cmucl ecl ecls emacs gcl latex nls openmcl sbcl tk unicode xemacs LINGUAS="es pt pt_BR"}
Installed versions: 5.35.1-r2[1](06:24:59 PM 07/23/2015)(X clisp ecls emacs nls sbcl tk unicode -clozurecl -cmucl -gcl -latex -xemacs LINGUAS="-es -pt -pt_BR")
Homepage: http://maxima.sourceforge.net/
Description: Free computer algebra environment based on Macsyma
[1] "sage-on-gentoo" /var/lib/layman/sage-on-gentoo
[2] "local-overlay" /usr/local/portage
In sage/libs/ecl.pyx I commented out
assert sage_action[SIGCHLD].sa_handler == NULL # Sage does not set SIGCHLD handler
rebuilt sage and doctested sage/plot/plot.py, and the following line
assert sig_test.sa_handler == NULL # And ECL bootup did not set one
produced an AssertionError. I additionally commented out that line, rebuilt sage and doctested plot.py, and all tests passed?
What about commenting out just the line in plot.py?
No AssertionError in doctesting plot.py with these lines
Excluded points can also be given by an equation::
sage: g(x) = x^2-2*x-2
sage: plot(1/g(x), (x, -3, 4), exclude = g(x) == 0, ymin = -5, ymax = 5) # long time
Graphics object consisting of 3 graphics primitives
removed. It appears the exclude = g(x) == 0 is the source of the issue, which calls maxima. However, it is only an issue from the doctesting framework. As mentioned above, the commands from the sage prompt work as expected.
Only from the doctesting framework. Hum....
If the Traceback of the failure is to be believed, it seems that init_ecl() from ecl.pyx is called. And if init_ecl() is called, I don't see how a SIGCHLD handler is installed - unless
ecl_set_option(ECL_OPT_TRAP_SIGCHLD, 0);
is not doing what it is supposed to do. I must be missing something. Also, at http://trac.sagemath.org/ticket/14636#comment:16 there is mention of doctesting messing with SIGCHLD.
And it is supposed to have been handled in http://trac.sagemath.org/ticket/15441, but this is definitely the kind of issue you are having, I think. As always in those cases, the question is: why just you?
If I doctest the following (items taken from plot.py):
sage: from pylab import *
sage: t = arange(0.0, 2.0, 0.01)
sage: s = sin(2*pi*t)
sage: P = plot(t, s, linewidth=1.0)
sage: xl = xlabel('time (s)')
sage: yl = ylabel('voltage (mV)')
sage: t = title('About as simple as it gets, folks')
sage: grid(True)
sage: savefig(os.path.join(SAGE_TMP, 'sage.png'))
sage: imshow([[(0,0,0)]])
<matplotlib.image.AxesImage object at ...>
sage: savefig(os.path.join(SAGE_TMP, 'foo.png'))
sage: reset()
sage: g(x) = x^2-2*x-2
sage: plot(1/g(x), (x, -3, 4), exclude = g(x) == 0, ymin = -5, ymax = 5) # long time
Graphics object consisting of 3 graphics primitives
I get the AssertionError. But doctesting just
sage: g(x) = x^2-2*x-2
sage: plot(1/g(x), (x, -3, 4), exclude = g(x) == 0, ymin = -5, ymax = 5) # long time
Graphics object consisting of 3 graphics primitives
gives no error. I suspect for some reason the reset() is not functioning properly here?
OK, where is reset coming from?
reset is defined in multiple Python modules; I am not sure where that one is supposed to come from.
I think it's the reset defined in sage/misc/reset.pyx.
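One way to settle which reset is in play is to ask the function object itself. The sketch below is only an illustration and is not taken from Sage or PyQt4: the module names first_pkg and second_pkg and the star_import helper are hypothetical stand-ins. It shows that a later star-import silently replaces an earlier name, and that __module__ reveals where the surviving definition lives.

```python
import types

def make_module(name):
    """Build a throwaway module that defines its own reset()."""
    mod = types.ModuleType(name)
    exec("def reset():\n    return 'reset from ' + __name__", mod.__dict__)
    return mod

def star_import(namespace, mod):
    """Mimic 'from mod import *': copy every public name into namespace."""
    for key, value in vars(mod).items():
        if not key.startswith("_"):
            namespace[key] = value

# Hypothetical modules standing in for two packages that both export reset.
namespace = {}
star_import(namespace, make_module("first_pkg"))
star_import(namespace, make_module("second_pkg"))

# The later import wins; __module__ tells us which definition we ended up with.
print(namespace["reset"].__module__)  # → second_pkg
```

In a doctest block like the one above, printing reset.__module__ right before the reset() call would be the analogous check, assuming the object exposes that attribute.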
After considerable debugging, the issue is matplotlib-related. At least for 1.4.2, and I suspect for 1.4.3. My matplotlib was built with USE="-gtk tk cairo qt4 wxwidgets" which resulted in the AssertionError. I don't know which useflag is the culprit, but building matplotlib with USE="-gtk -tk -cairo -qt4 -wxwidgets" allowed plot.py to be tested successfully. Building matplotlib with USE="-gtk -tk cairo qt4 wxwidgets" was not sufficient to remove the AssertionError.
I should note that matplotlib-1.3.1 had been built with USE="-gtk tk cairo qt4 wxwidgets" and there was no problem when it was in use.
My suspect of choice is qt4. I think I have seen bad things happen with it.
Yep. This works: USE="-gtk tk cairo -qt4 wxwidgets". I have had the -gtk for some time.
That may be something to report in sage itself. I should check what happens in a vanilla build. There may be something Gentoo does to the qt4 binding that is strange.
qt4 bindings require PyQt4; it probably won't ever work in vanilla unless you install PyQt4 in the vanilla sage's python. There is a reset definition in PyQt4 which may be what is interfering.
I can replicate this in Prefix with just the qt4 useflag. To find the problem on Gentoo I patched ecl.pyx to bypass the assert statements but nevertheless report when SIGCHLD had been set. By inserting a from sage.libs.ecl import * at various places in my above doctest example, I discovered that pylab.plot was messing with SIGCHLD, at least from within the doctesting environment when matplotlib was built with qt4.
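A similar checkpoint can be taken without patching ecl.pyx. The following is a rough sketch, not the patch used above: it queries the C-level SIGCHLD disposition through sigaction via ctypes, since Python's signal.getsignal only tracks handlers known to the interpreter. It assumes Python 3 and a libc where sa_handler is the first member of struct sigaction (as with glibc on x86-64); the helper names c_level_handler and report are made up for illustration.

```python
import ctypes
import signal

# Assumption: sa_handler is the first member of struct sigaction, so the
# first pointer-sized word of the returned struct is the handler address.
_libc = ctypes.CDLL(None, use_errno=True)
_libc.sigaction.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_void_p]
_libc.sigaction.restype = ctypes.c_int

def c_level_handler(signum):
    """Return the C-level handler address for signum (0 means SIG_DFL)."""
    # 256 bytes is deliberately larger than struct sigaction needs.
    old = ctypes.create_string_buffer(256)
    # act=NULL: query only, the current disposition is left untouched.
    if _libc.sigaction(int(signum), None, old) != 0:
        raise OSError(ctypes.get_errno(), "sigaction failed")
    return ctypes.c_void_p.from_buffer(old).value or 0

def report(label):
    """Hypothetical helper: print the SIGCHLD handler at a checkpoint."""
    addr = c_level_handler(signal.SIGCHLD)
    print(f"{label}: SIGCHLD handler = {addr:#x}")
    return addr

report("at startup")
```

Calling report() before and after a suspect statement, for example around the pylab plot call in the doctest block, would show at which step the handler address stops being zero.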
Something probably needs to be set before calling qt; investigating.
FWIW, if I insert from sage.libs.ecl import init_ecl in the section in forker.py labeled # Do this once before forking. then I don't get the AssertionError in doctesting plot.py with matplotlib built having USE="qt4". This is all very strange.
But probably enough to make noise upstream. I will patch 9999 onwards, even if we don't get traction upstream. Patching forker.py seems a good idea; it's already full of stuff of that kind, it seems.
Hum, before I go the forker.py patch route, does it make the problem go away when you do the manual test in https://github.com/cschwan/sage-on-gentoo/issues/363#issuecomment-133755247 ?
Yes it does. I'm running the testsuite to see if there is any collateral damage.
Doctests look good here in prefix. I have one difference compared to 6.8.rc1 - coding_in_other.rst now fails, but that is an ordering of output thing. Results on Gentoo will take a bit longer but I don't foresee a problem - there is no AssertionError in doctesting plot.py. And the qt4 useflag is turned on
eix -I matplotlib
[I] dev-python/matplotlib
Available versions: ~1.3.1^m[1] [m]~1.3.1^m[2] 1.4.2^m ~1.4.3^m **9999^m {cairo doc examples excel fltk gtk gtk3 latex pyside qt4 qt5 test tk wxwidgets PYTHON_TARGETS="python2_7 python3_3 python3_4"}
Installed versions: 1.4.2^m(05:22:32 PM 08/25/2015)(cairo qt4 tk wxwidgets -doc -examples -excel -fltk -gtk -gtk3 -latex -pyside -test PYTHON_TARGETS="python2_7 python3_3 -python3_4")
I used to have the failure in coding_in_other.rst and another one that was doing the same thing. Some output of singular. They seem to have gone in my last rebuild; maybe the new flint helped me.
Gentoo results look good.
With qt4 being cleaned from the tree, I am thinking of removing the patch (sage-7.4-qt4-conflict.patch). I am wondering if other interfaces also trigger the behavior.
This is a new one for me on Gentoo. Things are fine on my Prefix.