Closed strogdon closed 3 years ago
The only thing I know for sure that calls java is our friend jmol. If it is new, then some new plots must be involved. I'll note that vanilla is seriously working on dumping jmol, I wouldn't even be too surprised if we got it optional in 9.3.
I know you are officially away so no need to answer. I thought of jmol and I knew it could possibly be optional, but if I recall, it's needed for the pdf docs?
Yes, I gave myself until tomorrow to answer :) Anyway, most 3D plots in the html and pdf docs are done through jmol, which is the default plotter. threejs has the problem that you cannot do non-interactive saving to a file. But jsmol may just bring all the functionality of jmol without java, although since it is javascript I am not sure how it works without at least a web browser in the background.
This is preliminary, I need to run the full doctests again. There is a JRE issue somewhere in sage/categories with tp -2. I have 4 threads. There are no issues with tp -1. This is with openjdk-bin:8. I switched to icedtea-bin:8 and there are no JRE issues with tp -5.
Corrected: above it should have been icedtea-bin:8. Doctests completed using tp -5 without a JRE issue when selecting java-vm to be icedtea-bin-8. Not sure what has changed. I wonder what other distros use?
I believe there is a slow move to openjdk but debian may still be shipping icedtea by default since they tend to stick to their own produced binaries.
icedtea-bin is not the complete solution with vanilla sage, although there appear to be fewer failures than with openjdk-bin. From the failure log:
Internal exceptions (2 events):
Event: 0.072 Thread 0x00007fa65000a000 Exception <a 'java/lang/NoSuchMethodError': Method sun.misc.Unsafe.defineClass(Ljava/lang/String;[BII)Ljava/lang/Class; name or signature does not match> (0x00000000dda07cc8) thrown at [/var/tmp/portage/dev-java/icedtea-3.16.0/work/icedtea-3.16.0/openjdk/
Event: 0.072 Thread 0x00007fa65000a000 Exception <a 'java/lang/NoSuchMethodError': Method sun.misc.Unsafe.prefetchRead(Ljava/lang/Object;J)V name or signature does not match> (0x00000000dda07fb0) thrown at [/var/tmp/portage/dev-java/icedtea-3.16.0/work/icedtea-3.16.0/openjdk/hotspot/src/share/
Not sure what the following means
vm_info: OpenJDK 64-Bit Server VM (25.252-b09) for linux-amd64 JRE (1.8.0_252-b09), built on May 10 2020 20:29:23 by "portage" with gcc 9.2.0
Feels like a language version mismatch. That kind of error is usually thrown when the arguments of a method are not the expected number or the expected type. Because java, like C++, is object oriented, you can have method overloading: the same name can refer to slightly different methods depending on the arguments and the return type. The way to figure out which method is used is to compare the so-called "signature" of the available methods with the one you are trying to call.
So, because java is based on a runtime, I think something is missing compared to the version the program was written for.
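To make the "signature" part concrete: the strings in parentheses in the log, such as (Ljava/lang/String;[BII)Ljava/lang/Class;, are JVM method descriptors. Below is a minimal Python sketch that turns one into readable types. The helper names decode_descriptor and _read_type are made up for this illustration and are not part of sage or any java tool.

```python
# Decode a JVM method descriptor, e.g. "(Ljava/lang/String;[BII)Ljava/lang/Class;".
# Helper names are hypothetical, for illustration only.

PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}


def _read_type(desc, pos):
    """Return (readable type, next position) for the type starting at pos."""
    dims = 0
    while desc[pos] == "[":          # each '[' adds one array dimension
        dims += 1
        pos += 1
    if desc[pos] == "L":             # object type: L<class name>;
        end = desc.index(";", pos)
        name = desc[pos + 1:end].replace("/", ".")
        pos = end + 1
    else:                            # single-letter primitive type
        name = PRIMITIVES[desc[pos]]
        pos += 1
    return name + "[]" * dims, pos


def decode_descriptor(desc):
    """Split a method descriptor into (parameter types, return type)."""
    if not desc.startswith("("):
        raise ValueError("method descriptor must start with '('")
    close = desc.index(")")
    params = []
    pos = 1
    while pos < close:
        type_name, pos = _read_type(desc, pos)
        params.append(type_name)
    return_type, _ = _read_type(desc, close + 1)
    return params, return_type
```

Applied to the first error, this reads as defineClass(java.lang.String, byte[], int, int) returning java.lang.Class, which is the exact overload the calling code expected and the runtime did not find.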
Installed openjdk instead of openjdk-bin and I see no JRE failures when testing vanilla. I'll now try with s-o-g. It was a chore to build openjdk. There was a filesize mismatch in downloading openjdk-8.272_p10.tar.bz2.
That was brave to build openjdk. The fact that the errors are linked to minor version differences is also quite worrying.
It's taken a while. Testing s-o-g with openjdk seems good. No JRE issues. The openjdk-bin above seems to have been built with gcc-9.2? That plus your comments prompted the build of openjdk. Since openjdk seems to work here for now (cross fingers) I will not try to build icedtea since it is not stable.
When s-o-g 9.3.beta6 is available I will run doctests with system built openjdk. But on vanilla 9.3.beta6 I get
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 7022.5 seconds
cpu time: 24431.2 seconds
cumulative wall time: 34315.8 seconds
which is encouraging.
I cannot currently build sage from sage-on-gentoo because of a sandbox violation from jmol.
F: mkdir
S: deny
P: /var/lib/portage/home/.java
A: /var/lib/portage/home/.java
R: /var/lib/portage/home/.java
C: /opt/icedtea-bin-3.16.0/bin/java -Xmx512m -Djava.awt.headless=true -jar /usr/share/sage-jmol-bin/lib/JmolData.jar -iox -g 500x500 -J set defaultdirectory "/dev/shm/portage/sci-mathematics/sage-9999/homedir/.sage/temp/localhost/6946/dir_jizvbnht/scene.spt.zip"
script SCRIPT
-j write PNG '/dev/shm/portage/sci-mathematics/sage-9999/homedir/.sage/temp/localhost/6946/dir_jizvbnht/preview.png'
times as many 3D plots probably. This is probably a bug in the java handling mechanism or something that needs to be set.
Which branch are you using?
vbraun, so I am a bit ahead, but I suspect the issue is outside sage-on-gentoo. It has actually been going on for a while, but this is the first time I have been affected as root. So I suspect this is a java configuration problem. I have seen similar issues in bugzilla; https://bugs.gentoo.org/762619 has a strikingly similar sandbox problem.
I tried to build the vbraun branch and dev-python/cypari2-2.1.2 would not build. It had problems locating a bunch of .pxd files, such as
from __future__ import absolute_import, division, print_function
from cysignals.signals cimport sig_on, sig_off, sig_block, sig_unblock, sig_error
^
------------------------------------------------------------
cypari2/closure.pyx:36:0: 'cysignals/signals.pxd' not found
Did you do anything special? Perhaps I should just wait.
I'll need a more complete log. And no, I didn't need to do anything special.
Are you using just python3.8? The failure here is with python3.9. The build using python3.8, I suppose for building in parallel, did not complete.
I have built with python 3.7, 3.8 and 3.9. It shouldn't build in parallel.
I tried separately with each python (3.8 and 3.9). It builds with python3.8 but not with python3.9. The parallel build terminated because of the python3.9 failure and the build with python3.8 was not complete. I'll send the build log which is not very long.
I think cysignals was not built for python3.9!
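A quick way to confirm this kind of suspicion is to ask each interpreter directly whether it can import the module. The following is only a sketch: the function name module_available is made up here, and the interpreter names in the usage loop are assumptions about what is on PATH. Note that an importable module does not by itself prove the .pxd files are installed, so this is a first check only.

```python
import subprocess


def module_available(interpreter, module):
    """Return True if `interpreter -c "import module"` exits with status 0."""
    try:
        result = subprocess.run(
            [interpreter, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The interpreter itself is not installed or not on PATH.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Interpreter names are assumptions; adjust to the targets in use.
    for exe in ("python3.8", "python3.9"):
        status = "found" if module_available(exe, "cysignals") else "missing"
        print(f"{exe}: cysignals {status}")
```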
That would explain it. But why did it not build it as a dependency? Looking at the ebuild.
Hum, dependencies in the cypari2 ebuilds are not correct and I am not sure why. I'll get it fixed shortly.
Dependency problems in cypari2 fixed. Inspecting others.
OK, embarrassing missing python dependencies have been fixed.
I've been able to build the vbraun branch without issue.
I should mention that the html-docs and pdf-docs were also built.
Whatever I try, I still get sandbox violations around here. I wonder if I got cruft somewhere causing this.
I'm not using a ramdisk. I don't have enough ram. I wouldn't think that would be an issue?
That would be a new one. I am going to try openjdk to see if it behaves better.
Off topic, not JRE related:
I have one doctest failure with 9.3.beta6 (vbraun branch) that was not in 9.3.beta5, which seems odd
sage -t --long --random-seed=0 usr/lib/python3.8/site-packages/sage/libs/ecl.pyx # Timed out (and interrupt failed)
When run individually, it runs forever and just fails.
Definitely odd.
Using --verbose may help figure out where it stops.
I think the location of the failure varies. When doctesting I see
sage -t --long --random-seed=0 usr/lib/python3.8/site-packages/sage/libs/ecl.pyx
Timed out (and interrupt failed)
**********************************************************************
Tests run before process (pid=26592) timed out:
sage: from sage.libs.ecl import test_sigint_before_ecl_sig_on ## line 121 ##
sage: test_sigint_before_ecl_sig_on() ## line 122 ##
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 126 ##
0
sage: from sage.libs.ecl import test_ecl_options ## line 143 ##
sage: test_ecl_options() ## line 144 ##
ECL_OPT_INCREMENTAL_GC = 0
ECL_OPT_TRAP_SIGSEGV = 1
ECL_OPT_TRAP_SIGFPE = 1
ECL_OPT_TRAP_SIGINT = 1
ECL_OPT_TRAP_SIGILL = 1
ECL_OPT_TRAP_SIGBUS = 1
ECL_OPT_TRAP_SIGPIPE = 1
ECL_OPT_TRAP_INTERRUPT_SIGNAL = 1
ECL_OPT_SIGNAL_HANDLING_THREAD = 0
ECL_OPT_SIGNAL_QUEUE_SIZE = 16
ECL_OPT_BOOTED = 1
ECL_OPT_BIND_STACK_SIZE = 8192
ECL_OPT_BIND_STACK_SAFETY_AREA = 1024
ECL_OPT_FRAME_STACK_SIZE = 2048
ECL_OPT_FRAME_STACK_SAFETY_AREA = 128
ECL_OPT_LISP_STACK_SIZE = 32768
ECL_OPT_LISP_STACK_SAFETY_AREA = 128
ECL_OPT_C_STACK_SIZE = 0
ECL_OPT_C_STACK_SAFETY_AREA = 32768
ECL_OPT_HEAP_SIZE = 4294967296
ECL_OPT_HEAP_SAFETY_AREA = 1048576
ECL_OPT_THREAD_INTERRUPT_SIGNAL = 36
ECL_OPT_SET_GMP_MEMORY_FUNCTIONS = 0
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 168 ##
0
sage: from sage.libs.ecl import * ## line 226 ##
sage: init_ecl() ## line 231 ##
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 235 ##
0
sage: from sage.libs.ecl import * ## line 321 ##
sage: from cysignals.tests import interrupt_after_delay ## line 322 ##
sage: ecl_eval("(setf i 0)") ## line 323 ##
<ECL: 0>
sage: inf_loop = ecl_eval("(defun infinite() (loop (incf i)))") ## line 325 ##
sage: interrupt_after_delay(1000) ## line 326 ##
sage: inf_loop() ## line 327 ##
**********************************************************************
I don't have the result when doctesting individually, but it was at a different location.
It may be an obscure failure. I just tested again and the doctest passed
sage -t --long --warn-long 132.3 --random-seed=0 usr/lib/python3.8/site-packages/sage/libs/ecl.pyx
[204 tests, 1.74 s]
----------------------------------------------------------------------
All tests passed!
----------------------------------------------------------------------
Total time for all tests: 1.8 seconds
cpu time: 2.1 seconds
cumulative wall time: 1.7 seconds
It is strange.
Running the ecl.pyx doctest a number of times it eventually hangs. When that happens I have several sage processes
terry 7101 1 0 21:56 pts/25 00:00:00 /storage/strogdon/gentoo-rap/usr/bin/python3.8 /storage/strogdon/gentoo-rap/usr/lib/python-exec/python3.8/sage-cleaner
terry 7282 6719 1 21:57 pts/25 00:00:02 /storage/strogdon/gentoo-rap/usr/bin/python3.8 /storage/strogdon/gentoo-rap/usr/lib/python-exec/python3.8/sage-runtests --long --warn-long 132.3 --random-seed=0 usr/lib/python3.8/site-packages/sage/libs/ecl.pyx
terry 7284 7282 0 21:57 pts/25 00:00:00 [sage-cleaner] <defunct>
terry 7293 7282 1 21:57 pts/25 00:00:01 /storage/strogdon/gentoo-rap/usr/bin/python3.8 /storage/strogdon/gentoo-rap/usr/lib/python-exec/python3.8/sage-runtests --long --warn-long 132.3 --random-seed=0 usr/lib/python3.8/site-packages/sage/libs/ecl.pyx
Perhaps sage-cleaner is not doing its job.
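Since the hang only shows up after a number of runs, it may help to automate the repetition. Here is a minimal sketch that runs a command several times and counts timeouts. The function name count_hangs is hypothetical, and the command, run count and timeout in the usage comment are placeholders rather than recommended values.

```python
import subprocess


def count_hangs(cmd, runs, timeout):
    """Run cmd `runs` times; count runs that time out or exit non-zero."""
    hangs = 0
    failures = 0
    for _ in range(runs):
        try:
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child on timeout; any
            # grandchildren it spawned may be left behind.
            hangs += 1
            continue
        if proc.returncode != 0:
            failures += 1
    return {"runs": runs, "hangs": hangs, "failures": failures}


# Possible use (placeholder values, path taken from the report above):
# count_hangs(["sage", "-t", "--long",
#              "usr/lib/python3.8/site-packages/sage/libs/ecl.pyx"],
#             runs=20, timeout=300)
```

Only the direct child is killed on timeout, so checking the process list after a counted hang is still worthwhile.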
What about ecl processes? There could be a lisp instance hanging for whatever reason.
It takes a number of tries before it hangs and I don't see any ecl processes. Only that the doctest appears twice as above. And when it finally exits it is as above.
This has all been with the current vbraun branch. Is master in sync with 9.3.beta6?
Yes, master is in sync with 9.3.beta6, with the exception of the cypari2 dependency I think. Which shouldn't have any impact.
For the record, I have now enabled py3.9 in the vbraun branch and I am able to build the documentation as a user again. Hopefully that will stick when using emerge rather than ebuild. But that definitely stinks.
I just noticed that about an hour ago. I'm now trying to build sage using py3.9 on the master branch in prefix. The html-docs are now building.
Still failing as root :(
With building openjdk?
No just building sage, still those sandbox violations.
* ACCESS DENIED: mkdir: /var/lib/portage/home/.java/fonts
Not very clear was I. Was sage built with openjdk or openjdk-bin?
This time it was openjdk-bin.
Sage and all docs build here with py3.9 on Prefix and on Gentoo as root. I'm using openjdk.
If fonts need to be generated during the build of Sage where will they be located? I don't see any generated fonts here. I guess they could have been deleted.
In the normal process of things, HOME is set to ${PORTAGE_BUILDDIR}/homedir during a build. So it all disappears once merged. The issue I have is that java is using portage's normal home instead of following the HOME variable.
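To see the two candidate locations side by side, the sketch below compares HOME from the environment with the home directory recorded in the passwd database. It rests on an assumption: that the JVM on Linux derives its user.home property from the passwd entry rather than from HOME, which would match the /var/lib/portage/home path in the sandbox log. The function names are made up for this illustration. -Dname=value is the standard way to set a java system property, but whether pinning user.home that way cures the jmol violation is untested here.

```python
import os
import pwd


def home_from_env():
    """Home directory as seen through the HOME environment variable."""
    return os.environ.get("HOME")


def home_from_passwd():
    """Home directory recorded in the passwd database for the current uid."""
    return pwd.getpwuid(os.getuid()).pw_dir


def java_command(java, args, home):
    """Build a java command line that sets user.home explicitly."""
    return [java, f"-Duser.home={home}", *args]


if __name__ == "__main__":
    print("HOME from environment:", home_from_env())
    try:
        print("home from passwd:     ", home_from_passwd())
    except KeyError:
        # The current uid has no passwd entry.
        print("home from passwd:      (no passwd entry)")
```

If the two printed values differ inside the build environment, that would be consistent with java writing to portage's real home while everything else follows HOME.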
Do you have systemd on your gentoo system? And acct-group/portage and acct-user/portage?
I have lots of Java Runtime errors when testing sog 9.3.beta5. They were not present, as far as I know, with 9.3.beta4. I also have this with vanilla and it's been present there for some time. I hadn't noticed it because there was no apparent failure. In any event this is fairly new. The head of hs_err_pidxxxxx.log