sagemath / sage

Main repository of SageMath
https://www.sagemath.org

Singular omalloc requires 8-byte alignment on SPARC #14429

Closed jdemeyer closed 11 years ago

jdemeyer commented 11 years ago

On Solaris SPARC:

buildbot@mark:~/build/sage/mark-1/mark_full/build/sage$ ./sage -t --gdb --long devel/sage/sage/crypto/mq/sr.py
exec gdb -x "$SAGE_LOCAL/bin/sage-gdb-commands" --args python "$SAGE_LOCAL/bin/sage-runtests" --serial --long --timeout=0 --stats_path=/home/buildbot/.sage/timings2.json devel/sage/sage/crypto/mq/sr.py
GNU gdb (GDB) 7.5.1
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.10".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/buildbot/build/sage/mark-1/mark_full/build/sage-5.9.beta4/local/bin/python...done.
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
Running doctests with ID 2013-04-09-04-41-09-5eaf23f2.
Doctesting 1 file.
sage -t --long devel/sage/sage/crypto/mq/sr.py

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1 (LWP 1)]
0xf9f743ac in add_to_basis_ideal_quotient (h=0x4f170dc, c=0x4f16b38, ip=0x0) at tgb.cc:1518
1518      c->weighted_lengths[i] = pQuality (h, c, c->lengths[i]);
(gdb) list
1513      }
1514      else
1515        pNorm (h);
1516      pNormalize (h);
1517
1518      c->weighted_lengths[i] = pQuality (h, c, c->lengths[i]);
1519      c->gcd_of_terms[i] = got;
1520    #ifdef HAVE_BOOST
1521      c->states.push_back (dynamic_bitset <> (i));
1522
(gdb) print c->weighted_lengths
$5 = (wlen_type *) 0x4f4659c

The problem is that wlen_type is a 64-bit integer which is not correctly aligned to 8 bytes.
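
To make the misalignment concrete, here is a minimal Python sketch of the address arithmetic. The helper names `is_aligned` and `align_up` are illustrative only and are not part of Singular or omalloc; the address is the one gdb printed above. It assumes, as the ticket title states, that 64-bit accesses on SPARC need 8-byte alignment.

```python
def is_aligned(address, alignment):
    """Return True if address is a multiple of alignment (a power of two)."""
    return (address & (alignment - 1)) == 0


def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


# Value of c->weighted_lengths as printed by gdb in this report.
WEIGHTED_LENGTHS = 0x4F4659C

if __name__ == "__main__":
    print("4-byte aligned:", is_aligned(WEIGHTED_LENGTHS, 4))
    print("8-byte aligned:", is_aligned(WEIGHTED_LENGTHS, 8))
    print("next 8-byte boundary:", hex(align_up(WEIGHTED_LENGTHS, 8)))
```

Since each element is 8 bytes wide, every `c->weighted_lengths[i]` sits at the base address plus a multiple of 8, so all elements share the same misalignment, not only the first one.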

spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/singular-3-1-5.p7.spkg (diff)

upstream: http://www.singular.uni-kl.de:8002/trac/ticket/483

Upstream: Reported upstream. No feedback yet.

Component: porting: Solaris

Keywords: SPARC alignment SIGBUS omalloc

Author: Jeroen Demeyer

Reviewer: Volker Braun

Merged: sage-5.9.beta5

Issue created by migration from https://trac.sagemath.org/ticket/14429

jdemeyer commented 11 years ago

Author: Jeroen Demeyer

jdemeyer commented 11 years ago

Upstream: Reported upstream. No feedback yet.

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -39,3 +39,7 @@
 $5 = (wlen_type *) 0x4f4659c
 
 The problem is that wlen_type is a 64-bit integer which is not correctly aligned to 8 bytes.
+
+spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/singular-3-1-5.p7.spkg
+
+upstream: http://www.singular.uni-kl.de:8002/trac/ticket/483

jdemeyer commented 11 years ago

Changed keywords from SPARC alignment SIGBUS to SPARC alignment SIGBUS omalloc

jdemeyer commented 11 years ago

Description changed:

--- 
+++ 
@@ -40,6 +40,6 @@

 The problem is that wlen_type is a 64-bit integer which is not correctly aligned to 8 bytes.
 
-spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/singular-3-1-5.p7.spkg
+spkg: http://boxen.math.washington.edu/home/jdemeyer/spkg/singular-3-1-5.p7.spkg (diff)
 
 upstream: http://www.singular.uni-kl.de:8002/trac/ticket/483

jdemeyer commented 11 years ago
comment:3

Attachment: singular-3-1-5.p7.diff.gz

vbraun commented 11 years ago
comment:5

Sounds good to me.

vbraun commented 11 years ago

Reviewer: Volker Braun

jpflori commented 11 years ago
comment:7

FYI, from the errors I had reported at:

sage -t devel/sage/sage/interfaces/expect.py
**********************************************************************
File "devel/sage/sage/interfaces/expect.py", line 1089, in sage.interfaces.expect.Expect._crash_msg
Failed example:
    singular('2+3')
Exception raised:
    Traceback (most recent call last):
      File "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 460, in _run
        self.execute(example, compiled, test.globs)
      File "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/python2.7/site-packages/sage/doctest/forker.py", line 819, in execute
        exec compiled in globs
      File "<doctest sage.interfaces.expect.Expect._crash_msg[4]>", line 1, in <module>
        singular('2+3')
      File "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/python2.7/site-packages/sage/interfaces/singular.py", line 724, in __call__
        return SingularElement(self, type, x, False)
      File "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/python2.7/site-packages/sage/interfaces/singular.py", line 1184, in __init__
        self._name = parent._create( value, type)
      File "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/python2.7/site-packages/sage/interfaces/singular.py", line 685, in _create
        self.set(type, name, value)
      File "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/python2.7/site-packages/sage/interfaces/singular.py", line 628, in set
        raise TypeError, msg
    TypeError: [Errno 22] Invalid argument
    Error evaluating def sage51=2+3; in Singular
**********************************************************************
File "devel/sage/sage/interfaces/expect.py", line 1114, in sage.interfaces.expect.Expect._synchronize
Failed example:
    R.<x> = QQ[]; f = x^3 + x + 1;  g = x^3 - x - 1; r = f.resultant(g); gap(ZZ); singular(R)
Expected:
    Integers
    //   characteristic : 0
    //   number of vars : 1
    //        block   1 : ordering lp
    //                  : names    x
    //        block   2 : ordering C
Got:
    Integers
    Singular crashed -- automatically restarting.
    //   characteristic : 0
    //   number of vars : 1
    //        block   1 : ordering lp
    //                  : names    x
    //        block   2 : ordering C
**********************************************************************
2 items had failures:
   1 of   6 in sage.interfaces.expect.Expect._crash_msg
   1 of   3 in sage.interfaces.expect.Expect._synchronize
    [81 tests, 2 failures, 26.3 s]
----------------------------------------------------------------------
sage -t devel/sage/sage/interfaces/expect.py  # 2 doctests failed
----------------------------------------------------------------------
Total time for all tests: 27.1 seconds
    cpu time: 3.7 seconds
    cumulative wall time: 26.3 seconds

Note that running singular('2+3') in sage works fine and so does the second problematic doctest. I still get the numerical noise problem from sage/rings/polynomial/polynomial_element.pyx.

Oh, and I still had trouble building Singular because of C++ headers the system has in /usr/local/include, which seem incompatible with the ones Sage's GCC ships; this path is added to CPPFLAGS by Singular in src/Singular/Makefile[.in] if the gcc version is greater than 4. You could argue that it is the computer install itself which is ill-configured, but when removing the -I/usr/local/include from Singular/Makefile[.in], everything builds fine. I've opened a ticket upstream, but because Singular serves its HTTP server on an unusual port, I cannot access it at the moment.

jdemeyer commented 11 years ago
comment:8

The devel/sage/sage/interfaces/expect.py error is #14371.

jdemeyer commented 11 years ago
comment:9

Replying to @jpflori:

I still get the numerical noise problem from

  • sage/rings/polynomial/polynomial_element.pyx

Did you ever report this?

jpflori commented 11 years ago
comment:10

Nope, just on sage-devel, I'll open a Trac ticket.

jdemeyer commented 11 years ago
comment:11

Replying to @jpflori:

Oh, and I still had trouble building Singular because of C++ headers the system has in /usr/local/include

I guess the upstream ticket is http://www.singular.uni-kl.de:8002/trac/ticket/480, but that doesn't have enough information to debug the issue. A complete log file of the failed Singular build would be a good start.

jpflori commented 11 years ago
comment:12

Here you go:

But is there really any good reason to include /usr/local/include automagically? Is that standard practice?

jdemeyer commented 11 years ago
comment:13

Please do

g++ -save-temps -O2 -g  -fPIC   -fno-implicit-templates -I. -I.. -I/infres/post/flori/sage-5.9.beta3-infres2/local  -I/usr/xpg4/include -I/infres/post/flori/sage-5.9.beta3-infres2/local/include -I/infres/post/flori/sage-5.9.beta3-infres2/local/include  -I/usr/local/include  -DNDEBUG -DOM_NDEBUG -DSunOS_5 -DHAVE_CONFIG_H -c bigintmat.cc

from the directory containing bigintmat.cc (within a Sage shell) and send the file bigintmat.ii

jdemeyer commented 11 years ago
comment:14

Replying to @jpflori:

Is that standard practice?

I think it is. Why put stuff in /usr/local/include if you don't want it included?

jdemeyer commented 11 years ago
comment:15

Replying to @jpflori:

Part of it was posted here:

Yes, but that omitted the crucial information of which command caused those errors.

jpflori commented 11 years ago
comment:16

Replying to @jdemeyer:

Replying to @jpflori:

Is that standard practice?

I think it is. Why put stuff in /usr/local/include if you don't want it included?

I'd say you put stuff there intentionally (rather than just building a package and letting it install its headers by default into /usr/include) so that it gets included when you explicitly add "-I/usr/local/include". I don't know if any usual software installs there by default (it seems this inclusion was done to please gcc > 4, but does gcc > 4 really install stuff there by default? Maybe it was the Singular developer's own install that was put there...).

And it seems the problematic software on my system is Sun's gcc 3.4.3. I don't think that's Sun's default install; it rather looks like a hack from the sysadmin.

jpflori commented 11 years ago
comment:17

Replying to @jdemeyer:

Please do

g++ -save-temps -O2 -g  -fPIC   -fno-implicit-templates -I. -I.. -I/infres/post/flori/sage-5.9.beta3-infres2/local  -I/usr/xpg4/include -I/infres/post/flori/sage-5.9.beta3-infres2/local/include -I/infres/post/flori/sage-5.9.beta3-infres2/local/include  -I/usr/local/include  -DNDEBUG -DOM_NDEBUG -DSunOS_5 -DHAVE_CONFIG_H -c bigintmat.cc

from the directory containing bigintmat.cc (within a Sage shell) and send the file bigintmat.ii

Attached to this ticket.

jpflori commented 11 years ago

Attachment: bigintmat.ii.gz

jdemeyer commented 11 years ago
comment:18

Replying to @jpflori:

I don't know if any usual software installs there by default

Almost all software installs stuff in /usr/local/include by default.

jpflori commented 11 years ago
comment:19

True enough for user installed things, but it seems distribution packages do rather install in /usr directly.

jdemeyer commented 11 years ago
comment:20

Replying to @jpflori:

True enough for user installed things, but it seems distribution packages do rather install in /usr directly.

True again.

jdemeyer commented 11 years ago
comment:21

After looking at attachment: bigintmat.ii, I would say the problem is your system configuration, not a Sage or Singular bug.

jpflori commented 11 years ago
comment:22

On Unix at least, gcc will look in /usr/local/include anyway: http://gcc.gnu.org/onlinedocs/cpp/Search-Path.html or http://gcc.gnu.org/onlinedocs/gcc-4.6.3/cpp/Search-Path.html#Search-Path; so that's another point for not changing anything in Singular.

Note that without the -I/usr/local/include, the headers that get picked up are in

/infres/post/flori/sage-5.9.beta3-infres2/local/lib/gcc/sparc-sun-solaris2.10/4.6.3/include-fixed/sys/feature_tests.h

This looks like a very nice directory to get headers from for Sage's gcc... According to the above doc, adding an extra -I/usr/local/include should not change the search path, but it apparently does. And it seems that by default, the GCC that Sage built does not look there (I ran "gcc -E -v -" and it does not show /usr/local/include, only /usr/include after three directories under $SAGE_LOCAL).

gcc -E -v -
Reading specs from /usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/specs
Configured with: /sfw10/builds/build/sfw10-patch/usr/src/cmd/gcc/gcc-3.4.3/configure --prefix=/usr/sfw --with-as=/usr/ccs/bin/as --without-gnu-as --with-ld=/usr/ccs/bin/ld --without-gnu-ld --enable-languages=c,c++ --enable-shared
Thread model: posix
gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath)
 /usr/local/packages/gcc3/bin/../libexec/gcc/sparc-sun-solaris2.10/3.4.3/cc1 -E -quiet -v -iprefix /usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/ - -mcpu=v7
ignoring nonexistent directory "/usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/../../../../sparc-sun-solaris2.10/include"
ignoring nonexistent directory "/usr/sfw/lib/gcc/sparc-sun-solaris2.10/3.4.3/../../../../sparc-sun-solaris2.10/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/include
 /usr/local/include
 /usr/sfw/include
 /usr/sfw/lib/gcc/sparc-sun-solaris2.10/3.4.3/include
 /usr/include
End of search list.
gcc -E -v -
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/infres/post/flori/sage-5.9.beta3-infres2/local/libexec/gcc/sparc-sun-solaris2.10/4.6.3/lto-wrapper
Target: sparc-sun-solaris2.10
Configured with: ../src/configure --prefix=/infres/post/flori/sage-5.9.beta3-infres2/local --with-local-prefix=/infres/post/flori/sage-5.9.beta3-infres2/local --with-gmp=/infres/post/flori/sage-5.9.beta3-infres2/local --with-mpfr=/infres/post/flori/sage-5.9.beta3-infres2/local --with-mpc=/infres/post/flori/sage-5.9.beta3-infres2/local --with-system-zlib --disable-multilib --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld
Thread model: posix
gcc version 4.6.3 (GCC) 
COLLECT_GCC_OPTIONS='-E' '-v' '-mcpu=v9'
 /infres/post/flori/sage-5.9.beta3-infres2/local/libexec/gcc/sparc-sun-solaris2.10/4.6.3/cc1 -E -quiet -v -D__sparcv8 - -mcpu=v9
ignoring nonexistent directory "/infres/post/flori/sage-5.9.beta3-infres2/local/lib/gcc/sparc-sun-solaris2.10/4.6.3/../../../../sparc-sun-solaris2.10/include"
ignoring duplicate directory "/infres/post/flori/sage-5.9.beta3-infres2/local/include"
  as it is a non-system directory that duplicates a system directory
#include "..." search starts here:
#include <...> search starts here:
 /infres/post/flori/sage-5.9.beta3-infres2/local/lib/gcc/sparc-sun-solaris2.10/4.6.3/include
 /infres/post/flori/sage-5.9.beta3-infres2/local/include
 /infres/post/flori/sage-5.9.beta3-infres2/local/lib/gcc/sparc-sun-solaris2.10/4.6.3/include-fixed
 /usr/include
End of search list.
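
For comparing such transcripts, the search list can be pulled out mechanically. The following is a minimal Python sketch, not something used in this ticket: the function name `include_search_dirs` is hypothetical, and the sample text is copied from the system gcc 3.4.3 output above. It only parses text it is handed; capturing the output of a real `gcc -E -v -` run is left to the reader.

```python
def include_search_dirs(verbose_output):
    """Extract the <...> include directories from `gcc -E -v -` output."""
    dirs = []
    in_list = False
    for line in verbose_output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            in_list = True          # directories follow, one per line
        elif line.startswith("End of search list."):
            break                   # nothing of interest after this marker
        elif in_list:
            dirs.append(line.strip())
    return dirs


# Search list of the system gcc 3.4.3, copied from the transcript above.
SYSTEM_GCC_OUTPUT = """\
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/include
 /usr/local/include
 /usr/sfw/include
 /usr/sfw/lib/gcc/sparc-sun-solaris2.10/3.4.3/include
 /usr/include
End of search list.
"""

if __name__ == "__main__":
    for directory in include_search_dirs(SYSTEM_GCC_OUTPUT):
        print(directory)
```

Running the same function on the output of Sage's gcc 4.6.3 gives a list without /usr/local/include, which is the difference discussed in this comment.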

On an Ubuntu install, with the system-wide gcc, it indeed looks in /usr/local/include before the corresponding /usr/lib/gcc/x86_64-linux-gnu/4.6/include-fixed (but after /usr/lib/gcc/x86_64-linux-gnu/4.6/include). In fact, on the Solaris install, I guess the line corresponding to /usr/local/include is $SAGE_LOCAL/include when I run "gcc -E -v -", since that is how the local prefix was defined when Sage built GCC. In particular, none of these behaviors agree with the GCC online doc, or rather the /usr/local should be replaced by the prefix used when GCC was built. Indeed, by default I have:

gcc -E -v -
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/jpflori/sage-5.7/local/libexec/gcc/ia64-unknown-linux-gnu/4.6.3/lto-wrapper
Target: ia64-unknown-linux-gnu
Configured with: ../src/configure --prefix=/home/jpflori/sage-5.7/local --with-local-prefix=/home/jpflori/sage-5.7/local --with-gmp=/home/jpflori/sage-5.7/local --with-mpfr=/home/jpflori/sage-5.7/local --with-mpc=/home/jpflori/sage-5.7/local --with-system-zlib --disable-multilib  
Thread model: posix
gcc version 4.6.3 (GCC) 
COLLECT_GCC_OPTIONS='-E' '-v'
 /home/jpflori/sage-5.7/local/libexec/gcc/ia64-unknown-linux-gnu/4.6.3/cc1 -E -quiet -v -
ignoring nonexistent directory "/home/jpflori/sage-5.7/local/lib/gcc/ia64-unknown-linux-gnu/4.6.3/../../../../ia64-unknown-linux-gnu/include"
ignoring duplicate directory "/home/jpflori/sage-5.7/local/include"
  as it is a non-system directory that duplicates a system directory
#include "..." search starts here:
#include <...> search starts here:
 /home/jpflori/sage-5.7/local/lib/gcc/ia64-unknown-linux-gnu/4.6.3/include
 /home/jpflori/sage-5.7/local/include
 /home/jpflori/sage-5.7/local/lib/gcc/ia64-unknown-linux-gnu/4.6.3/include-fixed
 /usr/include
End of search list.
gcc -E -v -
Using built-in specs.
Target: ia64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --disable-libssp --enable-objc-gc --with-system-libunwind --enable-checking=release --build=ia64-linux-gnu --host=ia64-linux-gnu --target=ia64-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-8) 
COLLECT_GCC_OPTIONS='-E' '-v'
 /usr/lib/gcc/ia64-linux-gnu/4.4.5/cc1 -E -quiet -v -
ignoring nonexistent directory "/usr/local/include/ia64-linux-gnu"
ignoring nonexistent directory "/usr/local/include"
ignoring nonexistent directory "/usr/lib/gcc/ia64-linux-gnu/4.4.5/../../../../ia64-linux-gnu/include"
ignoring nonexistent directory "/usr/include/ia64-linux-gnu"
#include "..." search starts here:
#include <...> search starts here:
 /usr/lib/gcc/ia64-linux-gnu/4.4.5/include
 /usr/lib/gcc/ia64-linux-gnu/4.4.5/include-fixed
 /usr/include
End of search list.

So the real question is: is adding this -I/usr/local/include really useful? Shouldn't it rather be a user decision to add it if needed? It seems most GCC versions are smart enough to set their default include path correctly anyway; I am not sure what the comment in the Singular sources in Singular/Makefile.in refers to... I'd like to say that if it does nothing on most systems and just breaks my system (although I agree its config is not perfect, I cannot do anything about that), then it's useless, but that would be narrow-minded.
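
A simplified model may help explain why the extra flag is harmless with one compiler and harmful with the other. The Python sketch below assumes the behavior described in the GCC documentation: directories given with -I are searched before the system directories, and a -I directory that is already a system directory is ignored and keeps its system position. The function name `effective_search_path` is hypothetical, the directory lists are taken from the transcripts above, and the real Singular command line passes more -I flags than shown here.

```python
def effective_search_path(user_dirs, system_dirs):
    """Model how GCC merges -I directories with its system directories."""
    kept = []
    for directory in user_dirs:
        # A -I directory duplicating a system directory is ignored.
        if directory not in system_dirs and directory not in kept:
            kept.append(directory)
    return kept + list(system_dirs)


SAGE_LOCAL = "/infres/post/flori/sage-5.9.beta3-infres2/local"
GCC_LIB = SAGE_LOCAL + "/lib/gcc/sparc-sun-solaris2.10/4.6.3"

# System directories of Sage's gcc 4.6.3 (no /usr/local/include).
SAGE_GCC_DIRS = [
    GCC_LIB + "/include",
    SAGE_LOCAL + "/include",
    GCC_LIB + "/include-fixed",
    "/usr/include",
]

# System directories of the system gcc 3.4.3 (has /usr/local/include).
SYSTEM_GCC_DIRS = [
    "/usr/local/packages/gcc3/bin/../lib/gcc/sparc-sun-solaris2.10/3.4.3/include",
    "/usr/local/include",
    "/usr/sfw/include",
    "/usr/sfw/lib/gcc/sparc-sun-solaris2.10/3.4.3/include",
    "/usr/include",
]

if __name__ == "__main__":
    for name, dirs in (("sage gcc", SAGE_GCC_DIRS), ("system gcc", SYSTEM_GCC_DIRS)):
        print(name)
        for directory in effective_search_path(["/usr/local/include"], dirs):
            print("   ", directory)
```

Under this model, -I/usr/local/include changes nothing for the system gcc, but for Sage's gcc it puts /usr/local/include ahead of the include-fixed directory, which is consistent with the feature_tests.h observation above.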

jpflori commented 11 years ago
comment:23

In fact the addition of -I and -L was meant for versions of gcc (or other compilers?) before gcc 3. It was then removed for gcc3 at https://github.com/Singular/Sources/commit/833e11faeedd415ba185a7a678e4dcbe674720c3#Singular/configure.in

jpflori commented 11 years ago
comment:24

So I guess it makes sense to omit it for gcc > 3 as well :)

jpflori commented 11 years ago
comment:25

It was seemingly added for GCC 2.95 at: https://github.com/Singular/Sources/commit/a70441f0dcfb30cac8902851f41619cefa564903

jpflori commented 11 years ago
comment:26

The reason for this was seemingly:

jdemeyer commented 11 years ago

Merged: sage-5.9.beta5

jdemeyer commented 11 years ago
comment:28

Jean-Pierre: you should probably mention this upstream: http://www.singular.uni-kl.de:8002/trac/ticket/480

jpflori commented 11 years ago
comment:29

It's planned, but I cannot do it during the day (without appropriate ssh tunnels and http proxies) and have a tendency to postpone it in the evening.