sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 480 forks source link

Sage debug version #16938

Closed simon-king-jena closed 10 years ago

simon-king-jena commented 10 years ago

When building sage with SAGE_DEBUG=yes, then it turns out that the debug and the cygwin64 patches to the Singular spkg are in conflict. Hence, the build fails. Let's fix it!

I suppose the fix will eventually be in a stable upstream release.

There is also an issue with the new m4ri: https://bitbucket.org/malb/m4ri/issue/59/assertion-sizeof-mzd_t-64-fails-on-32-bit

Depends on #17088

Upstream: Reported upstream. No feedback yet.

CC: @jpflori @vbraun @malb

Component: build

Keywords: debug singular cygwin64

Author: Simon King, Volker Braun, Jeroen Demeyer

Branch/Commit: c471edd

Reviewer: Jeroen Demeyer, Volker Braun

Issue created by migration from https://trac.sagemath.org/ticket/16938

simon-king-jena commented 10 years ago
comment:1

The function choose_patches() of singular's spkg-install removes most of the files in $SRC/omalloc/ if we are in debug mode. However, the cygwin64 patch tries to patch some files in that folder.

In addition to that, the cygwin64 patch says

+ifeq ($(SINGUNAME),x86_64-Win)
+SO_SUFFIX = dll
+MODULE_SUFFIX    = dll
+LIBSINGULAR_FLAGS = -shared
+LIBSINGULAR_LIBS = -lsingfac -lsingcf -lntl -lreadline -lgmp -lomalloc 
+endif
+

So, if one would try to build the debug version in 64 bit cygwin, then one had -lomalloc, but that would not work, since omalloc is disabled on purpose, for the debug version.

What to do? In principle, we could put an alternative version of the cygwin64 patch into the patches/conditional folder, and if we are in debug version, we could let choose_patches() replace the regular cygwin patch version by the conditional version.

I will try to solve the problem accordingly, however, I have no windows box. Hence, someone else needs to test if a debug version of singular works in cygwin.

simon-king-jena commented 10 years ago
comment:2

PS: Instead of omalloc, the debug version's memory manager is called xalloc. This is a hint for a debug version in cygwin: Perhaps it needs to say -lxalloc rather than -lomalloc?

simon-king-jena commented 10 years ago
comment:3

Unfortunately, one important debug patch, namely singular_xalloc.patch, fails to apply.

simon-king-jena commented 10 years ago
comment:4

Replying to @simon-king-jena:

Unfortunately, one important debug patch, namely singular_xalloc.patch, fails to apply.

The good news is that many of the failing patches went upstream!

simon-king-jena commented 10 years ago
comment:5

I modified the patches so that they apply. However, the built fails, as the function omalloc0 is used in one place, but is not defined when using xalloc instead of omalloc. I guess I have to ask Hans Schönemann for advice.

vbraun commented 10 years ago
comment:6

IMHO we shouldn't waste time on a Cygwin debug version. Thats for way down the road. If it doesn't work now, so be it.

simon-king-jena commented 10 years ago
comment:7

Replying to @vbraun:

IMHO we shouldn't waste time on a Cygwin debug version. Thats for way down the road. If it doesn't work now, so be it.

Cygwin itself is not in the game. Only the cygwin PATCH currently constitutes a problem. It is not a big deal to solve it: I modified the patches and spkg-install so that building the debug version starts (i.e., so that all patches and changes apply).

Meanwhile Hans gave me a hint on where to get a missing function. After fixing that, the debug version will hopefully build.

simon-king-jena commented 10 years ago
comment:8

With Hans' hint, building the debug version went a bit further. However, building still fails, with "cannot find -lomalloc_ndebug".

OOOOPS! Looking for -lomalloc_ndebug in our patches, I see that the sage_debug.patch looks like this:

diff -ru src/kernel/Makefile.in src.debug/kernel/Makefile.in
--- src/kernel/Makefile.in      2012-04-17 21:00:08.000000000 +0200
+++ src.debug/kernel/Makefile.in        2012-07-12 16:51:29.923359810 +0200
@@ -49,7 +49,7 @@
 CXXFLAGS       = @CXXFLAGS@ ${PIPE} 
 CXXTEMPLFLAGS  = @CXXTEMPLFLAGS@
 CPPFLAGS       = -I${srcdir} -I.. -I@prefix@  @CPPFLAGS@ 
-DEFS           = -DNDEBUG -DOM_NDEBUG -D@SING_UNAME@ @DEFS@
+DEFS           = -DOM_NDEBUG -D@SING_UNAME@ @DEFS@
 LDFLAGS                = @LDFLAGS@
 LD_DYN_FLAGS   = @LD_DYN_FLAGS@
 SFLAGS         = @SFLAGS@
@@ -64,7 +64,7 @@
 LIBS           = -lm @NEED_LIBS@ 
 else
 # for the 2-0-* versions under Windows, we don't need gdbm, readline and ncurses
-LIBS           = -lsingfac -lsingcf -lntl -lgmp -lreadline -lncurses -lomalloc_ndebug
+LIBS           = -lsingfac -lsingcf -lntl -lgmp -lreadline -lncurses -lomalloc
 #LIBS          = -lsingfac -lsingcf -lgmp
 endif
 MP_LIBS                = @MP_LIBS@

Isn't the diff file formatted wrongly? I am talking about the paths "src/kernel" and "src.debug/kernel", which do not exist.

simon-king-jena commented 10 years ago
comment:9

Concerning ndebug: There is quite a number of files in upstream/singular-3.1.6/latest/ mentioning -lomalloc_ndebug. Probably we should change all of them to only mention -lomalloc.

simon-king-jena commented 10 years ago

Branch: u/SimonKing/sage_debug_version

simon-king-jena commented 10 years ago
comment:11

I have pushed my changes, so that someone else may have a change to see how to solve the -lomalloc_ndebug problem.


New commits:

b426226Make it so that the patches to Singular apply with SAGE_DEBUG=yes
3f5ee73Define omalloc0
bd802dbSlight improvement of a patch
simon-king-jena commented 10 years ago

Commit: bd802db

vbraun commented 10 years ago
comment:12

Patches are applied with -p1, so the first path component is stripped off.

simon-king-jena commented 10 years ago
comment:13

Replying to @vbraun:

Patches are applied with -p1, so the first path component is stripped off.

OK, but I think it doesn't hurt that in the latest commit I change the path names according to what is done in the other patches (source path a/... versus target path b/...).

I got more Info from Hans. There is a debug version of omalloc, but it would still be omalloc. Hence, valgrind would not see details but would only see large pages allocated. So, he recommends to stick with xalloc (which is a compatibility layer on top of malloc).

If we would have Singular 4-x in Sage (we only have 3-1-6), xalloc would be part of the sources. Hence, all our current patching for the debug version wouldn't be needed.

As for -lomalloc_ndebug: Hans suggests to change the makefile so that a module libomalloc_ndebug.so rather than libomalloc.so is created. To my understanding, it would also allow to remove other parts of our patches, where we replace -lomalloc_ndebug by -lomalloc.

vbraun commented 10 years ago
comment:14

libm4ri fails its self-tests when debug is on, this is #16966.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from bd802db to 9c31893

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

9c31893Define a missing symbol and change the module name: Singular with xalloc builds
simon-king-jena commented 10 years ago

Author: Simon King

simon-king-jena commented 10 years ago
comment:16

Meanwhile I got Singular's xalloc version (which is better for valgrinding than the debug version of omalloc) to build. Now I am trying to build the rest of Sage, but expect that I will soon run into the issue tracked at #16966...

In the latest commit I added one missing function (which is empty in the debug version) and I changed the module name from libomalloc.so into libomalloc_ndebug.so. It may seem strange for someone that the DEBUG version is named ..._ndebug. If that's a problem then (s)he may change it, so that it still works...

I am trying to continue building, potentially with m4ri in non-debug mode. I guess for now one can say that it needs review.

simon-king-jena commented 10 years ago

Upstream: None of the above - read trac for reasoning.

vbraun commented 10 years ago
comment:17

For the record, m4ri builds fine if you don't set SAGE_CHECK=yes

vbraun commented 10 years ago
comment:18

I get /usr/bin/ld: cannot find -lomalloc when building from scratch...

simon-king-jena commented 10 years ago
comment:19

Replying to @vbraun:

I get /usr/bin/ld: cannot find -lomalloc when building from scratch...

It has built for me (without SAGE_CHECK=yes), but I suppose that libomalloc was present from previous attempts to build. So, let's try "make distclean" and try again...

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f4c6701Replace lomalloc by lomalloc_ndebug everywhere in debug version
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 9c31893 to f4c6701

simon-king-jena commented 10 years ago
comment:21

After make distclean I could reproduce the build errors.

The latest commit provides a solution. Perhaps you don't like it, for aesthetic reasons ("the debug version uses a module called omalloc_ndebug"), but it works. I also had to resolve a conflict with cygwin_64.patch to achieve it.

vbraun commented 10 years ago
comment:22

I'll give it a spin.

IMHO we don't have to win beauty contests with the debug version...

vbraun commented 10 years ago
comment:23

Still doesn't build for me:

g++ -O0 -g  -fPIC -I.. -I/home/vbraun/Sage/git-develop/local -pipe -I. -I.. -I/home/vbraun/Sage/git-develop/local -I/home/vbraun/Sage/git-develop/local/include -I/home/vbraun/Sage/git-develop/local/include   -fno-implicit-templates -I.. -I/home/vbraun/Sage/git-develop/local -Dx86_64_Linux -DHAVE_CONFIG_H \
  -o Singular \
  tesths.cc iparith.o mpsr_Tok.o claptmpl.o\
  grammar.o scanner.o attrib.o blackbox.o eigenval_ip.o extra.o fehelp.o feOpt.o ipassign.o ipconv.o ipid.o iplib.o ipprint.o ipshell.o newstruct.o lists.o sdb.o fglm.o interpolation.o silink.o ssiLink.o s_buff.o subexpr.o janet.o wrapper.o libparse.o sing_win.o gms.o pcv.o maps_ip.o walk.o walk_ip.o cntrlc.o misc_ip.o calcSVD.o pipeLink.o Minor.o MinorProcessor.o MinorInterface.o bigintm.o pyobject_setup.o denom_list.o minpoly.o countedref.o semaphore.o slInit_Dynamic.o -rdynamic -L/home/vbraun/Sage/git-develop/local/kernel -L../kernel -lkernel_g -L/home/vbraun/Sage/git-develop/local/lib  -L/home/vbraun/Sage/git-develop/local/lib -lflint -lmpfr -lmpir  -ldl -lm -lsingfac -lsingcf -lflint -lmpfr -lntl -lgmp -lreadline -ltermcap -lnsl -lm  -lnsl -lbsd  -lomalloc -lpthread  ../kernel/mmalloc.o 
/usr/bin/ld: cannot find -lomalloc
collect2: error: ld returned 1 exit status
make[1]: *** [Singular] Error 1
make[1]: Leaving directory `/home/vbraun/Sage/git-develop/local/var/tmp/sage/build/singular-3.1.6.p3/src/latest/Singular'
make: *** [install-nolns] Error 1
Unable to build and install Singular
Error building Singular (error in build_singular).
simon-king-jena commented 10 years ago
comment:24

Replying to @vbraun:

Still doesn't build for me:

Seriously? It did build for me, even though I did make distclean after making the makefile make omalloc_ndebug instead of omalloc.

g++ -O0 -g  -fPIC -I.. -I/home/vbraun/Sage/git-develop/local -pipe -I. -I.. -I/home/vbraun/Sage/git-develop/local -I/home/vbraun/Sage/git-develop/local/include -I/home/vbraun/Sage/git-develop/local/include   -fno-implicit-templates -I.. -I/home/vbraun/Sage/git-develop/local -Dx86_64_Linux -DHAVE_CONFIG_H \
  -o Singular \
  tesths.cc iparith.o mpsr_Tok.o claptmpl.o\...

The name tesths.cc suggests that this is for testing. So, did you build with SAGE_CHECK and SAGE_DEBUG together?

vbraun commented 10 years ago
comment:25

Are you using 32bit? I did not enable SAGE_CHECKS.

vbraun commented 10 years ago

Changed branch from u/SimonKing/sage_debug_version to u/vbraun/sage_debug_version

simon-king-jena commented 10 years ago

Changed commit from f4c6701 to 25b66c3

simon-king-jena commented 10 years ago
comment:27

Replying to @vbraun:

Are you using 32bit? I did not enable SAGE_CHECKS.

No, I am on 64 bit. Is the file you mentioned only built on 32 bit??


New commits:

25b66c3Switch yet another -lomalloc
simon-king-jena commented 10 years ago
comment:28

OK, the change sounds reasonable.

vbraun commented 10 years ago
comment:29

No, I'm on x86 too. The issue comes from the test "$ac_cv_sizeof_voidp" != 4; so I thought it might be a bitwidth thing.

simon-king-jena commented 10 years ago
comment:30

By the way: I did not run the full test suite yet (now I am rebuilding from scratch), but found a surprisingly high number of test errors, given that not so long time ago the debug version passed all tests.

vbraun commented 10 years ago
comment:31

Most failures seem to be from non-commutative stuff...

vbraun commented 10 years ago
comment:32

For example:

(gdb) break dReportError
Breakpoint 1 at 0x7fffcc699a71: file dError.c, line 28.
(gdb) cont
Continuing.

sage: F.<x,y,z> = FreeAlgebra(QQ, implementation='letterplace')
sage: I = F*[x*y+y*z,x^2+x*y-y*x-y^2]*F
sage: Q.<a,b,c> = F.quo(I); Q
Quotient of Free Associative Unital Algebra on 3 generators (x, y, z) over Rational Field by the ideal (x*y + y*z, x*x + x*y - y*x - y*y)
sage: a^3

Breakpoint 1, dReportError (fmt=0x7fffcc8b2679 "S_2_R[%d]=%d != T[%d].i_r=%d\n") at dError.c:28
28  dError.c: No such file or directory.
(gdb) bt
#0  dReportError (fmt=0x7fffcc8b2679 "S_2_R[%d]=%d != T[%d].i_r=%d\n") at dError.c:28
#1  0x00007fffcc5b860a in kTest_TS (strat=0x2d09920) at kutil.cc:861
#2  0x00007fffcc5b0598 in bbaShift (F=0x18d0638, Q=0x0, w=0x0, hilb=0x0, strat=0x2d09920, uptodeg=3, lV=3)
    at kstd2.cc:1724
#3  0x00007fffcc5a79ed in kStdShift (F=0x18d0638, Q=0x0, h=isHomog, w=0x0, hilb=0x0, syzComp=0, 
    newIdeal=0, vw=0x0, uptodeg=3, lV=3) at kstd1.cc:1912
#4  0x00007fffcc5b09ce in freegb (I=0x18d0638, uptodeg=3, lVblock=3) at kstd2.cc:1823
#5  0x00007fffcc4e2301 in jjSYSTEM (res=0x2d06588, args=0x18d0548) at extra.cc:1370
#6  0x00007fffcc4a06b5 in iiExprArithM (res=0x2d06588, a=0x18d0548, op=496) at iparith.cc:8321
#7  0x00007fffcb9c3eb9 in __pyx_f_4sage_4libs_8singular_8function_17KernelCallHandler_handle_call (
    __pyx_v_self=0x7fffb80d2880, __pyx_v_argument_list=0x7fffb80d1860, __pyx_optional_args=0x7fffffff7ef0)
    at build/cythonized/sage/libs/singular/function.cpp:12083
[...]
(gdb) cy bt
[...]
#78 0x00007ffff7cd7b2d in _reduce_() at /home/vbraun/Sage/git-develop/local/lib/python2.7/site-packages/sage/rings/quotient_ring_element.py:118
       118            self.__rep = I.reduce(self.__rep)
#83 0x00007fffb5ae04ce in reduce() at /home/vbraun/Sage/git-develop/src/sage/algebras/letterplace/letterplace_ideal.pyx:306
       306        def reduce(self, G):
#87 0x00007fffb5f90e45 in normal_form() at /home/vbraun/Sage/git-develop/src/sage/algebras/letterplace/free_algebra_element_letterplace.pyx:733
       733        def normal_form(self,I):
#93 0x00007fffb5ad9bb4 in groebner_basis() at /home/vbraun/Sage/git-develop/src/sage/algebras/letterplace/letterplace_ideal.pyx:187
---Type <return> to continue, or q <return> to quit---
       187        def groebner_basis(self, degbound=None):
#96 0x00007fffcb9c5095 in __call__() at /home/vbraun/Sage/git-develop/src/sage/libs/singular/function.pyx:1185
      1185        def __call__(self, *args, ring=None, bint interruptible=True, attributes=None):
#98 0x00007fffcb9ca76b in call_function() at /home/vbraun/Sage/git-develop/src/sage/libs/singular/function.pyx:1462
      1462                _res = self.call_handler.handle_call(argument_list, si_ring)
#99 0x00007fffcb9c3d15 in handle_call() at /home/vbraun/Sage/git-develop/src/sage/libs/singular/function.pyx:1093
      1093                iiExprArithM(res, argument_list.args, self.cmd_n)
#100 0x00007fffcc4a029d in iiExprArithM()
vbraun commented 10 years ago
comment:33

I can reproduce some failures just in Singular, so I've asked here: https://groups.google.com/d/msg/libsingular-devel/4fL9NKCVGw0/mlOd_vzNSG8J

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a38e790Fix dangling C pointer to Python temporary
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 25b66c3 to a38e790

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from a38e790 to 0802e93

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

1cd65a8Remove some failing Singular debug statements
0802e93Reduce size of doctest
vbraun commented 10 years ago
comment:36

I think some internal Singular checks that get enabled with NDEBUG are too strict, I've added a patch removing these. I also almost despaired in fast_map until I noticed that idTest incorrectly assumes that currRing is set.

Now make ptest passes.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 0802e93 to 37fda1d

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

37fda1dReduce scale of some more huge doctests
vbraun commented 10 years ago
comment:38

Computing the invariants of an order-480 group after an order-48 group really only checks that you have a fast computer.

This fixes the last timeout in the make ptestlong for me.

vbraun commented 10 years ago

Changed author from Simon King to Simon King, Volker Braun

vbraun commented 10 years ago

Changed upstream from None of the above - read trac for reasoning. to Reported upstream. No feedback yet.

simon-king-jena commented 10 years ago
comment:41

Hi Volker, I am not happy with the two commits to reduce size of doctests.

It is said in the doc that below we do the Fateman benchmark over the finite field of order 32003, but I guess with your changes the statement becomes wrong, it isn't the Fateman benchmark if the exponent is only 10 and not 20.

Also I don't see why a test that is marked "long" anyway should be removed just because the debug version slows it down too much.

By the way, how long does this test take in non-debug version? It is stated that it used to take 12 seconds, four years ago. And 12 seconds is totally fine for a long test, IMHO.