Macaulay2 / M2

The primary source code repository for Macaulay2, a system for computing in commutative algebra, algebraic geometry and related fields.
https://macaulay2.com

NumericalAlgebraicGeometry appears to have race conditions #151

Open tom111 opened 10 years ago

tom111 commented 10 years ago

During install, when the examples for NumericalAlgebraicGeometry run, I get this result from time to time.

/var/tmp/portage/sci-mathematics/Macaulay2-1.7_pre/work/M2/StagingArea/x86_64-Linux-Gentoo/bin/M2-binary --silent --print-width 77 --stop --int --no-readline -q --no-randomize -e 'needsPackage("NumericalAlgebraicGeometry", FileName => "/var/tmp/portage/sci-mathematics/Macaulay2-1.7_pre/work/M2/Macaulay2/packages/NumericalAlgebraicGeometry.m2")' <"/var/tmp/portage/sci-mathematics/Macaulay2-1.7_pre/temp/M2-22850-0/0_regeneration_lp__List_rp.m2"

i1 : needsPackage "NumericalAlgebraicGeometry"
[...]
i8 : cs = regeneration I_*
-- SIGSEGV
-- stack trace, pid 4865:
level 0 -- return addr: 0x00465ebf -- frame: 0x7fffeeb4cfe0
level 1 -- return addr: 0x02f4b000 -- frame: 0x00e53cc0
-- end stack trace

When I run this repeatedly in the shell, I get the error roughly one time in 20. I have also occasionally seen the following in the generated error file _regeneration_lp__List_rp.errors:

i3 : regeneration F
Duplicate large block deallocation

Which library could write this to the screen? Is it gc?
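A failure that shows up roughly one time in 20 is easier to track when the repeated runs are scripted. The following is a minimal sketch of such a harness, not part of the Macaulay2 build: the function name `count_failures` is hypothetical, and it assumes the command under test exits with a nonzero status when it crashes.

```python
import subprocess
import sys


def count_failures(cmd, runs=20, stdin_path=None):
    """Run `cmd` `runs` times and return a list of (run_index, returncode)
    for every run that exited with a nonzero status.

    Assumption: the program under test signals a crash through its exit
    status. On POSIX, a negative returncode means the process was killed
    by that signal number (for example -11 for SIGSEGV).
    """
    failures = []
    for i in range(runs):
        # Feed the example file on stdin if one was given, as in `M2 ... <file`.
        stdin = open(stdin_path, "rb") if stdin_path else subprocess.DEVNULL
        try:
            result = subprocess.run(
                cmd,
                stdin=stdin,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        finally:
            if stdin_path:
                stdin.close()
        if result.returncode != 0:
            failures.append((i, result.returncode))
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python harness.py <command> [args...]
    failed = count_failures(sys.argv[1:])
    print(f"{len(failed)} of 20 runs failed: {failed}")
```

To mirror the invocation above, pass the M2-binary command line as `cmd` and the generated example file as `stdin_path`. Recording the return code per run helps distinguish a crash from an ordinary example error.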

DanGrayson commented 10 years ago

That message is from libgc. I've rolled back the version number of mpfr to 3.0.1, which should fix it. I was hoping to be able to debug it, but I never succeeded in getting the same error twice in a row, and I don't know where the nondeterminism is coming from. We could leave the issue open until that gets figured out.

tom111 commented 10 years ago

I just read some old e-mails that we exchanged on this. In issue #19, Anton and I concluded that some of the examples were not supposed to work because they contained non-reduced schemes, which the code was not designed to handle. I don't know if Anton ever replaced those examples. Anyway, I think that was a deterministic failure, so we probably have something else here. Have you tested that this doesn't occur with mpfr-3.0.1? I hate the idea of forcing Gentoo users to downgrade mpfr :-1: Is the problem limited to NumericalAlgebraicGeometry? If so, I'd rather cut that package.

mikestillman commented 10 years ago

I don't like the idea of cutting that package (and the other packages related to it, including the PHC and Bertini interfaces and NumericalHilbertFunctions). Why does this require other packages to downgrade? Is it not possible to have several versions coexisting?

mikestillman commented 10 years ago

I don't think it is just an issue with NumericalAlgebraicGeometry. I have obtained crashes of M2 starting up (in debug mode, to be sure), before the initial prompt occurs. This is not reproducible. There must be some strange interaction between gc and mpfr... Dan: is some package initializing mpfr values before main() starts?

mikestillman commented 10 years ago

(to clarify: I have only seen these crashes with the "upgraded" mpfr)

DanGrayson commented 10 years ago

Mike, I have no idea what's going on. Thomas makes a good point, though: if we don't build mpfr we don't control the version number.

mikestillman commented 10 years ago

I'll take a stab at finding the bug too.

tom111 commented 10 years ago

For daily work I use the master version, which I recompile from time to time and which is linked against the new mpfr. I just compile and install it with IgnoreExampleErrors and get a working version of M2. I've not seen any startup (or other) crashes, although I've been using mpfr-3.1.2 for at least 6 months. I'm not using the debug build, though; I'm using the optimized version.

I just ran into this problem today when compiling without "IgnoreExampleErrors".

Anyway, I also don't have a good plan for how to debug this.

DanGrayson commented 10 years ago

Mike, if you do try debugging it, the following commits try to make the behavior more deterministic. They include two new command line options useful to put in your .gdbinit as in my .gdbinit.dan

commit d603d60c6c4cdf4665cec8b3a98ab319d33498a4
commit 241269fb084ff8fdf72752bdbce37a52ac1fb3e1
commit 9c66f52c83e96cce757c85b10fb88652d06a4055
commit 30a15f0c7aa289436d3849bda9e4441182da75f6