sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.33k stars 453 forks source link

Bug with PARI interface gen.eval on Cygwin #21637

Closed embray closed 7 years ago

embray commented 7 years ago

With certain operating systems, certain compilers and certain compiler flags (in particular Cygwin with SAGE_DEBUG=yes), the PARI function is_universal_constant() can be broken. This leads to a 0 return value for closures returning gnil:

sage: pari("x -> print(x);")("hello")
hello
0

Upstream report: http://pari.math.u-bordeaux.fr/archives/pari-dev-1610/msg00006.html

This branch simply adds a patch from upstream.

Depends on #21582

Upstream: Fixed upstream, but not in a stable release.

Component: packages: standard

Keywords: pari windows cygwin

Author: Jeroen Demeyer

Branch: 0e7d231

Reviewer: Erik Bray

Issue created by migration from https://trac.sagemath.org/ticket/21637

jdemeyer commented 7 years ago
comment:1

Just FYI: the PARI value gnil is like Python's None. But it is really just a pointer to a "special" 0. It is indistinguishable from any other 0 except by comparing pointers.

jdemeyer commented 7 years ago
comment:2

Can you test this:

sage: pari("x -> print(x);")("hello")
hello
embray commented 7 years ago
comment:3

That test fails too in the same way.

embray commented 7 years ago
comment:4

Replying to @jdemeyer:

Just FYI: the PARI value gnil is like Python's None. But it is really just a pointer to a "special" 0. It is indistinguishable from any other 0 except by comparing pointers.

I see--that helps explain it. Maybe there's something odd going on with pointer comparison.

jdemeyer commented 7 years ago
comment:5

OK, next thing is to try GP itself:

$ ./sage --gp

and then

? (x -> print(x))("hello")
hello
jdemeyer commented 7 years ago
comment:6

Could you try this patch to PARI:

diff --git a/src/headers/pariinl.h b/src/headers/pariinl.h
index 2567d55..e37b435 100644
--- a/src/headers/pariinl.h
+++ b/src/headers/pariinl.h
@@ -1272,7 +1272,7 @@ INLINE void
 killblock(GEN x) { gunclone(x); }

 INLINE int
-is_universal_constant(GEN x) { return (x >= gen_0 && x <= ghalf); }
+is_universal_constant(GEN x) { return (x >= gnil && x <= ghalf); }

 /*******************************************************************/
 /*                                                                 */

Just copy this to some file inside build/pkgs/patches/, rebuild PARI (./sage -f pari) and test again.

jdemeyer commented 7 years ago

Description changed:

--- 
+++ 
@@ -16,17 +16,14 @@
             return P.new_gen(result)

-So it seems like result != gnil when it is expected to be. For example this test from the doctests is failing: +So it seems like result != gnil when it is expected to be. For example:

-sage: def printme(x):
-....:     print(x)
-....:
-sage: objtoclosure(printme)('matid(2)')
-[1, 0; 0, 1]
+sage: pari("x -> print(x);")("hello")
+hello
 0

-The final 0 is the return value from calling the closure. It should return None. This is especially odd because the logic in objtoclosure is that if the called Python function returns None the PARI function should return gnil. +The final 0 is the return value from calling the closure. It should return None.

I didn't have this bug a few weeks ago, and it's strange that I'm only seeing it on Cygwin.

embray commented 7 years ago
comment:7

Ok, I'll try that. Any idea why that patch?

jdemeyer commented 7 years ago
comment:8

Another potentially useful patch:

diff --git a/src/language/eval.c b/src/language/eval.c
index 4dba513..c0f015f 100644
--- a/src/language/eval.c
+++ b/src/language/eval.c
@@ -1590,7 +1590,14 @@ INLINE GEN
 closure_returnupto(GEN C)
 {
   pari_sp av=avma;
-  return copyupto(closure_return(C),(GEN)av);
+  printf("\navma = %p\n", (GEN)avma);
+  GEN z = closure_return(C);
+  printf("z = %p\n", z);
+  printf("is_universal_constant(z) = %i\n", is_universal_constant(z));
+  printf("gnil = %p\n", gnil);
+  GEN c = copyupto(z, (GEN)av);
+  printf("c = %p\n", z);
+  return c;
 }

 GEN
jdemeyer commented 7 years ago
comment:9

Replying to @embray:

Ok, I'll try that. Any idea why that patch?

PARI's handling for closure return values has special code to deal with "universal constants". I'm suspecting a problem there.

jdemeyer commented 7 years ago
comment:10

Never mind the patch from [comment:6], that's not the problem.

I found one place regarding constants which is undefined behaviour. To be sure that I am diagnosing the problem correctly, I need you to try this patch:

diff --git a/src/language/eval.c b/src/language/eval.c
index 4dba513..c7c4c62 100644
--- a/src/language/eval.c
+++ b/src/language/eval.c
@@ -1590,7 +1590,16 @@ INLINE GEN
 closure_returnupto(GEN C)
 {
   pari_sp av=avma;
-  return copyupto(closure_return(C),(GEN)av);
+  printf("\navma = %p\n", (GEN)avma);
+  GEN z = closure_return(C);
+  printf("z =     %p\n", z);
+  printf("is_universal_constant(z) = %i\n", is_universal_constant(z));
+  printf("gnil =  %p\n", gnil);
+  printf("gen_0 = %p\n", gen_0);
+  printf("ghalf = %p\n", ghalf);
+  GEN c = copyupto(z, (GEN)av);
+  printf("c =     %p\n", z);
+  return c;
 }

 GEN
jdemeyer commented 7 years ago
comment:11

Once you applied the patch from [comment:10], I need you to try

sage: pari("x -> print(x);")("hello")
jdemeyer commented 7 years ago
comment:12

Another thing to try (with or without the patches) in GP. You should get an error message here:

? install(gaffect, "vGG"); gaffect(0, print())
  ***   at top-level: install(gaffect,"vGG");gaffect(0,print())
  ***                                        ^------------------
  *** gaffect: bug in gaffect [overwriting universal object: gnil)], please report.
embray commented 7 years ago
comment:13

Actually, it appears the patch in [comment:6] fixed it. But I can try the other stuff if you think it needs a more general fix.

jdemeyer commented 7 years ago
comment:14

Replying to @embray:

Actually, it appears the patch in [comment:6] fixed it.

That's impossible :-)

I would be very interested in the debug output from the patch at [comment:10] with or without the other patch.

jdemeyer commented 7 years ago

Perhaps there is some non-determinism in the build? Maybe every time you build PARI, you have a certain chance of a broken build. That might explain these two things:

I didn't have this bug a few weeks ago, and it's strange that I'm only seeing it on Cygwin.

Actually, it appears the patch in [comment:6] fixed it.

embray commented 7 years ago
comment:16

FWIW Here's the output with the patch from [comment:10]

sage: pari("x -> print(x);")("hello")

avma = 0x6fff79c0000
hello
z =     0x2e01690
is_universal_constant(z) = 1
gnil =  0x2e01690
gen_0 = 0x2e01680
ghalf = 0x2e01700
c =     0x2e01690

but yeah, the other patch really did fix it. The only other possibility is that I built with SAGE_DEBUG=yes while previously I did not. If somehow SAGE_DEBUG turned out to be the difference I'll eat my socks :)

embray commented 7 years ago
comment:17

Though I guess is_universal_constant(gen_0) ought to return 1 if I'm reading the docs correctly, so I guess that patch is the wrong thing to do?

jdemeyer commented 7 years ago
comment:18

Replying to @embray:

but yeah, the other patch really did fix it.

Just to double-check, can you remove that patch and check again (with the debugging info from [comment:10])

jdemeyer commented 7 years ago
comment:19

Replying to @embray:

Though I guess is_universal_constant(gen_0) ought to return 1 if I'm reading the docs correctly, so I guess that patch is the wrong thing to do?

Yes, the patch is wrong and it shouldn't fix the problem.

embray commented 7 years ago
comment:20

Replying to @jdemeyer:

Replying to @embray:

but yeah, the other patch really did fix it.

Just to double-check, can you remove that patch and check again (with the debugging info from [comment:10])

Yeah, I'm doing that right now. Wanted to make sure not to drive either of us crazy.

embray commented 7 years ago
comment:21

Well the good news is, you were right that the little patch to is_universal_constant seems to have been irrelevant. The bad news is that I tried rebuilding with SAGE_DEBUG=no again and it broke again. The worse news is I like to try to keep my word (warning: gross picture) https://goo.gl/photos/oNiF1i6gLEfequY3A

The output I get is

avma = 0x6fff77c0000
[1, 0; 0, 1]
z =     0x2e86890
is_universal_constant(z) = 0
gnil =  0x2e86890
gen_0 = 0x2e86880
ghalf = 0x2e86850
c =     0x2e86890
0

For some reason with optimizations enabled ghalf winds up below gen_0.

embray commented 7 years ago
comment:22

By the way, it's really annoying that this logic assumes anything about the memory layout for static constants defined in C. There's no reason it should think it can make any such assumptions.

jdemeyer commented 7 years ago
comment:23

Replying to @embray:

By the way, it's really annoying that this logic assumes anything about the memory layout for static constants defined in C.

Of course, that is obviously a bug and I'm pretty sure that the PARI devs will accept that. At this point, we have enough information to report the issue upstream.

jdemeyer commented 7 years ago
comment:24

I will report it upstream.

jdemeyer commented 7 years ago

Upstream: Not yet reported upstream; Will do shortly.

embray commented 7 years ago
comment:25

I've realized I should never make logical assumptions.

embray commented 7 years ago

Changed upstream from Not yet reported upstream; Will do shortly. to none

jdemeyer commented 7 years ago

Description changed:

--- 
+++ 
@@ -1,22 +1,4 @@
-It seems that `gen.eval()` is returning `pari('0')` in cases where it should be returning `None`.  The relevant code is
-
-```cython
-        if t == t_CLOSURE:
-            if nkwds > 0:
-                raise TypeError("cannot evaluate a PARI closure using keyword arguments")
-            if closure_is_variadic(self.g):
-                arity = closure_arity(self.g) - 1
-                args = list(args[:arity]) + [0]*(arity-nargs) + [args[arity:]]
-            t0 = objtogen(args)
-            sig_on()
-            result = closure_callgenvec(self.g, t0.g)
-            if result == gnil:
-                P.clear_stack()
-                return None
-            return P.new_gen(result)
-```
-
-So it seems like `result != gnil` when it is expected to be.  For example:
+With certain operating systems, certain compilers and certain compiler flags (in particular Cygwin with `SAGE_DEBUG=yes`), the PARI function `is_universal_constant()` can be broken. This leads to a `0` return value for closures returning `gnil`:

sage: pari("x -> print(x);")("hello") @@ -24,6 +6,4 @@ 0


-The final `0` is the return value from calling the closure. It should return `None`.
-
-I didn't have this bug a few weeks ago, and it's strange that I'm only seeing it on Cygwin.
+**Upstream report**: http://pari.math.u-bordeaux.fr/archives/pari-dev-1610/msg00006.html
jdemeyer commented 7 years ago

Upstream: Reported upstream. No feedback yet.

jdemeyer commented 7 years ago

Branch: u/jdemeyer/bug_with_pari_interface_gen_eval_on_cygwin

jdemeyer commented 7 years ago

New commits:

7879f08Add PARI patch to fix is_universal_constant()
jdemeyer commented 7 years ago

Author: Jeroen Demeyer

jdemeyer commented 7 years ago

Commit: 7879f08

jdemeyer commented 7 years ago

Changed upstream from Reported upstream. No feedback yet. to Fixed upstream, but not in a stable release.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

fb8b1d2Add PARI patch to fix is_universal_constant()
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from 7879f08 to fb8b1d2

jdemeyer commented 7 years ago
comment:31

This is the patch merged in the upstream git repo: http://pari.math.u-bordeaux.fr/cgi-bin/gitweb.cgi?p=pari.git;a=commitdiff;h=e2ce19f (I removed the patch to CHANGES to avoid conflicts)

jdemeyer commented 7 years ago

Description changed:

--- 
+++ 
@@ -7,3 +7,5 @@

Upstream report: http://pari.math.u-bordeaux.fr/archives/pari-dev-1610/msg00006.html + +This branch simply adds a patch from upstream.

embray commented 7 years ago
comment:33

Great. Looking at the patch I can see now why maybe they did things they way they did originally, but this looks much better.

I need to fire up a new non-debug build in order to test this but I think this will fix it.

embray commented 7 years ago
comment:34

Meant to update on this sooner. I can confirm that the new patch fixes the issue as well.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Branch pushed to git repo; I updated commit sha1 and set ticket back to needs_review. New commits:

0e7d231Merge tag '7.4.rc0' into t/21637/bug_with_pari_interface_gen_eval_on_cygwin
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 7 years ago

Changed commit from fb8b1d2 to 0e7d231

jdemeyer commented 7 years ago
comment:36

Bumping the PARI version (there was another PARI ticket merged in rc0).

jdemeyer commented 7 years ago

Reviewer: Erik Bray

jdemeyer commented 7 years ago

Dependencies: #21582

vbraun commented 7 years ago

Changed branch from u/jdemeyer/bug_with_pari_interface_gen_eval_on_cygwin to 0e7d231

7822f248-ba56-45f1-ab3d-4de7482bdf9f commented 7 years ago

log of compilation of 7.5beta0 WITHOUT #20523 (okay)

7822f248-ba56-45f1-ab3d-4de7482bdf9f commented 7 years ago

Attachment: pari-2.8.0.alpha.p2.log

Log of compilation of 7.5beta0 WITH #20523 (problematic)

7822f248-ba56-45f1-ab3d-4de7482bdf9f commented 7 years ago

Changed commit from 0e7d231 to none