Closed by kiwifb 7 years ago
Dependencies: #22895
numpy upgrade is on #22582.
Let's agree on who will work on debugging. Say, in src/sage/rings/polynomial/polynomial_real_mpfr_dense.pyx one would probably need to set an FPU trap to find exactly what triggers the numpy warning in np.float32('1.5')*polygen(RR) (it's very likely to be a call to mpfr, but which one?).
Or perhaps there should be a way to set this up globally, for all the Cython modules.
(I'll be too busy in the coming week.)
Changed dependencies from #22895 to #22582
Similarly, the warning is emitted in
sage: numpy.float64(5)>e
or with >= instead of >, or pi instead of e. Note that pi.n() and e.n() are of type RR, so again this points in the direction of mpfr.
Description changed:
---
+++
@@ -58,4 +58,11 @@
sage: x=polygen(RR)
sage: numpy.float32('1.5') * x
-seen on freeBSD+clang, OS X+clang and linux+clang.
+seen on freeBSD+clang, OS X+clang and linux+clang.
+
+Similarly, the warning is emitted in
+
+sage: numpy.float64(5)>e
+
+or >= instead of >, or pi instead of e. Note that pi.n() and e.n() are of type RR, so again this points in the direction of mpfr.
In spirit, it's pretty much the same as the polygen(RR) problem. Let b=numpy.float64(5). Then b.__gt__(e) prints the same warning; under the hood it apparently calls mpfr, which raises an FP flag, which then gets picked up by the warning printer.
Note that if I first call np.seterr(invalid='ignore') then no warnings are printed.
It could help to configure MPFR with --enable-logging (see details in doc/README.dev in the MPFR source repository):
For example, just define MPFR_LOG_ALL, run your program, and view `mpfr.log`.
This leads me to two observations. I accidentally rebuilt mpfr with gcc on my linux box (MPFR_CONFIGURE="--enable-logging" ./sage -f mpfr) and ran all the tests successfully. So mpfr+clang seems to be the real cause of the trouble.
Once I remembered that on my linux machine I had to set CC and CXX, I encountered another problem trying to compile with logging:
libtool: compile: clang -DMPFR_USE_LOGGING=1 -DTIME_WITH_SYS_TIME=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_LOCALE_H=1 -DHAVE_WCHAR_H=1 -DHAVE_STDARG=1 -DHAVE_SYS_TIME_H=1 -DHAVE_STRUCT_LCONV_DECIMAL_POINT=1 -DHAVE_STRUCT_LCONV_THOUSANDS_SEP=1 -DHAVE_ALLOCA_H=1 -DHAVE_STDINT_H=1 -DHAVE_VA_COPY=1 -DHAVE_SETLOCALE=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_LONG_LONG=1 -DHAVE_INTMAX_T=1 -DMPFR_HAVE_INTMAX_MAX=1 -DMPFR_HAVE_FESETROUND=1 -DHAVE_DENORMS=1 -DHAVE_SIGNEDZ=1 -DHAVE_ROUND=1 -DHAVE_TRUNC=1 -DHAVE_FLOOR=1 -DHAVE_CEIL=1 -DHAVE_NEARBYINT=1 -DHAVE_LDOUBLE_IEEE_EXT_LITTLE=1 -DHAVE_CLOCK_GETTIME=1 -DLT_OBJDIR=\".libs/\" -DHAVE_ATTRIBUTE_MODE=1 -DHAVE___GMPN_ROOTREM=1 -I. -I/home/fbissey/sandbox/git-fork/sage-clang/local/include -Wall -Wmissing-prototypes -Wpointer-arith -m64 -O2 -march=corei7-avx -mtune=corei7-avx -g -MT add.lo -MD -MP -MF .deps/add.Tpo -c add.c -fPIC -DPIC -o .libs/add.o
add.c:28:3: error: illegal storage class on function
MPFR_LOG_FUNC
^
./mpfr-impl.h:1716:3: note: expanded from macro 'MPFR_LOG_FUNC'
auto void __mpfr_log_cleanup (int *time); \
^
add.c:28:3: error: function definition is not allowed here
./mpfr-impl.h:1717:39: note: expanded from macro 'MPFR_LOG_FUNC'
void __mpfr_log_cleanup (int *time) { \
^
2 errors generated.
So we'll have to fix logging with clang before we can investigate with this tool. I will try on OS X shortly.
Hmm, an equivalent error on OS X:
libtool: compile: gcc -DMPFR_USE_LOGGING=1 -DTIME_WITH_SYS_TIME=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_LOCALE_H=1 -DHAVE_WCHAR_H=1 -DHAVE_STDARG=1 -DHAVE_SYS_TIME_H=1 -DHAVE_STRUCT_LCONV_DECIMAL_POINT=1 -DHAVE_STRUCT_LCONV_THOUSANDS_SEP=1 -DHAVE_ALLOCA_H=1 -DHAVE_STDINT_H=1 -DHAVE_VA_COPY=1 -DHAVE_SETLOCALE=1 -DHAVE_GETTIMEOFDAY=1 -DHAVE_LONG_LONG=1 -DHAVE_INTMAX_T=1 -DMPFR_HAVE_INTMAX_MAX=1 -DMPFR_HAVE_FESETROUND=1 -DHAVE_DENORMS=1 -DHAVE_SIGNEDZ=1 -DHAVE_ROUND=1 -DHAVE_TRUNC=1 -DHAVE_FLOOR=1 -DHAVE_CEIL=1 -DHAVE_NEARBYINT=1 -DHAVE_LDOUBLE_IEEE_EXT_LITTLE=1 -DHAVE_CLOCK_GETTIME=1 -DLT_OBJDIR=\".libs/\" -DHAVE_ATTRIBUTE_MODE=1 -DHAVE___GMPN_ROOTREM=1 -I. -I/Users/fbissey/build/sage-clang/local/include -Wall -Wmissing-prototypes -Wpointer-arith -m64 -O2 -march=corei7-avx -mtune=corei7-avx -g -MT exceptions.lo -MD -MP -MF .deps/exceptions.Tpo -c exceptions.c -fno-common -DPIC -o .libs/exceptions.o
In file included from exceptions.c:23:
./mpfr-impl.h:1557:4: error: "Logging not supported (needs gcc >= 3.0 and GNU C Library >= 2.0)."
# error "Logging not supported (needs gcc >= 3.0 and GNU C Library >= 2.0)."
^
1 error generated.
Unless you already tried this, I'd try re-running autoconf.
Replying to @dimpase:
> Unless you already tried this, I'd try re-running autoconf.

You mean recreating mpfr's configure by running autoreconf, or something else altogether?
/* The following test on glibc is there mainly for Darwin (Mac OS X), to
obtain a better error message. The real test should have been a test
concerning nested functions in gcc, which are disabled by default on
Darwin; but it is not possible to do that without a configure test. */
# if defined (__cplusplus) || !(__MPFR_GNUC(3,0) && __MPFR_GLIBC(2,0))
# error "Logging not supported (needs gcc >= 3.0 and GNU C Library >= 2.0)."
seems to say that logging needs nested functions. And clang does not do them, as we know... I'd say this is an MPFR bug, no?
Replying to @dimpase:
> /* The following test on glibc is there mainly for Darwin (Mac OS X), to
> obtain a better error message. The real test should have been a test
> concerning nested functions in gcc, which are disabled by default on
> Darwin; but it is not possible to do that without a configure test. */
> # if defined (__cplusplus) || !(__MPFR_GNUC(3,0) && __MPFR_GLIBC(2,0))
> # error "Logging not supported (needs gcc >= 3.0 and GNU C Library >= 2.0)."
>
> seems to say that logging needs nested functions. And clang does not do them, as we know... I'd say this is an MPFR bug, no?

Yes. Using a GNU extension - the word bug can be argued, but it fails to adhere to the standard which you do at your own peril and the cost of portability.
One can of course do a log on gcc and hope that it's identical to what one would get on clang...
> I'd say this is an MPFR bug, no?

It is in fact a clang bug:
https://bugs.llvm.org//show_bug.cgi?id=6378
"Clang doesn't support the GNU nested function extension, sorry. We have no plans to implement it."
Replying to @zimmermann6:
> > I'd say this is an MPFR bug, no?
>
> it is in fact a clang bug:
> https://bugs.llvm.org//show_bug.cgi?id=6378
> "Clang doesn't support the GNU nested function extension, sorry. We have no plans to implement it."

Well, as MPFR makes no claims to adhere to a C standard, you may indeed consider it a feature :-)
For what it's worth, this is the mpfr.log I see on linux/gcc, after stripping the initialisation part and running np.float64(5).__gt__(e); it looks like mpfr is computing exp(1.0) to a certain precision.
> mpfr_exp:IN x[53]=1 rnd=3
> mpfr_const_log2_internal:IN rnd_mode=0
> mpfr_const_log2_internal:ZIV 1st prec=42
> mpfr_div:IN u[42]=2.2496e+21 v[42]=3.24549e+21 rnd=0
> mpfr_div:TIM 0ms
> mpfr_div:OUT q[42]=0.693147 inexact=-1
> mpfr_const_log2_internal:TIM 0ms
> mpfr_const_log2_internal:OUT x[32]=0.693147 inex=1
> mpfr_mul:IN b[32]=0.693147 c[64]=4.61169e+18 rnd=2
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[32]=3.19658e+18 inexact=1
> mpfr_sub_ui:IN x[64]=-4.61169e+18 u=2 rnd=0
> mpfr_sub:IN b[64]=-4.61169e+18 c[64]=2 rnd=0
> mpfr_sub:TIM 0ms
> mpfr_sub:OUT a[64]=-4.61169e+18
> mpfr_sub_ui:TIM 0ms
> mpfr_sub_ui:OUT y[64]=-4.61169e+18 inexact=0
> mpfr_mul:IN b[32]=0.693147 c[64]=-4.61169e+18 rnd=3
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[32]=-3.19658e+18 inexact=-1
> mpfr_exp_2:IN x[53]=1 rnd=3
> mpfr_const_log2_internal:IN rnd_mode=0
> mpfr_const_log2_internal:ZIV 1st prec=74
> mpfr_div:IN u[74]=2.17458e+40 v[74]=3.13725e+40 rnd=0
> mpfr_div:TIM 0ms
> mpfr_div:OUT q[74]=0.693147 inexact=-1
> mpfr_const_log2_internal:TIM 0ms
> mpfr_const_log2_internal:OUT x[64]=0.693147 inex=1
> mpfr_div:IN u[53]=1 v[64]=0.693147 rnd=0
> mpfr_div:TIM 0ms
> mpfr_div:OUT q[64]=1.4427 inexact=1
> mpfr_exp_2.114: d(x)=1.000000000000000000000000000000e+00 n=1
> mpfr_exp_2:ZIV 1st prec=78
> mpfr_exp_2.152: n=1 K=5 l=11 q=78 error_r=2
> mpfr_const_log2_internal:IN rnd_mode=0
> mpfr_const_log2_internal:ZIV 1st prec=90
> mpfr_div:IN u[90]=8.48887e+52 v[90]=1.22469e+53 rnd=0
> mpfr_div:TIM 0ms
> mpfr_div:OUT q[90]=0.693147 inexact=-1
> mpfr_const_log2_internal:TIM 0ms
> mpfr_const_log2_internal:OUT x[80]=0.693147 inex=1
> mpfr_exp_2.169:x[53]=1
> mpfr_exp_2.170:r[80]=0.693147
> mpfr_sub:IN b[53]=1 c[80]=0.693147 rnd=2
> mpfr_sub:TIM 0ms
> mpfr_sub:OUT a[80]=0.306853
> mpfr_exp_2.189:r[78]=0.306853
> mpfr_div_2ui:IN x[78]=0.306853 n=5 rnd=2
> mpfr_div_2ui:TIM 0ms
> mpfr_div_2ui:OUT y[78]=0.00958915 inexact=0
> mpfr_exp_2.202: l=270 q=78 (K+l)*q^2=1.673e+06
> mpfr_exp_2.219: before mult. by 2^n:
> mpfr_exp_2.220:s[80]=1.35914
> mpfr_exp_2.221: err=5 bits
> mpfr_mul_2si:IN x[80]=1.35914 n=1 rnd=3
> mpfr_mul_2si:TIM 0ms
> mpfr_mul_2si:OUT y[53]=2.71828 inexact=-1
> mpfr_exp_2:TIM 3ms
> mpfr_exp_2:OUT y[53]=2.71828 inexact=-1
> mpfr_exp:TIM 3ms
> mpfr_exp:OUT y[53]=2.71828 inexact=-1
> mpfr_exp:IN x[53]=1 rnd=2
> mpfr_mul:IN b[32]=0.693147 c[64]=4.61169e+18 rnd=2
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[32]=3.19658e+18 inexact=1
> mpfr_sub_ui:IN x[64]=-4.61169e+18 u=2 rnd=0
> mpfr_sub:IN b[64]=-4.61169e+18 c[64]=2 rnd=0
> mpfr_sub:TIM 0ms
> mpfr_sub:OUT a[64]=-4.61169e+18
> mpfr_sub_ui:TIM 0ms
> mpfr_sub_ui:OUT y[64]=-4.61169e+18 inexact=0
> mpfr_mul:IN b[32]=0.693147 c[64]=-4.61169e+18 rnd=3
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[32]=-3.19658e+18 inexact=-1
> mpfr_exp_2:IN x[53]=1 rnd=2
> mpfr_div:IN u[53]=1 v[64]=0.693147 rnd=0
> mpfr_div:TIM 0ms
> mpfr_div:OUT q[64]=1.4427 inexact=1
> mpfr_exp_2.114: d(x)=1.000000000000000000000000000000e+00 n=1
> mpfr_exp_2:ZIV 1st prec=78
> mpfr_exp_2.152: n=1 K=5 l=11 q=78 error_r=2
> mpfr_exp_2.169:x[53]=1
> mpfr_exp_2.170:r[80]=0.693147
> mpfr_sub:IN b[53]=1 c[80]=0.693147 rnd=2
> mpfr_sub:TIM 0ms
> mpfr_sub:OUT a[80]=0.306853
> mpfr_exp_2.189:r[78]=0.306853
> mpfr_div_2ui:IN x[78]=0.306853 n=5 rnd=2
> mpfr_div_2ui:TIM 0ms
> mpfr_div_2ui:OUT y[78]=0.00958915 inexact=0
> mpfr_exp_2.202: l=270 q=78 (K+l)*q^2=1.673e+06
> mpfr_exp_2.219: before mult. by 2^n:
> mpfr_exp_2.220:s[80]=1.35914
> mpfr_exp_2.221: err=5 bits
> mpfr_mul_2si:IN x[80]=1.35914 n=1 rnd=2
> mpfr_mul_2si:TIM 0ms
> mpfr_mul_2si:OUT y[53]=2.71828 inexact=1
> mpfr_exp_2:TIM 0ms
> mpfr_exp_2:OUT y[53]=2.71828 inexact=1
> mpfr_exp:TIM 0ms
> mpfr_exp:OUT y[53]=2.71828 inexact=1
> mpfr_mul:IN b[64]=52 c[77]=0.25 rnd=2
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[64]=13 inexact=0
> mpfr_mul:IN b[64]=52 c[77]=0.25 rnd=2
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[64]=13 inexact=0
> mpfr_add:IN b[53]=-5 c[53]=2.71828 rnd=3
> mpfr_add:TIM 0ms
> mpfr_add:OUT a[53]=-2.28172
> mpfr_add:IN b[53]=-5 c[53]=2.71828 rnd=2
> mpfr_add:TIM 0ms
> mpfr_add:OUT a[53]=-2.28172
> mpfr_mul:IN b[64]=52 c[77]=0.25 rnd=2
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[64]=13 inexact=0
> mpfr_mul:IN b[64]=52 c[77]=0.25 rnd=2
> mpfr_mul:TIM 0ms
> mpfr_mul:OUT a[64]=13 inexact=0
> mpfr_exp_2: Ziv failed 0.00% (0 bad cases / 2 calls)
> mpfr_const_log2_internal: Ziv failed 0.00% (0 bad cases / 3 calls)
Indeed, MPFR first computes exp(1) with rounding towards -infinity, then exp(1) again with rounding towards +infinity.
Then it computes 52*0.25 twice with rounding towards +infinity (I wonder why the same value is computed twice).
Then it adds -5 and 2.71828 with rounding towards -infinity and then +infinity; I guess this is to compare intervals for 5 and exp(1).
Again it computes 52*0.25 twice with the same rounding, I don't know why.
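The interval reading of the log can be illustrated with a small pure-Python sketch. It is not Sage's or MPFR's code: it uses exact rationals instead of directed rounding, and the helper names `enclose_e` and `compare_with_e` are made up for illustration. The idea is the same, though: enclose exp(1) between a lower and an upper bound, form the interval for e - value (the analogue of the two mpfr_add calls with b = -5), and decide the comparison from the signs of the endpoints.

```python
from fractions import Fraction
from math import factorial


def enclose_e(terms=20):
    """Return exact rational bounds (lo, hi) with lo < e < hi."""
    # Partial sum of e = sum_{k>=0} 1/k!; all terms are positive,
    # so the partial sum is a strict lower bound.
    lo = sum(Fraction(1, factorial(k)) for k in range(terms + 1))
    # The tail sum_{k>terms} 1/k! is strictly below 2/(terms+1)!.
    hi = lo + Fraction(2, factorial(terms + 1))
    return lo, hi


def compare_with_e(value, terms=20):
    """Return 1 if value > e, -1 if value < e, 0 if undecided."""
    lo, hi = enclose_e(terms)
    # Interval [u, v] enclosing e - value.
    u, v = lo - value, hi - value
    if u > 0:   # whole interval positive: e > value
        return -1
    if v < 0:   # whole interval negative: e < value
        return 1
    return 0    # interval contains zero: more precision would be needed


print(compare_with_e(Fraction(5)))      # 5 compared with e
print(compare_with_e(Fraction(3, 2)))   # 1.5 compared with e
```

With the default number of terms the two calls give 1 and -1, i.e. 5 > e and 1.5 < e.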
Replying to @dimpase:
> Replying to @zimmermann6:
> > > I'd say this is an MPFR bug, no?
> >
> > it is in fact a clang bug:
> > https://bugs.llvm.org//show_bug.cgi?id=6378
> > "Clang doesn't support the GNU nested function extension, sorry. We have no plans to implement it."
>
> Well, as MPFR makes no claims to adhere to a C standard, you may indeed consider it a feature :-)

As a person who has worked, and is working, on rather exotic systems and does porting, I consider it sad. It has been an extension for rather a long time; if it were considered a useful or desirable feature, it would be in the standard or planned for the next standard. That feature doesn't seem to have a big uptake either: we have had only one package that absolutely required porting in Sage so far.
Fortunately you don't use it in the functional part of mpfr, but it is annoying.
My opinion is that, while you make no claim to be standard compliant, you should aim towards it and admit this as an issue; it does not have to be a show-stopper bug.
> My opinion is that while you make no claim to be standard compliant you should aim towards it and admit it as an issue, it does not have to be a show stopper bug.

Feel free to report to the clang developers. It makes no sense to implement in MPFR a feature that should be implemented by the compiler.
Back to the issue, does np.float64(5).__gt__(e) give the warning with clang?
Paul
Anyway, to trace the function mpfr_exp, for example, you can apply the following patch (against the development version, but it should apply to 3.1.5 as well):
--- src/exp.c (revision 11456)
+++ src/exp.c (working copy)
@@ -42,10 +42,15 @@
int inexact;
MPFR_SAVE_EXPO_DECL (expo);
+#if 0
MPFR_LOG_FUNC
(("x[%Pu]=%.*Rg rnd=%d", mpfr_get_prec (x), mpfr_log_prec, x, rnd_mode),
("y[%Pu]=%.*Rg inexact=%d",
mpfr_get_prec (y), mpfr_log_prec, y, inexact));
+#else
+ mpfr_printf ("x[%Pu]=%.*Rg rnd=%d\n", mpfr_get_prec (x), 6, x, rnd_mode);
+ fflush (stdout);
+#endif
if (MPFR_UNLIKELY( MPFR_IS_SINGULAR(x) ))
{
@@ -185,5 +190,8 @@
}
}
+ mpfr_printf ("y[%Pu]=%.*Rg inexact=%d\n", mpfr_get_prec (y), 6, y, inexact);
+ fflush (stdout);
+
return mpfr_check_range (y, inexact, rnd_mode);
}
This will make it possible to see whether the warning occurs inside the mpfr_exp call (you might want to replace mpfr_printf(...) by mpfr_fprintf (stderr, ...) if the warning is printed to stderr, and change fflush(stdout) into fflush(stderr)).
Replying to @zimmermann6:
> Back to the issue, does np.float64(5).__gt__(e) give the warning with clang?

Yes, at least on OS X:
sage: np.float16(5).__gt__(e)
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
#!/usr/bin/env python
True
sage: np.float32(5).__gt__(e)
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
#!/usr/bin/env python
True
sage: np.float64(5).__gt__(e)
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
#!/usr/bin/env python
True
sage: np.float128(5).__gt__(e)
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
#!/usr/bin/env python
True
Replying to @jhpalmieri:
> Replying to @zimmermann6:
> > Back to the issue, does np.float64(5).__gt__(e) give the warning with clang?
>
> Yes, at least on OS X:
>
> sage: np.float16(5).__gt__(e)
> /Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
> #!/usr/bin/env python
> True
> sage: np.float32(5).__gt__(e)
> /Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
> #!/usr/bin/env python
> True
> sage: np.float64(5).__gt__(e)
> /Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
> #!/usr/bin/env python
> True
> sage: np.float128(5).__gt__(e)
> /Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
> #!/usr/bin/env python
> True

Same on linux.
With the patch from comment:69:
sage: import numpy as np
sage: np.float128(5).__gt__(e)
x[53]=1 rnd=3
y[53]=2.71828 inexact=-1
x[53]=1 rnd=2
y[53]=2.71828 inexact=1
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta5/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
#!/usr/bin/env python
True
Any suggestions for other changes along the lines of comment:69 to help track down the problem?
I suppose anything short of actually setting up a watch on FPU bits will not help much. (sorry for slow response - I'm in single-parenting mode for a week :-))
From comment [comment:72] it seems the warning occurs after the two calls to mpfr_exp.
Here is another patch to see whether it occurs in mpfr_mul or mpfr_add:
Index: src/add.c
===================================================================
--- src/add.c (revision 11456)
+++ src/add.c (working copy)
@@ -25,11 +25,16 @@
MPFR_HOT_FUNCTION_ATTR int
mpfr_add (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
{
+#if 0
MPFR_LOG_FUNC
(("b[%Pu]=%.*Rg c[%Pu]=%.*Rg rnd=%d",
mpfr_get_prec (b), mpfr_log_prec, b,
mpfr_get_prec (c), mpfr_log_prec, c, rnd_mode),
("a[%Pu]=%.*Rg", mpfr_get_prec (a), mpfr_log_prec, a));
+#else
+ printf ("enter mpfr_add\n");
+ fflush (stdout);
+#endif
if (MPFR_ARE_SINGULAR_OR_UBF (b, c))
{
@@ -100,23 +105,28 @@
MPFR_ASSERTD (MPFR_IS_PURE_FP (b));
MPFR_ASSERTD (MPFR_IS_PURE_FP (c));
+ int ret;
if (MPFR_UNLIKELY(MPFR_SIGN(b) != MPFR_SIGN(c)))
{ /* signs differ, it is a subtraction */
if (MPFR_LIKELY(MPFR_PREC(a) == MPFR_PREC(b)
&& MPFR_PREC(b) == MPFR_PREC(c)))
- return mpfr_sub1sp(a, b, c, rnd_mode);
+ ret = mpfr_sub1sp(a, b, c, rnd_mode);
else
- return mpfr_sub1(a, b, c, rnd_mode);
+ ret = mpfr_sub1(a, b, c, rnd_mode);
}
else
{ /* signs are equal, it's an addition */
if (MPFR_LIKELY(MPFR_PREC(a) == MPFR_PREC(b)
&& MPFR_PREC(b) == MPFR_PREC(c)))
- return mpfr_add1sp(a, b, c, rnd_mode);
+ ret = mpfr_add1sp(a, b, c, rnd_mode);
else
if (MPFR_GET_EXP(b) < MPFR_GET_EXP(c))
- return mpfr_add1(a, c, b, rnd_mode);
+ ret = mpfr_add1(a, c, b, rnd_mode);
else
- return mpfr_add1(a, b, c, rnd_mode);
+ ret = mpfr_add1(a, b, c, rnd_mode);
}
+
+ printf ("exit mpfr_add\n");
+ fflush (stdout);
+ return ret;
}
Index: src/mul.c
===================================================================
--- src/mul.c (revision 11456)
+++ src/mul.c (working copy)
@@ -688,6 +688,7 @@
mp_size_t bn, cn, tn, k, threshold;
MPFR_TMP_DECL (marker);
+#if 0
MPFR_LOG_FUNC
(("b[%Pu]=%.*Rg c[%Pu]=%.*Rg rnd=%d",
mpfr_get_prec (b), mpfr_log_prec, b,
@@ -694,6 +695,10 @@
mpfr_get_prec (c), mpfr_log_prec, c, rnd_mode),
("a[%Pu]=%.*Rg inexact=%d",
mpfr_get_prec (a), mpfr_log_prec, a, inexact));
+#else
+ printf ("enter mpfr_mul\n");
+ fflush (stdout);
+#endif
/* deal with special cases */
if (MPFR_ARE_SINGULAR (b, c))
@@ -1030,5 +1035,7 @@
rnd_mode = MPFR_RNDZ;
return mpfr_underflow (a, rnd_mode, sign);
}
+ printf ("exit mpfr_mul\n");
+ fflush (stdout);
MPFR_RET (inexact);
}
Side question: what routine of MPFR (if any) does np.float128(5).__gt__ call?
Replying to @zimmermann6:
> from comment [comment:72] it seems the warning occurs after the two calls to mpfr_exp.

This is correct; no interrupt mechanism is set up that would make the warning print immediately. The warning is printed after numpy completes the task of computing the value of np.float128(5).__gt__(e), before it returns the result.

> Side question: what routine of MPFR (if any) does np.float128(5).__gt__ call?

Numpy people told us some details here.
They say that behind the curtains it will try calling something like e.__lt__, assuming e is the argument (and all this happens within a compiled module written in (generated) C, making it hard to debug easily).
So we have two Python-based computer algebra systems not talking to each other too well...
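The fallback described by the numpy developers mirrors what plain Python does for rich comparisons: when the left operand's `__gt__` declines by returning `NotImplemented`, the interpreter tries the right operand's reflected `__lt__`. Here is a minimal sketch of that protocol; the classes `NumpyLikeScalar` and `SageLikeConstant` are hypothetical stand-ins, not numpy or Sage code. In the thread numpy performs the equivalent retry inside its compiled code, which is why the direct call np.float64(5).__gt__(e) there returns True rather than NotImplemented.

```python
class NumpyLikeScalar:
    """Hypothetical stand-in for a numpy scalar."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        # Handle only our own type and decline everything else.
        if isinstance(other, NumpyLikeScalar):
            return self.value > other.value
        return NotImplemented


class SageLikeConstant:
    """Hypothetical stand-in for a symbolic constant such as e."""

    def __init__(self, approx, calls):
        self.approx = approx
        self.calls = calls  # records which methods Python invoked

    def __lt__(self, other):
        self.calls.append("__lt__")
        return self.approx < other.value


calls = []
five = NumpyLikeScalar(5.0)
e_like = SageLikeConstant(2.718281828459045, calls)

direct = five.__gt__(e_like)  # the left operand declines
result = five > e_like        # Python falls back to e_like.__lt__(five)
print(direct, result, calls)  # NotImplemented True ['__lt__']
```

The point for debugging is that the code doing the real work lives in the right-hand operand's class, so that is where instrumentation has to go.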
My guess is the following:
(1) first, intervals of MPFR values are computed that enclose 5 and exp(1)
(2) then those intervals are converted into the np.float128 type
(3) then the comparison is performed
I guess the warning occurs because a NaN was generated in step (2). It might be inside the mpfr_get_float128 function. Here is another patch to check:
Index: src/get_float128.c
===================================================================
--- src/get_float128.c (revision 11456)
+++ src/get_float128.c (working copy)
@@ -30,8 +30,15 @@
mpfr_get_float128 (mpfr_srcptr x, mpfr_rnd_t rnd_mode)
{
+ printf ("enter mpfr_get_float128\n");
+ fflush (stdout);
+
if (MPFR_UNLIKELY (MPFR_IS_SINGULAR (x)))
- return (__float128) mpfr_get_d (x, rnd_mode);
+ {
+ printf ("exit mpfr_get_float128: MPFR_IS_SINGULAR(x)\n");
+ fflush (stdout);
+ return (__float128) mpfr_get_d (x, rnd_mode);
+ }
else /* now x is a normal non-zero number */
{
__float128 r; /* result */
@@ -97,6 +104,8 @@
}
if (sign < 0)
r = -r;
+ printf ("exit mpfr_get_float128: normal case\n");
+ fflush (stdout);
return r;
}
}
I don't see get_float128.c in version 3.1.5. I've tried to use similar patches in get_ld.c, get_d64.c, and get_float.c, but none of the relevant functions print anything when I run np.float128(5).__gt__(e) (or the same with np.float64, etc.).
Replying to @zimmermann6:
> my guess is the following:
> (1) first intervals of MPFR values are computed that enclose 5 and exp(1)
> (2) then those intervals are converted into the np.float128 type

IMHO it is different (and isn't that what the complete(?) trace in comment 65 above shows?): numpy has no way to convert MPFR numbers into numpy numbers without asking Sage to do this. And it does not even know that Sage can do it. So it all happens on the level of Python data: numpy knows that after getting "not implemented" from np.float128(5).__gt__(e) it may try e.__lt__(np.float128(5)).
And the latter invokes comparison in Sage, done with MPFR numbers.

> (3) then the comparison is performed

Isn't the actual comparison performed on MPFR numbers rather than on numpy numbers?
As above
> mpfr_add:IN b[53]=-5 c[53]=2.71828 rnd=3
> mpfr_add:TIM 0ms
> mpfr_add:OUT a[53]=-2.28172
> mpfr_add:IN b[53]=-5 c[53]=2.71828 rnd=2
> mpfr_add:TIM 0ms

> Isn't the actual comparison performed on MPFR numbers rather than on numpy numbers?

Yes, it might be, since after the two mpfr_add calls we should get an interval [u,v] in which exp(1)-5 lies. Then I guess Sage should check whether u > 0 or v < 0. But this is possible via several MPFR functions (which are not logged through --enable-logging). It could be mpfr_cmp_ui (u, 0), or mpfr_cmp (u, zero) since zero is predefined, or mpfr_sgn(u). One should add logging in those functions to see which one is called.
As far as I can tell, the last thing called (or at least the last thing called in which I've added logging) before numpy reports an error is mpfr_cmp3.

> As far as I can tell, the last thing called (or at least the last thing called in which I've added logging) before numpy reports an error is mpfr_cmp3.

Could you please test with the following patch (against mpfr-3.1.5)?
--- cmp.c 2016-09-27 09:58:15.000000000 +0200
+++ /tmp/cmp.c 2017-05-12 08:32:54.914688069 +0200
@@ -35,6 +35,11 @@
mp_size_t bn, cn;
mp_limb_t *bp, *cp;
+ printf ("enter mpfr_cmp3\n");
+ printf ("b="); mpfr_dump (b);
+ printf ("c="); mpfr_dump (c);
+ printf ("s=%d\n", s);
+
s = MPFR_MULT_SIGN( s , MPFR_SIGN(c) );
if (MPFR_ARE_SINGULAR(b, c))
@@ -42,34 +47,59 @@
if (MPFR_IS_NAN (b) || MPFR_IS_NAN (c))
{
MPFR_SET_ERANGE ();
+ printf ("exit mpfr_cmp3: NaN case\n");
return 0;
}
else if (MPFR_IS_INF(b))
{
if (MPFR_IS_INF(c) && s == MPFR_SIGN(b) )
- return 0;
+ {
+ printf ("exit mpfr_cmp3: Inf1 case\n");
+ return 0;
+ }
else
- return MPFR_SIGN(b);
+ {
+ printf ("exit mpfr_cmp3: Inf2 case\n");
+ return MPFR_SIGN(b);
+ }
}
else if (MPFR_IS_INF(c))
- return -s;
+ {
+ printf ("exit mpfr_cmp3: Inf3 case\n");
+ return -s;
+ }
else if (MPFR_IS_ZERO(b))
- return MPFR_IS_ZERO(c) ? 0 : -s;
+ {
+ printf ("exit mpfr_cmp3: zero1 case\n");
+ return MPFR_IS_ZERO(c) ? 0 : -s;
+ }
else /* necessarily c=0 */
- return MPFR_SIGN(b);
+ {
+ return MPFR_SIGN(b);
+ printf ("exit mpfr_cmp3: zero2 case\n");
+ }
}
/* b and c are real numbers */
if (s != MPFR_SIGN(b))
- return MPFR_SIGN(b);
+ {
+ printf ("exit mpfr_cmp3: s != MPFR_SIGN(b)\n");
+ return MPFR_SIGN(b);
+ }
/* now signs are equal */
be = MPFR_GET_EXP (b);
ce = MPFR_GET_EXP (c);
if (be > ce)
- return s;
+ {
+ printf ("exit mpfr_cmp3: be > ce\n");
+ return s;
+ }
if (be < ce)
- return -s;
+ {
+ printf ("exit mpfr_cmp3: be < ce\n");
+ return -s;
+ }
/* both signs and exponents are equal */
@@ -82,18 +112,31 @@
for ( ; bn >= 0 && cn >= 0; bn--, cn--)
{
if (bp[bn] > cp[cn])
- return s;
+ {
+ printf ("exit mpfr_cmp3: bp[bn] > cp[cn]\n");
+ return s;
+ }
if (bp[bn] < cp[cn])
- return -s;
+ {
+ printf ("exit mpfr_cmp3: bp[bn] < cp[cn]\n");
+ return -s;
+ }
}
for ( ; bn >= 0; bn--)
if (bp[bn])
- return s;
+ {
+ printf ("exit mpfr_cmp3: bp[bn] > 0\n");
+ return s;
+ }
for ( ; cn >= 0; cn--)
if (cp[cn])
- return -s;
+ {
+ printf ("exit mpfr_cmp3: cp[cn] > 0\n");
+ return -s;
+ }
- return 0;
+ printf ("exit mpfr_cmp3: equal case\n");
+ return 0;
}
#undef mpfr_cmp
I get this:
enter mpfr_cmp3
b=-0
c=0.10011011111100001010100010110001010001010111011010010E1
s=1
exit mpfr_cmp3: zero1 case
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta4/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in less
#!/usr/bin/env python
I see no "less" comparison in the mpfr_cmp3 branch corresponding to the zero1 case. The invalid value encountered in less warning might correspond to a comparison with NaN, but neither b nor c is NaN here.
What is the Python call that triggers this?
It used to be np.float64(5).__gt__(e), giving the invalid value in greater warning. Does the same call with patched MPFR give the invalid value in less warning?
I get warnings with both __gt__ and __lt__; the one in comment:83 was from __lt__. When I do np.float64('1.5').__gt__(e), the various logging messages end in this:
enter mpfr_cmp3
b=0.10011011111100001010100010110001010001010111011010100E1
c=0
s=1
enter mpfr_cmp3
b=-0
c=0.10011011111100001010100010110001010001010111011010010E1
s=1
exit mpfr_cmp3: zero1 case
/Users/jpalmier/Desktop/Sage_stuff/sage_builds/TESTING/sage-8.0.beta4/src/bin/sage-ipython:1: RuntimeWarning: invalid value encountered in greater
#!/usr/bin/env python
Replying to @zimmermann6:
> I see no "less" comparison in the mpfr_cmp3 branch corresponding to the zero1 case. The invalid value encountered in less warning might correspond to a comparison with NaN, but neither b nor c are NaN here.

Is there a "greater" comparison? This is the one that would correspond to __lt__ in the original call, as numpy people tell us.

> is there a "greater" comparison?

There is no "greater" comparison either. The only comparisons are between exponents and words of the significand, but no such comparison occurs when one of the operands is zero.
And anyway, there is no double-precision NaN in that function.
Here is a somewhat less involved way to trigger this, one that does not run through the numpy evaluation loop:
import numpy as np
from ctypes import cdll
from ctypes.util import find_library
libm = cdll.LoadLibrary(find_library('m'))
print libm.fetestexcept(int(0x01)) # checks if FE_INVALID is set
bool(e.__lt__(np.float32('1.5')))
print libm.fetestexcept(int(0x01))
Running this on Linux/gcc produces
0
False
0
while on FreeBSD/clang I get
0
False
1
After I've found this, I decided to check whether merely importing numpy does something to the FPU flags on FreeBSD, and in fact it does! Namely, the output of
from ctypes import cdll
from ctypes.util import find_library
libm = cdll.LoadLibrary(find_library('m'))
print libm.fetestexcept(int(0x01))
import numpy
print libm.fetestexcept(int(0x01))
is
0
0
on Linux/gcc, and
1
0
on FreeBSD/clang! And in fact one can see that the FE_INVALID bit is flipped by
bool(e.__lt__(float('1.5')))
just as well:
sage: import numpy
....: from ctypes import cdll
....: from ctypes.util import find_library
....: libm = cdll.LoadLibrary(find_library('m'))
....: print libm.fetestexcept(int(0x01))
....: bool(e.__lt__(float('1.5')))
....: print libm.fetestexcept(int(0x01))
....:
0
False
1
on FreeBSD/clang (but the last 1 becomes 0 on Linux/gcc).
That is, we also need to debug Sage for a place that flips FE_INVALID!
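Since the flag may already be set before the operation of interest runs, it can help to clear it first and test right afterwards. The following is a minimal sketch under several assumptions: a POSIX system whose C math library exports feclearexcept and fetestexcept, FE_INVALID equal to 0x01 as in the snippets above (true on x86 and AArch64, possibly different elsewhere), and a hypothetical helper name `raises_invalid`. It uses Python 3 syntax and plain floats rather than Sage objects.

```python
import ctypes
from ctypes.util import find_library

# Assumption: FE_INVALID == 0x01, as in the snippets above.
FE_INVALID = 0x01

# If find_library returns None, CDLL(None) exposes the symbols already
# loaded into the running process (POSIX only).
libm = ctypes.CDLL(find_library("m"))
for func in (libm.fetestexcept, libm.feclearexcept):
    func.argtypes = [ctypes.c_int]
    func.restype = ctypes.c_int


def raises_invalid(operation):
    """Run operation() and report whether it left FE_INVALID set."""
    libm.feclearexcept(FE_INVALID)
    operation()
    return bool(libm.fetestexcept(FE_INVALID))


inf = float("inf")
print(raises_invalid(lambda: 1.5 + 2.5))  # an exact addition
print(raises_invalid(lambda: inf - inf))  # inf - inf yields NaN
```

On the platforms named in the assumptions one would expect False for the first call and True for the second. Inside a Sage session the lambda could wrap bool(e.__lt__(float('1.5'))) instead.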
$ ./sage --python
Python 2.7.13 (default, May 14 2017, 23:48:25)
[GCC 4.2.1 Compatible Clang 4.0.0 ] on freebsd11
Type "help", "copyright", "credits" or "license" for more information.
>>> from ctypes import cdll
>>> from ctypes.util import find_library
>>> libm = cdll.LoadLibrary(find_library('m'))
>>> print libm.fetestexcept(int(0x01))
0
>>> from sage.all import *
>>> print libm.fetestexcept(int(0x01))
1
I added a bunch of print ("TAG: {}".format(libm.fetestexcept(int(0x01)))) statements with various tags. The result changes from 0 to 1 in the file sage/libs/pynac/pynac.pyx, at the line init_pynac_I(). Within that function, it changes at the line
K = QuadraticField(-1, 'I', embedding=CC.gen(), latex_name='i')
Quadratic fields are constructed using UniqueFactory, and the result changes from 0 to 1 in the try/except block
cache_key = key
print ("020: {}".format(libm.fetestexcept(int(0x01))))
try:
try:
return self._cache[version, cache_key]
except TypeError: # key is unhashable
print ("030: {}".format(libm.fetestexcept(int(0x01))))
cache_key = _cache_key(cache_key)
return self._cache[version, cache_key]
except KeyError:
print ("040: {}".format(libm.fetestexcept(int(0x01))))
pass
in the get_object method for UniqueFactory in sage/structure/factory.pyx. With the print statements as indicated, I see
020: 0
040: 1
050: 1
So is it something to do with the cache? The cache is defined by
self._cache = sage.misc.weak_dict.WeakValueDictionary()
So is something going on with Sage's weak dictionaries?
There is also the possibility that I'm misinterpreting everything and the problem is somewhere else completely.
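The tagged-print approach can be packaged so that the probe reports the first tag after which the flag is set, instead of the output being read by eye. This is only a sketch: the helper name `first_step_setting_invalid` and the sample steps are hypothetical, FE_INVALID is again assumed to be 0x01, and plain floats stand in for the Sage calls being bisected.

```python
import ctypes
from ctypes.util import find_library

FE_INVALID = 0x01  # assumption: value on x86 and AArch64

libm = ctypes.CDLL(find_library("m"))
libm.fetestexcept.argtypes = [ctypes.c_int]
libm.feclearexcept.argtypes = [ctypes.c_int]


def first_step_setting_invalid(steps):
    """Run (tag, callable) pairs in order and return the tag of the first
    step after which FE_INVALID is set, or None if it never is."""
    libm.feclearexcept(FE_INVALID)
    for tag, step in steps:
        step()
        if libm.fetestexcept(FE_INVALID):
            return tag
    return None


inf = float("inf")
steps = [
    ("010", lambda: 2.0 * 3.0),
    ("020", lambda: {}.get("missing")),
    ("030", lambda: inf * 0.0),  # made-up culprit: inf * 0 is NaN
    ("040", lambda: 1.0 + 1.0),
]
print(first_step_setting_invalid(steps))
```

Splitting a suspicious step into finer-grained callables and rerunning narrows the location further, much like moving the print statements by hand.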
I don't get the need of nested try/except blocks in that fragment of the code, is it just some leftover?
I suppose it should be
try:
blah
except TypeError:
blah
except KeyError:
pass
?
Unless the last except is also there to catch a KeyError in the call to return self._cache[version, cache_key] within the first except clause.
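The difference between the two layouts can be checked with a small stand-alone example. It is a sketch, not the Sage code: a plain dict replaces the weak-value cache, and tuple(key) is a stand-in for _cache_key. A KeyError raised inside an except TypeError handler is not caught by a sibling except KeyError clause of the same try statement, but it is caught by an enclosing try.

```python
def lookup_nested(cache, key):
    """Same shape as the get_object fragment: nested try blocks."""
    try:
        try:
            return cache[key]
        except TypeError:             # key is unhashable
            return cache[tuple(key)]  # may itself raise KeyError
    except KeyError:
        return None                   # catches KeyError from both lookups


def lookup_flat(cache, key):
    """Flat variant: sibling except clauses on a single try."""
    try:
        return cache[key]
    except TypeError:
        return cache[tuple(key)]      # a KeyError here escapes
    except KeyError:
        return None


cache = {}
print(lookup_nested(cache, [1, 2]))   # None
try:
    lookup_flat(cache, [1, 2])
except KeyError as exc:
    print("flat version leaked KeyError:", exc)
```

So the nesting is not a leftover if the second lookup is meant to be covered by the KeyError handler as well.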
What happens if you remove the except KeyError part?
Probably one should try building fpectl Python module (not in Sage Python) and use it to locate where flags are raised during the import.
Replying to @dimpase:
Probably one should try building fpectl Python module (not in Sage Python) and use it to locate where flags are raised during the import.
good idea. I'm curious to see where a NaN is generated.
Seen with clang+OS X and freeBSD+clang
All these doctests fail because an unexpected warning is emitted:
More specifically, the warning is emitted by the call
sage: x=polygen(RR)
sage: numpy.float32('1.5') * x
seen on freeBSD+clang, OS X+clang and linux+clang.
Similarly, the warning is emitted in
sage: numpy.float64(5)>e
or >= instead of >, or pi instead of e. Note that pi.n() and e.n() are of type RR, so again this points in the direction of mpfr.
Depends on #22582
Upstream: Fixed upstream, but not in a stable release.
CC: @dimpase @jhpalmieri
Component: porting
Author: François Bissey, Dima Pasechnik, Paul Zimmermann
Branch/Commit: fd29778
Reviewer: John Palmieri, Dima Pasechnik
Issue created by migration from https://trac.sagemath.org/ticket/22799