zimmermann6 closed this issue 11 years ago
Tracing things back, the conversion eventually gets done by mpq_get_d(self.value), since Rational is just a wrapper around an mpq. This is where the weirdness seems to be.
ok, then the explanation is that mpq_get_d (according to the GMP manual) rounds towards zero. One could call mpz_get_d(numerator)/mpz_get_d(denominator), but then for large numerator and denominator one would get three roundings, instead of only one when calling mpq_get_d.
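To see the difference concretely, here is a plain-Python sketch (it assumes Python 3.9+ for math.nextafter, and uses the fact that the double nearest to 1/10 lies just above the exact value, so rounding towards zero picks the double one ulp below it):
import math

nearest = 0.1                          # a Python float literal is rounded to nearest
truncated = math.nextafter(0.1, 0.0)   # the double just below: what rounding towards zero gives for 1/10
print(nearest * 10 == 1.0)     # expected: True
print(truncated * 10 == 1.0)   # expected: False
print(truncated * 10 - 1.0)    # expected: about -1.11e-16, the value reported for RDF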
I propose to close this ticket.
Paul
So it's okay that
sage: RDF(1/10)*10 == RDF(1)
False
sage: RDF(1/10)*10 - RDF(1)
-1.11022302463e-16
sage: RR(1/10)*10 == RR(1)
True
sage: RR(1/10)*10 - RR(1)
0.000000000000000
as Thierry pointed out here? Just asking. Or is it RR that is behaving sub-optimally in this case? I assume that three roundings is ordinarily worse than one - my apologies for asking dumb questions.
RR is not the same as RDF in regards to how rounding is done, so I would say yes, it is okay. I'd suspect if you change the rounding mode, you should get a similar result, but rounding and computations in RDF are tied to your hardware (up to a certain precision), since I believe they are done in your native machine double type, in particular in the ALU (arithmetic logic unit) of whatever CPU you're using. I should state for the record I'm not 100% certain of this.
The reason why RR works uniformly is, as was noted in the Ask Sage question, that it is emulating a CPU in a program (one can think of it as constant hardware).
However, instead of closing this ticket, I propose that someone write a tutorial and/or improve the documentation and/or the Sage basics. I can do it if need be.
Whoops, didn't mean to change the milestone.
As Paul explained, the rounding is the same for RR and RDF: RDF rounds its computations to the nearest.
Actually, the problem is somewhere else: the conversion from QQ to RDF rounds towards zero, which is not consistent with the general behaviour of RDF.
IMHO, this conversion from QQ to RDF should be fixed, if possible.
Ah I see. I misread/misinterpreted Paul's explanation. I also agree that this should be fixed for consistency:
sage: RDF(RDF(1)/10) - RDF(1)/RDF(10)
0.0
sage: RDF(RR(1)/10) - RDF(1)/RDF(10)
0.0
sage: RDF(1/RDF(10)) - RDF(1)/RDF(10)
0.0
sage: RDF(1/float(10)) - RDF(1)/RDF(10)
0.0
Karl-Dieter,
Replying to @kcrisman:
So it's okay that
sage: RDF(1/10)*10 == RDF(1)
False
sage: RDF(1/10)*10 - RDF(1)
-1.11022302463e-16
sage: RR(1/10)*10 == RR(1)
True
sage: RR(1/10)*10 - RR(1)
0.000000000000000
as Thierry pointed out here? Just asking. Or is it RR that is behaving sub-optimally in this case? I assume that three roundings is ordinarily worse than one - my apologies for asking dumb questions.
what is annoying is that RDF and RR give different results. The fact that RR gives 0.0 in that case is not that important, for example:
sage: RR(1/49)*49-RR(1)
-1.11022302462516e-16
Paul
IMHO, this conversion from QQ to RDF should be fixed, if possible.
it is possible: first call the MPFR function mpfr_set_q on the rational fraction (which is what RR(p/q) does), then convert back to double with mpfr_get_d, but this might be less efficient than the current code, since it would convert from QQ to RR, then from RR to RDF.
Converting separately the numerator and the denominator to RDF, then dividing in RDF, does not work, due to double rounding. Consider for example the fraction p/q where p=2403806706169061971 and q=983883817941434958. If you first round p and q to the nearest doubles (as RR does below), then you get 2403806706169061888/983883817941435008, which is rounded to nearest to 5501555564164225/2251799813685248, whereas the direct rounding of p/q to nearest gives 2750777782082113/1125899906842624:
sage: p=2403806706169061971; q=983883817941434958; pp=RR(p); qq=RR(q)
sage: (pp/qq).exact_rational()
5501555564164225/2251799813685248
sage: RR(p/q).exact_rational()
2750777782082113/1125899906842624
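The same one-versus-two-roundings effect can be checked outside Sage with plain Python (a sketch; the expected outputs rely on CPython rounding int-to-float conversion and integer true division to nearest):
from fractions import Fraction

p = 2403806706169061971
q = 983883817941434958

two_roundings = float(p) / float(q)   # p and q each rounded to a double, then divided
one_rounding = p / q                  # a single correctly rounded division

print(Fraction(two_roundings))        # expected: 5501555564164225/2251799813685248
print(Fraction(one_rounding))         # expected: 2750777782082113/1125899906842624
print(two_roundings == one_rounding)  # expected: False (they differ by one ulp)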
Paul
RDF is for when you want your computations to be fast, at the expense of a little bit of accuracy/less control of rounding/platform dependence. As such, I think different behavior than RR is completely acceptable. It would be nice if mpq_get_d rounded to nearest, but until someone implements that in a manner that's at least comparable in speed to mpq_get_d, I think the current behavior is more in line with the philosophy of RDF than making things slow to get the last bit correct.
Robert, would the following be OK as far as speed is concerned?
Let p/q be the fraction to be converted to RDF; I assume p, q > 0.
1) multiply p or q by a power of 2 into pp and qq so that both have the same number of bits (using mpz_sizeinbase and mpz_mul_2exp from GMP)
2) if pp < qq, multiply pp by 2^54, otherwise multiply pp by 2^53, using mpz_mul_2exp
3) now 2^53 <= pp/qq < 2^54; compute (using mpz_tdiv_qr) the quotient r = trunc(pp/qq) and the remainder s; we know that r has exactly 54 bits
4) if r is even, return r*2^k where k is the normalizing constant (exact, since r then fits on 53 significant bits)
5) if r is odd and s is not zero, return (r+1)*2^k [r+1 is even, and exact on 53 significant bits, even in the case where r+1 = 2^54]
5a) if r is odd and s = 0, return (r-1)*2^k if the bit of weight 1 of r is 0, otherwise (r+1)*2^k
I guess mpq_get_d does basically steps 1-4, thus it should be comparable in speed.
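For reference, here are the same steps written out with Python integers (a rough sketch; the helper name is just for illustration, it assumes p, q > 0, ignores overflow and underflow, and is not the code attached to this ticket):
import math

def rat_to_double_nearest(p, q):
    """Round p/q (p, q positive integers) to the nearest double, ties to even."""
    sa, sb = p.bit_length(), q.bit_length()
    # step 1: give both operands the same bit length, so that 1/2 < pp/qq < 2
    if sa >= sb:
        pp, qq = p, q << (sa - sb)
    else:
        pp, qq = p << (sb - sa), q
    # step 2: scale pp so that 2^53 <= pp/qq < 2^54, remembering the exponent
    if pp < qq:
        pp <<= 54
        e = (sa - sb) - 54
    else:
        pp <<= 53
        e = (sa - sb) - 53
    # step 3: r = trunc(pp/qq) has exactly 54 bits, s is the remainder
    r, s = divmod(pp, qq)
    # steps 4, 5, 5a: fold the 54th bit into the rounding
    if r & 1 == 0:
        pass                      # r even: already fits on 53 significant bits
    elif s != 0:
        r += 1                    # tail beyond 53 bits exceeds half an ulp: round up
    else:
        r += 1 if r & 2 else -1   # exact tie: round to even
    return math.ldexp(r, e)       # exact scaling: r is even (or 2^54) after the adjustment

# expected: rat_to_double_nearest(1, 10) == 0.1 and rat_to_double_nearest(1, 10) * 10 == 1.0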
Paul
here is some tentative code which converts from QQ to RDF with rounding to nearest. In the usual case where both the numerator and the denominator are less than 2^53 in absolute value, it is about twice as fast as mpq_get_d.
double
mpq_get_d_nearest (mpq_t q)
{
  mpz_ptr a = mpq_numref (q);
  mpz_ptr b = mpq_denref (q);
  size_t sa = mpz_sizeinbase (a, 2);
  size_t sb = mpz_sizeinbase (b, 2);
  size_t na, nb;
  mpz_t aa, bb;
  double d;

  /* easy case: |a|, |b| < 2^53, no overflow nor underflow can occur */
  if (sa <= 53 && sb <= 53)
    return mpz_get_d (a) / mpz_get_d (b);

  /* same if a = m*2^e with m representable on 53 bits, idem for b, but beware
     that both a and b do not give an overflow */
  na = sa - mpz_scan1 (a, 0);
  nb = sb - mpz_scan1 (b, 0);
  if (sa <= 1024 && na <= 53 && sb <= 1024 && nb <= 53)
    return mpz_get_d (a) / mpz_get_d (b);

  /* hard case */
  mpz_init (aa);
  mpz_init (bb);
  if (sa >= sb)
    {
      mpz_set (aa, a);
      mpz_mul_2exp (bb, b, sa - sb);
    }
  else
    {
      mpz_mul_2exp (aa, a, sb - sa);
      mpz_set (bb, b);
    }
  /* q = aa/bb*2^(sa-sb) */
  if (mpz_cmpabs (aa, bb) >= 0)
    {
      mpz_mul_2exp (bb, bb, 1);
      sa ++;
    }
  mpz_mul_2exp (aa, aa, 54);
  sb += 54;
  mpz_tdiv_qr (aa, bb, aa, bb);
  /* the quotient aa should have exactly 54 bits */
  if (mpz_tstbit (aa, 0) == 0)
    {
      /* aa is even: it already fits on 53 significant bits, nothing to do */
    }
  else if (mpz_cmp_ui (bb, 0) != 0)
    {
      if (mpz_sgn (aa) > 0)
        mpz_add_ui (aa, aa, 1);
      else
        mpz_sub_ui (aa, aa, 1);
    }
  else /* mid case: round to even */
    {
      if (mpz_tstbit (aa, 1) == 0)
        {
          if (mpz_sgn (aa) > 0)
            mpz_sub_ui (aa, aa, 1);
          else
            mpz_add_ui (aa, aa, 1);
        }
      else
        {
          if (mpz_sgn (aa) > 0)
            mpz_add_ui (aa, aa, 1);
          else
            mpz_sub_ui (aa, aa, 1);
        }
    }
  mpz_clear (aa);
  mpz_clear (bb);
  d = mpz_get_d (aa); /* exact */
  return ldexp (d, (long) sa - (long) sb);
}
However I don't know where to incorporate that code in Sage. Could someone help?
Paul
Volker points to here in sage/rings/rational.pyx. I imagine that one could just replace that return mpq_get_d(self.value) with your code, or maybe make an auxiliary function, since I suspect yours is pure C or C++ and this is a Cython file.
That looks like pure C code. Isn't there an mpz/mpq package with the underlying C(++) code, and wouldn't this go in there...?
yes this is pure C code.
Paul
note that we have currently:
sage: RDF(2^(-1075))
0.0
sage: RDF(-2^(-1075))
0.0
The last result should be -0.0, as in -RDF(2^(-1075)).
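A quick plain-Python illustration of why the sign matters even when the magnitude underflows to zero (the last expected value assumes the usual round-to-nearest behaviour of ldexp):
import math

print(-0.0 == 0.0)                # True: a doctest comparing with == cannot see the sign
print(math.copysign(1.0, -0.0))   # -1.0: the sign of a signed zero is still observable
print(math.ldexp(-1.0, -1075))    # expected: -0.0 (the magnitude underflows, the sign survives)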
I guess my code above fixes that (not tested).
Paul
Is it really worth doing
/* same if a = m*2^e with m representable on 53 bits, idem for b, but beware
that both a and b do not give an overflow */
na = sa - mpz_scan1 (a, 0);
nb = sb - mpz_scan1 (b, 0);
if (sa <= 1024 && na <= 53 && sb <= 1024 && nb <= 53)
return mpz_get_d (a) / mpz_get_d (b);
I think that this case is pretty rare, so I wouldn't special-case it.
Jeroen,
I think that this case is pretty rare, so I wouldn't special-case it.
this was my first idea. Then I realized I was often using, say, RDF(1/2^100).
Yes we can remove that special case, but it would be interesting to see how often it is used in the whole Sage test suite.
Paul
There is also this obvious bug:
mpz_clear (aa);
[...]
mpz_get_d (aa);
There is also this obvious bug:...
sorry, please replace the last lines by:
d = mpz_get_d (aa); /* exact */
mpz_clear (aa);
mpz_clear (bb);
return ldexp (d, (long) sa - (long) sb);
I also realize that in the "hard case" there is a double-rounding issue for subnormals. This could be fixed if needed.
Paul
Author: Paul ZImmermann, Jeroen Demeyer
I'm working on a Sage patch based on Paul's code.
Changed author from Paul ZImmermann, Jeroen Demeyer to Paul Zimmermann, Jeroen Demeyer
I'm working on a Sage patch based on Paul's code.
great! Please ask if you need some help. My code should deal correctly with underflow or overflow (through the call to ldexp).
Paul
Jeroen,
as an author I'm not supposed to review that ticket, but I have a few questions:
- why the or A.condition() in matrix/matrix_double_dense.pyx?
- in matrix/matrix_mod2e_dense.pyx I wonder why you had to change only one value, and not all from lines 788 to 811
- in plot/colors.py the new value is closer to 2/7 than the old one (as expected):
sage: (RR(rainbow(7, 'rgbtuple')[5][0]).exact_rational()-2/7)*1.0
-1.26882631385732e-16
- in rings/contfrac.py:
sage: (RR(float(a)).exact_rational()+17/389)*1.0
-1.60539961787057e-18
- the test float(1/10) * 10 == float(1) is misleading: it might make one think that this is now true for any value of q instead of 10, which is wrong (consider q=49).
- efficiency is not that bad in the "easy" case (was 4.8 ms before):
sage: l=[a/b for a in [1..99] for b in [1..99]]
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 2.97 ms per loop
sage: l=[(2^53+a)/(3^53+b) for a in [1..99] for b in [1..99]]
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 7.38 ms per loop
- q0 = aa/bb / 2^shift should be q0 = a/b / 2^shift
- |d| <= 2^-1075 should be |d| < 2^-1075
- shift >= 972 can be changed to shift >= 971, since for 971 we get |d| >= 2^1024, and 2^1024 gives infinity
- maybe we can avoid the copy done by mpz_init_set and point directly to the corresponding numerator or denominator?
- I would add an example showing that the conversion now agrees with RR:
sage: all([RDF(a/b) == RR(a/b) for a in [1..99] for b in [1..99]])
True
I see you have dealt with the subnormal case too: great!
Paul
Shouldn't we push that upstream to MPIR/GMP? From a cursory Google search it seems that we are not the first ones to trip over this. Did anybody contact upstream for their opinion?
Volker, you are right. I will ask the GMP developers.
Paul
Replying to @zimmermann6:
Jeroen,
as an author I'm not supposed to review that ticket
I don't think that's true. I read your patch, understood it (so that's a review of your code) and made some modifications. So I think it is perfectly fine if you review the patch.
- why the or A.condition() in matrix/matrix_double_dense.pyx?
That's a cool doctest trick I learned recently. If the condition fails, it will output the exact value of A.condition() instead of simply returning False.
- in matrix/matrix_mod2e_dense.pyx I wonder why you had to change only one value, and not all from lines 788 to 811
Well, I assume the others are cases where the old and new mpq->double conversion code agree.
- the test float(1/10) * 10 == float(1) is misleading: it might make one think that this is now true for any value of q instead of 10, which is wrong (consider q=49).
OK, true. I will remove that test.
- maybe we can avoid the copy done by mpz_init_set and point directly to the corresponding numerator or denominator?
I have no idea how to do that in Cython. Besides, we do change the value of aa and bb, so we need an init anyway.
- I would add an example showing that the conversion now agrees with RR:
sage: all([RDF(a/b) == RR(a/b) for a in [1..99] for b in [1..99]])
True
OK, good idea. I'll also test negative numerators.
Jeroen,
as an author I'm not supposed to review that ticket
I don't think that's true. I read your patch, understood it (so that's a review of your code) and made some modifications. So I think it is perfectly fine if you review the patch.
I'm fine with giving comments, but I would prefer that someone else gives the "positive review".
- why the or A.condition() in matrix/matrix_double_dense.pyx?
That's a cool doctest trick I learned recently. If the condition fails, it will output the exact value of A.condition() instead of simply returning False.
good. I learned something too!
- in matrix/matrix_mod2e_dense.pyx I wonder why you had to change only one value, and not all from lines 788 to 811
Well, I assume the others are cases where the old and new mpq->double conversion code agree.
when I did those tests by hand in a vanilla session, I got different values, thus I guess they depend on the seed value and the previous tests, which is not very good.
- maybe we can avoid the copy done by mpz_init_set and point directly to the corresponding numerator or denominator?
I have no idea how to do that in Cython. Besides, we do change the value of aa and bb, so we need an init anyway.
you could have different variables to store the quotient and remainder, and split the code to perform the division directly on a or b.
Paul
Attachment: 14416_QQ_to_RDF.patch.gz
Replying to @zimmermann6:
you could have different variables to store the quotient and remainder, and split the code to perform the division directly on a or b.
Yes, that's what I did in the new patch.
with the new patch (I did not check with the old one) I get warnings:
warning: sage/rings/../ext/gmp.pxi:65:22: local variable 'p1' referenced before assignment
warning: sage/rings/../ext/gmp.pxi:66:15: local variable 'p2' referenced before assignment
warning: sage/rings/../ext/gmp.pxi:87:22: local variable 'p1' referenced before assignment
warning: sage/rings/../ext/gmp.pxi:173:14: local variable 'g' referenced before assignment
warning: sage/rings/../ext/gmp.pxi:173:27: local variable 's' referenced before assignment
warning: sage/rings/../ext/gmp.pxi:173:40: local variable 't' referenced before assignment
warning: sage/rings/../ext/gmp.pxi:173:54: local variable 'mn' referenced before assignment
warning: sage/rings/rational.pyx:1433:25: local variable 'prod' referenced before assignment
warning: sage/rings/rational.pyx:1444:29: local variable 'prod' referenced before assignment
warning: sage/rings/rational.pyx:1454:33: local variable 'prod' referenced before assignment
warning: sage/rings/rational.pyx:1465:33: local variable 'prod' referenced before assignment
warning: sage/rings/rational.pyx:1467:29: local variable 'prod' referenced before assignment
warning: sage/rings/rational.pyx:1768:20: local variable 'tmp' referenced before assignment
warning: sage/rings/rational.pyx:2427:20: local variable 'num' referenced before assignment
warning: sage/rings/rational.pyx:2428:20: local variable 'den' referenced before assignment
warning: sage/rings/rational.pyx:2763:22: local variable 'x' referenced before assignment
warning: sage/rings/rational.pyx:3576:14: local variable 'q' referenced before assignment
warning: sage/rings/rational.pyx:3577:14: local variable 'r' referenced before assignment
Paul
performance is not better in the general case:
sage: l=[(2^53+a)/(3^53+b) for a in [1..99] for b in [1..99]]
sage: %timeit c = map(RDF,l)
10 loops, best of 3: 8.66 ms per loop
sage: l=[(2^53+a)/(3^53+b) for a in [1..99] for b in [1..99]]
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 9.38 ms per loop
sage: l=[(2^53+a)/(3^53+b) for a in [1..99] for b in [1..99]]
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 10.8 ms per loop
Paul
Replying to @zimmermann6:
with the new patch (I did not check with the old one) I get warnings
Sure, we get these warnings all over the place. One could say this is a Cython bug.
I added a second version which uses uint64_t in the second part, which is hopefully slightly faster.
Description changed:
+Apply attachment: 14416_QQ_to_RDF_v2.patch
Replying to @zimmermann6:
- why the or A.condition() in matrix/matrix_double_dense.pyx?
To elaborate on this: X or Y returns X if bool(X) is true and returns Y if bool(X) is false. The opposite holds for X and Y.
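A one-liner showing the short-circuit behaviour with plain values (the numbers are arbitrary, just for illustration):
print(True or 1.234e17)    # True: when the condition holds, the doctest just shows True
print(False or 1.234e17)   # 1.234e+17: when it fails, the actual value is displayed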
the new version is slightly better in the general case:
sage: l=[(2^53+a)/(3^53+b) for a in [1..99] for b in [1..99]]
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 8.15 ms per loop
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 8.07 ms per loop
sage: %timeit c = map(RDF,l)
100 loops, best of 3: 7.97 ms per loop
Paul
I believe it will be difficult to do better, unless we avoid mpz and use mpn instead. Assuming all tests still work, I am fine with the new code.
Could anybody out there have a final look and give a positive review?
Paul
btw, a small typo in the patch: occured should be occurred...
Paul
I am definitely not able to review this ticket, but I would suggest adding the initial bug to the doctests to detect regressions (not only comparing RDF with RR, since they could change their behaviour simultaneously).
sage: RDF(1/10)*10 == RDF(1)
True
and even
sage: all([RDF(p/q) == RDF(p)/RDF(q) for p in [-100..100] for q in [1..100]])
True
Thierry, I asked Jeroen to remove from the doctests RDF(1/10)*10 == RDF(1), since this is wrong if you replace 10 by, say, 49, so this was not a bug, but a feature of rounding.
The second test you propose is good (the probability that RDF and RR change their behaviour simultaneously is very small, but indeed the current test doesn't exercise negative p).
Paul
Replying to @sagetrac-tmonteil:
sage: all([RDF(p/q) == RDF(p)/RDF(q) for p in [-100..100] for q in [1..100]])
True
I would say this test is also misleading, since RDF(p/q) and RDF(p)/RDF(q) don't have to be the same for large values of p and q.
I would say this test is also misleading since RDF(p/q) and RDF(p)/RDF(q) don't have to be the same...
yes, but for p, q integers in [-100,100] the conversion to RDF is exact.
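The same reasoning can be checked with plain Python floats (a sketch; it relies on small integers converting to double exactly and on integer true division being correctly rounded):
ok = all(float(p) / q == p / q          # both sides: a single rounding of the exact quotient
         for p in range(-100, 101) for q in range(1, 101))
print(ok)   # expected: True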
Paul
Attachment: trac_14416_rounding_doctest-tm.patch.gz
Tested on sage 5.9.beta5, depends on #14448
Perhaps the following test has better semantics, and still represents the initial bug.
for p in [-100..100]:
    for q in [1..100]:
        r = p/q
        s, m, e = RDF(r).sign_mantissa_exponent()
        if not abs(s*m*2^(e) - r) <= 2^(e-1):
            print 'Bug #14416 reappeared with rational', r
Unfortunately, this let me find a bug in .sign_mantissa_exponent() (which currently gives a negative mantissa to negative numbers), see #14448.
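For comparison, here is the same half-ulp check in plain Python (a sketch using math.frexp instead of sign_mantissa_exponent, so it does not depend on #14448; it exercises Python's own correctly rounded division rather than RDF, and the helper name is just for illustration):
import math
from fractions import Fraction

def is_correctly_rounded(x, r):
    """True if the double x is within half an ulp of the exact rational r."""
    m, e = math.frexp(x)                # x == m * 2**e with 0.5 <= |m| < 1 (or x == 0)
    half_ulp = Fraction(2) ** (e - 54)  # half the spacing of doubles around x
    return abs(Fraction(x) - r) <= half_ulp

for p in range(-100, 101):
    for q in range(1, 101):
        r = Fraction(p, q)
        if not is_correctly_rounded(p / q, r):
            print("Bug #14416 reappeared with rational", r)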
Dependencies: #14448
OK, I added your doctest in a simplified way such that it doesn't depend on #14448.
Changed dependencies from #14448 to none
Jeroen, what is the status of this ticket for the patchbot? Last time I looked, some tests were failing.
Paul
the following is weird:
One would expect that the conversion from 1/10 to RDF is done as follows:
For RR we get as a comparison:
More examples:
and for RR:
Apply attachment: 14416_QQ_to_RDF_v2.patch
Depends on #14335
Depends on #14336
CC: @robertwb
Component: basic arithmetic
Author: Paul Zimmermann, Jeroen Demeyer
Reviewer: Paul Zimmermann
Merged: sage-5.11.beta0
Issue created by migration from https://trac.sagemath.org/ticket/14416