usethesource / rascal

The implementation of the Rascal meta-programming language (including interpreter, type checker, parser generator, compiler and JVM based run-time system)
http://www.rascal-mpl.org

Improved handling of numeric precision #48

Closed PaulKlint closed 8 years ago

PaulKlint commented 11 years ago

I have made changes to PDB and Rascal to improve handling of precision:

  1. The precision can now be controlled with "setPrecision" (it returns the previous precision).
  2. The precision of individual numbers can be controlled with "precision" (it returns a number with given precision).
  3. The precision of all numeric computations is now limited to the precision that is set.
  4. I have set the default precision at 10 (was 50). Maybe it should be set even lower.

All-in-all this will speed-up computations with moderate precision.
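For comparison, Python's `decimal` module exposes the same two knobs. This is only an analogy (the names above are the actual Rascal/PDB API; `decimal` is the real Python stdlib module):

```python
from decimal import Decimal, getcontext, Context

# getcontext().prec plays the role of setPrecision: it caps the
# precision of all subsequent Decimal computations.
old = getcontext().prec          # read the previous precision
getcontext().prec = 10           # cap all computations at 10 significant digits

r = Decimal(2).sqrt()
print(r)                         # 1.414213562  (10 significant digits)

# Per-number precision, like the proposed "precision" function, can be
# emulated by rounding a value to a given number of significant digits:
print(Context(prec=5).plus(r))   # 1.4142
```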

DavyLandman commented 11 years ago

Nice, but this is not really an issue, right?

You could of course fake it by opening an issue about precision and then adding `fix #48` to the commit message (perhaps the merge commit), so that this information ends up in the history.

Or perhaps email this.

jurgenvinju commented 11 years ago

These are good improvements!

What I think would be very nice is if precision were not part of the state of the interpreter, but rather part of the state of every number, so that you could see the precision from the printed number. The WYSIWYG principle again.

PaulKlint commented 11 years ago

@jurgenvinju:

jurgenvinju commented 11 years ago

Why is it bad to have literal precision? It's simple, easy and correct. And values will be printed the way they are typed, which is a Rascal design principle.

If you need parametrized or contextual precision you can always use a function instead of a literal, or a maximally precise literal such as an int or rat.

For reuse the type system should allow for precision parameters...

On Wednesday, December 5, 2012, Paul Klint wrote:

@jurgenvinju:

  • Precision is already part of every number.
  • Only the maximal allowed precision is part of the value factory.
  • Integers and rationals are already maximally precise.
  • The precision of reals is already printed as you suggest.
  • I think it is a bad idea to handle precision by way of the digits of constants. A much better way would be (as Mathematica does) to have a locally scoped construct that locally redefines the precision; in Mathematica, N[Sin[x], 10] computes Sin[x] with precision 10. This is a nice idea since it can be applied to all numeric computations (the alternative is to add an optional precision parameter to every numeric function). If we had call-by-name parameters, we could define such an N function ourselves (or add a separate language construct for it). In Python, the with construct is used to locally redefine the numeric context.
  • I think the only thing we can do statically is check whether an expression with higher precision is assigned to a variable with lower precision. Other losses of precision are inherently dynamic. Including precision in the type system is also bad for reusability: it should be possible to reuse the same function in contexts that require different precision.
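The Python `with`-based numeric context mentioned in the last bullet looks like this in practice (a small sketch using the real `decimal.localcontext` API; the Mathematica analogue would be N[expr, 10]):

```python
from decimal import Decimal, getcontext, localcontext

getcontext().prec = 28               # ambient precision

with localcontext() as ctx:          # locally redefine the numeric context...
    ctx.prec = 10                    # ...with precision 10, like N[expr, 10]
    inside = Decimal(1) / Decimal(7)

outside = Decimal(1) / Decimal(7)    # ambient precision is restored here

print(inside)    # 0.1428571429  (10 digits)
print(outside)   # 0.1428571428571428571428571429  (28 digits)
```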


jurgenvinju commented 11 years ago

I think we do not want maximal precision. We want minimal precision: for efficiency and for helping the programmer manage significant decimals.


PaulKlint commented 11 years ago

@jurgenvinju If we use the precision of a literal as guidance then we cannot use that literal as guidance in different precision contexts: we have to write a different literal for each precision in the same code.

I agree that we want minimal precision, but this is very hard to achieve in general. For instance: estimating the minimal precision of a function result based on the precision of its argument requires using the derivative of that function to estimate the precision of its result.

What I have currently implemented: the result of an arithmetic operation uses the maximal precision of the operands, capped by the maximal allowed precision. So the precision will be smaller than the upper bound when possible. What is not yet so nice is that multiplication always increases the precision by 1 (which means that multiplications in loops rapidly reach the upper bound), and I am not yet sure what happens if we limit the result to the precision of the operand with the largest precision (again bounded by a max).

jurgenvinju commented 11 years ago

What happens if you cap the precision of arithmetic to minimal precision of the operands? (i.e. minimal digits after the comma for + and -, and minimal number of total digits for * and /)

This should stop increasing the precision by 1 when multiplying/dividing, and it should keep precision small as well. By doing this, I think you do not need to estimate the precision based on a derivative; you can just compute the precision of the result incrementally while doing the computation. (The issue with this is that rounding errors will propagate.)


DavyLandman commented 11 years ago

If I'm not mistaken, you can supply a MathContext to most operations on BigDecimal, and a MathContext contains precision information.
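Indeed; a minimal sketch of how `BigDecimal` takes a `MathContext` (a precision plus a rounding mode) per operation:

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

public class MathContextDemo {
    public static void main(String[] args) {
        // A MathContext bundles a precision (number of significant digits)
        // and a rounding mode; most BigDecimal operations accept one.
        MathContext mc = new MathContext(10, RoundingMode.HALF_EVEN);

        // Without a MathContext this division would throw ArithmeticException,
        // because 1/3 has no exact decimal representation.
        BigDecimal third = BigDecimal.ONE.divide(new BigDecimal(3), mc);
        System.out.println(third);   // 0.3333333333  (10 significant digits)

        BigDecimal product = new BigDecimal("1.5").multiply(new BigDecimal("2.5"), mc);
        System.out.println(product); // 3.75
    }
}
```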


PaulKlint commented 11 years ago

@DavyLandman yes, I use MathContext to specify precision and rounding for all operations.

@jurgenvinju a simple answer: _numerical disaster strikes_ if you take minimal precision of operands. Here is an example using Rump's polynomial (which is famous for its numerical instability):

public num rump(num x, num y){
  x2 = x * x;
  y2 = y * y;
  y4 = y2 * y2;
  y6 = y4 * y2;
  y8 = y4 * y4;
  return (1335 * y6) / 4 + x2 * (11 * x2 * y2 - y6 - 121 * y4 - 2) + (11 * y8) / 2 + x / (2 * y);
}

The correct answer for rump(77617, 33096) is -0.827396059946821368141165095479816291999

If I replace the precision of operators by the minimal precision of their operands, you get -17111339274712207494296632228773888.82740, which is nowhere near the right answer.
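For reference, the same polynomial evaluated with exact rational arithmetic confirms the correct answer. A sketch in Python (`fractions.Fraction` is the real stdlib type; the function mirrors the Rascal code above):

```python
from fractions import Fraction

def rump(x, y):
    # Exact rational evaluation of Rump's polynomial: no rounding anywhere.
    x, y = Fraction(x), Fraction(y)
    x2 = x * x
    y2 = y * y
    y4 = y2 * y2
    y6 = y4 * y2
    y8 = y4 * y4
    return (1335 * y6) / 4 + x2 * (11 * x2 * y2 - y6 - 121 * y4 - 2) \
           + (11 * y8) / 2 + x / (2 * y)

exact = rump(77617, 33096)
print(exact)         # -54767/66192
print(float(exact))  # ≈ -0.827396059946821...
```

The polynomial part evaluates to exactly -2 and the final term x/(2y) to 77617/66192, which is why any rounding in the huge intermediate terms destroys the result.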

anyahelene commented 11 years ago


As a casual user, I think I'd expect 1. / 8. to be .125 (three digits), and 123. * 456. to be 56088. (5 digits), not 56090. I may expect some degree of rounding on division (but hopefully not worse than when using floats).

But, dealing with measurements with limited precision, it would make sense to choose the precision of the least precise operand, to avoid introducing false precision.

I do like Jurgen's idea (I suggested something similar a while back, with a pN suffix for N digits of precision), but I object to this point:

If you want to multiply by 2.5 with the precision of the other argument, you type 5r2 * x

since you're then taking an exact number, and implicitly promoting it to an inexact number.

It may make more sense to set a minimum precision of, say, 6 digits – or combine rationals and reals to limited precision rationals with exponents.

-anya

PaulKlint commented 11 years ago

@anyahelene at your service ;-) with the current setup (and default maximal precision 10) you get exactly what you ask for (minimal precision depending on the arguments):

rascal>1./8
real: 0.125

rascal>123. * 456.
real: 56088.

anyahelene commented 11 years ago


Yes, I tried. ;)

What was wrong with the current setup? Too slow?

-anya


jurgenvinju commented 11 years ago

Thanks for your thoughts, Anya.


jurgenvinju commented 11 years ago

@PaulKlint wow, that's one hell of an example. It implies precision has to be managed by the algorithm rather than by the state of the input/output numbers (as you said earlier). If we could scope that lexically and explicitly using language support, that would make things clear indeed.


msteindorfer commented 11 years ago

@PaulKlint, might it be that this change broke the PDB tests? The first assertion of the following randomly executed test case fails:

    /**
     * Associativity: addition and multiplication
     *
     * (Possibly not strictly true for reals.)
     */
    public void axiomAssociativity(INumber a, INumber b, INumber c) {
        assertEqual(a.add(b.add(c)), a.add(b).add(c));
        assertEqual(a.multiply(b.multiply(c)), a.multiply(b).multiply(c));
    }

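The caveat in the Javadoc is real: addition of finite-precision reals is not associative. A tiny illustration with IEEE doubles in Python (not the PDB reals, but the same phenomenon):

```python
# Associativity fails for floating-point addition:
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c
right = a + (b + c)

print(left)            # 0.6000000000000001
print(right)           # 0.6
print(left == right)   # False
```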
anyahelene commented 11 years ago


Well, as mentioned in the comment, you can't really expect associativity from reals. Had I been less lazy, the axiom might have checked whether the arguments were reals before doing the comparison.

Perhaps INumber should have an isExact() method?

Real add/multiply should be associative if the numbers are of more or less the same magnitude (I believe), so restricting the test to that may be a solution (or just use a few concrete test cases, instead of the axiom).

-anya


anyahelene commented 11 years ago


Further explorations reveal that basically all the IReal tests are broken, even the simple ones like:

        assertEqual(REAL_ZERO, r.add(r.negate()));

(Expected 0.0, got -145224193.)

(We should really split these tests so that one failure doesn't mask others.)

-anya