MikeMcl / big.js

A small, fast JavaScript library for arbitrary-precision decimal arithmetic.
http://mikemcl.github.io/big.js
MIT License
4.87k stars 411 forks source link

Epsilon question #118

Closed jfstephe closed 5 years ago

jfstephe commented 5 years ago

Hi, For starters - awesome library(s) - many many thanks!!

I've been through your past issues for this and related libraries and (with the exception of MathJs, which uses DecimalJs) I can't see anything that implements or provides guidance regarding comparing values that don't quite add up.

var result = Big(1/6).add(Big(1/6).add(Big(1/6))).toString() console.log(result); // Gives: "0.49999999999999998"

Which is cool and I get why..

But If I wanted to compare 'result' to 0.5 it's not going to work. I could use an epsilon value from various posts on StackOverflow etc but I was sort of expecting a default parameter to the comparison operators to provide a value, or something along those lines.

If the reason isn't as cut and dry as this, may I suggest perhaps updating the documentation to add possible workaround/approaches/explaining why, otherwise people could sleepwalk into issues.

Thanks again, John

nycdotnet commented 5 years ago

It looks like you're still doing floating point division in JS. The problem is that 1/6 is not representable exactly in decimal. Big does work accurately with the numbers represented in decimal.

var result1 = Big(1/6).add(Big(1/6).add(Big(1/6)));
var result2 = Big('1').div(Big('6')).add(Big('1').div(Big('6'))).add(Big('1').div(Big('6')));
var result3 = Big('1').div(Big('6'));
var result4 = result3.times('3');

console.log(result1.toString()); // 0.49999999999999998
console.log(result2.toString()); // 0.50000000000000000001
console.log(result3.toString()); // 0.16666666666666666667
console.log(result4.toString()); // 0.50000000000000000001

I agree with you regarding the documentation in general, but I think the library is working as expected.

jfstephe commented 5 years ago

@nycdotnet - Hi, you are right to point out the floating point division. I'm not actually doing that in my code though. That was just an example. As you also illustrate, you can end up with calculations that 'should' be equal to 5 but aren't quite.

Regarding your comment, 'the library is working as expected'. This is probably true from the point of view of someone who's written the library and understands it's limitations etc. It's definitely not 100% true (as I'm in this group :-) ), as I expect

(Big('1').div(Big('6')).times(3)).eq(Big(0.5))

to return true :-). If I have to provide an epsilon value to assist with the evaluation I think that's fine, I was however, reading into getting a general value of epsilon last night and it doesn't seem like there's a good catch-all for this.

I can't possibly be the first person to run into this issue, so I'm wondering what people do to get round it when using this library? I'm assuming case specific 'fudges' :'-(, but happy to be wrong! :-)

nycdotnet commented 5 years ago

That's fair enough - I am just a user of this and I'll let @MikeMcl comment further.

MikeMcl commented 5 years ago

There is no single epsilon value that can be used with this library because a Big number can have any number of digits, and because the number of decimal places of the result of a division operation is configurable, for example, Big.DP = 40.

Generally, here, the round method is used to obviate accumulated imprecision. The number of decimal places to round to depends on the value of Big.DP.

Big.DP = 20;          // default
result = Big(1).div(6).times(3);
result.toString();    // "0.50000000000000000001"
result = result.round(Big.DP - 1);
result.toString();    // "0.5"

Rounding to DP - 1 decimal places will not always be appropiate though, it depends on the number and type of operations performed, the values passed to them, and the rounding mode.

Any imprecision will originate from an operation involving division as all other operations are exact. So, to determine how many decimal places you need to round to, you have to consider the maximum error that can result from a division and how that error is multiplied by subsequent operations.

In your example, the error introduced by the division is

0.16666666666666666667
-
0.16666666666666666666666...
=
0.00000000000000000000333...

When the result of the division is multiplied, the error will be multiplied also, but it will still be very small and the result of 0.50000000000000000001 is only out by 1 unit in the last (the 20th) decimal place.

Consider though,

Big(1).div(119).times(119).toString();    // "1.00000000000000000047"

The accumulated imprecision is greater. Here, it would not be okay to round to DP - 1, we would need to round to DP - 2 or less decimal places to get the expected result.

Big(1).div(119).times(119).round(Big.DP - 1).toString();    // "1.0000000000000000005"
Big(1).div(119).times(119).round(Big.DP - 2).toString();    // "1"

If you know the expected result is going to have, or you only care about it having, a limited number of decimal places, then just round to that number, or one or two decimal places more than that, for example

// An integer is expected, so round to zero decimal places
Big(1).div(119).times(119).round().toString();    // "1"

// A result with only a few decimal places is expected, so round to 3 decimal places
Big(1).div(6).times(3).round(3).toString();    // "0.5"
jfstephe commented 5 years ago

Thanks for responding so quickly guys. This helps a lot. I think the lack of any guidance was holding me back.

Side thought: Could the error in the division operation be tracked and passed through subsequent operations, being increased or decreased as it went? Then the dynamic epsilon value could be used in eq() etc, possibly on an optional parameter e.g.

function eq(otherNumber: Big, dynamicEpsilon: boolean = false) { ... }

FWIW, I think @MikeMcl's last post would be great to put in the README of all 3 of the maths libraries.

Happy for you to close this, or keep it open for reference for README updates etc.

MikeMcl commented 5 years ago

I suppose the epsilon, i.e. the maximum error, of a single division would be

eps = Big(10).pow(-Big.DP);

eps.toString();    // "1e-20"
eps.toFixed();     // "0.00000000000000000001"

If you can show an example of how that could be useful in a comparison operation, that would be great.

jfstephe commented 5 years ago

As above, I expected this to be true:

(Big('1').div(Big('6')).times(3)).eq(Big(0.5))

I understand why it fails, I am just humbly suggesting that either: a) the documentation is improved to explain why and how to possibly work around it b) the library changes and deals with this either implicitly or explicitly

Explicitly could be:

(Big('1').div(Big('6')).times(3)).eq(Big(0.5), Big.EPSILON)

This would follow along the lines of the ES6 Number.EPSILON, and being a documented API feature would help developers using the library to 'fall into the pit of success'.

Being part of the API would mean I don't need to do the equivalent of the below (from the mozilla.org site), and I'm more likely to notice it and avoid potential issues before they arise:

var result = Math.abs(0.2 - 0.3 + 0.1);
console.log(result);
// expected output: 2.7755575615628914e-17
console.log(result < Number.EPSILON);
// expected output: true
MikeMcl commented 5 years ago

Could the error in the division operation be tracked and passed through subsequent operations, being increased or decreased as it went?

I am not sure, but currently I am not able to explore how this could work or why it would be useful. If you are, then by all means post your thoughts and/or code here. The eps value above might be useful as a starting point. It is a matter of what I think is usually called the propagation of rounding errors.

jfstephe commented 5 years ago

I was thinking of something like an internal private flag (e.g. _divisionPerformed: boolean = false;) within the Big class.

When doing a divide operation this flag would be set to 'true' in the resultant Big. For example:

let a: Big = Big(1); let b: Big = Big(6); let c: Big = a.div(b); // c._divisionPerformed === true Then any subsequent operation would check that any constituent value had this set and pass the value through. e.g let d: Big = c.times(Big(3))); // d._divisionPerformed === true because it was set in 'c' Then in the comparison operators this could be checked. If it's true then the Epsilon value comes into play. // d.eq(Big(0.5)) === true! Happy days :-)

Hopefully that makes sense! This may be overkill and possibly not desirable all the time but that procedure was what was in my head :-).

MikeMcl commented 5 years ago

I had a play around with this idea today.

{
  const P = Big.prototype;

  const div = P.div;
  const eq = P.eq;
  const gt = P.gt;
  const gte = P.gte;
  const lt = P.lt;
  const lte = P.lte;
  const minus = P.minus;
  const plus = P.plus;
  const times = P.times;

  const BigEps = (n) => {
    const r = new Big(n);
    if (n && n.eps) r.eps = n.eps;
    return r;
  };

  const eps = new Big(1);
  let dp;

  P.div = function (n) {
    let x = this;
    let y = BigEps(n);
    const r = div.call(x, y);
    if (dp !== Big.DP) {
      dp = Big.DP;
      eps.e = -dp;
    }
    if (x.eps || y.eps) {
      if (x.eps) x = plus.call(x.abs(), x.eps);
      if (y.eps) y = minus.call(y.abs(), y.eps);
      // r.eps = (((abs(x) + x.eps)/(abs(y) - y.eps)) - abs(x/y)) + eps
      r.eps = plus.call(minus.call(div.call(x, y), r.abs()), eps);
    } else {
      r.eps = eps;
    }
    //console.log(`div: ${r} \xb1${r.eps}`);
    return r;
  };

  P.minus = function (n) {
    const r = minus.call(this, n);
    if (this.eps) r.eps = this.eps;
    if (n && n.eps) r.eps = plus.call(r.eps, n.eps);
    //console.log(`minus: ${r} \xb1${r.eps}`);
    return r;
  };

  P.plus = function (n) {
    const r = plus.call(this, n);
    if (this.eps) r.eps = this.eps;
    if (n && n.eps) r.eps = plus.call(r.eps, n.eps);
    //console.log(`plus: ${r} \xb1${r.eps}`);
    return r;
  };

  P.times = function (n) {
    let x = this;
    let y = BigEps(n);
    const r = times.call(x, y);
    if (x.eps) {
      if (y.eps) {
        x = plus.call(x.abs(), x.eps);
        y = plus.call(y.abs(), y.eps);
        r.eps = minus.call(times.call(x, y), r.abs());
      } else {
        r.eps = times.call(x.eps, y.abs());
      }
    } else if (y.eps) {
      r.eps = times.call(x.abs(), y.eps);
    }
    //console.log(`times: ${r} \xb1${r.eps}`);
    return r;
  };

  P.eq = function (n) {
    const x = this;
    const y = BigEps(n);
    let r = eq.call(x, y);
    if (!r && (x.eps || y.eps)) {
      let hi = y;
      let lo = y;
      if (x.eps) {
        hi = plus.call(hi, x.eps);
        lo = minus.call(lo, x.eps);
      }
      if (y.eps) {
        hi = plus.call(hi, y.eps);
        lo = minus.call(lo, y.eps);
      }
      //console.log('eq:');
      //console.log(`  x: ${x}`);
      //console.log(`  y: ${y}`);
      //console.log(`  hi: ${hi}`);
      //console.log(`  lo: ${lo}`);
      // x <= (y + x.eps + y.eps) && x >= (y - x.eps - y.eps)
      r = lte.call(x, hi) && gte.call(x, lo);
    }
    return r;
  };
}

let r;

r = Big(1).div(6).plus(Big(1).div(6)).plus(Big(1).div(6));
console.log(`r:         ${r.toFixed()}`);
console.log(`r.eps:     ${r.eps.toFixed()}`);
console.log(`r.eq(0.5): ${r.eq(0.5)}`);
console.log();

r = Big(1).div(119).times(119);
console.log(`r:       ${r.toFixed()}`);
console.log(`r.eps:   ${r.eps.toFixed()}`);
console.log(`r.eq(1): ${r.eq(1)}`);
console.log();
jfstephe commented 5 years ago

Yeah, that's what I was thinking about. What are your thoughts on it? Good/Bad/ok but not going to ever get in the library for these reasons....? :-)

When I think about this I keep thinking about some warning to devs that this is going on as the epsilon can grow in successive operations, but I suppose that's just the reality of the situation. I keep thinking about the ability to 'opt' out of this functionality (or opt in) but I don't know why you would do that, as you are only kidding yourself as to the accuracy of the calculations.

In terms of 'falling into the pit of success', IMHO this is 'more' correct (as 0.5 now equals 0.5 etc), but I'm keen to hear your thoughts.

MikeMcl commented 5 years ago

I've corrected some issues in the code above, though it still hasn't been properly tested.

I think it's very inefficient to track the maximum error in this way, so it is not something I want to add to the library itself.

To opt out, all you would have to do is, for example, result.eps = null, before you test whether result is equal to some value. (In the same way, you can manually set an eps on some value with, for example, result.eps = new Big(0.000001).)

I would quite like to put the above code in the Wiki, once it has been properly tested, so I would be grateful if you could help test it out a bit.


Again, it is so much simpler and more efficient to just use round. You can be as accurate as you want by simply increasing Big.DP and the argument passed to round. For example,

Big.DP = 50;

let r = Big(1).div(6).plus(Big(1).div(6)).plus(Big(1).div(6));
console.log( r.toFixed() );  // '0.50000000000000000000000000000000000000000000000001'
console.log( r.round(20).eq(0.5) );    // true

Here, even if the accumulated maximum error was equivalent to multiplying the inital error by 10000000000000000000000000000, then it would still not affect the result of the eq call. (In reality, the accumulated maximum error here is only 3 times the initial maximum error.)


If you don't like having to mess around with precision and rounding, and want exact results from division, then you may be looking for a big rational or fraction library.

jfstephe commented 5 years ago

Hi, Happy to try and help out, but I'm a bit up against it at work ATM. Should be calmer in a few weeks.

I'm reasonably happy for the rounding approach but I suppose I'd like to see some built in way offered via the api. It feels like I need to do too much thinking about the levels of rounding and DPs, understandably completely needed, but seems like too much work for something so (on the face of it) trivial.

For completeness, what are the efficiency concerns you have with tracking the epsilon value? If by default the functionality was as currently (i.e. no eps), then the only extra operations are a few 'if statements' AFAICT. Perhaps 'eps' could be null unless the user wanted to use it. Setting it to a recommended value would 'fix' the comparison issue but come at a performance cost. For me, in my current use case, accuracy is more important but in another context I could choose not to use eps and fall back on the round method is required.

MikeMcl commented 5 years ago

Hi there,

This is a minimalist library, so I am very reluctant to add any more code at all.

I think it's inefficient because it requires so many extra operations. For example, one div operation would need about six other arithmetic operations just to calculate the maximum error. In contrast, round is a quick single operation.

Also, it changes the meaning of eq from 'is equal to` to 'might be equal too', which I'm not keen on.

I made a start on some tests.

Big eps.zip

> node --experimental-modules test.mjs
jfstephe commented 5 years ago

Thanks. I'll have a look at the tests.

Regarding "it changes the meaning of eq from 'is equal to` to 'might be equal too' ",... I'm not sure it can be said that 'eq' is 'is equal to' as per the example above. It's more like 'is equal to with caveats'.

I think it comes down to a precision vs accuracy call. E.g. I would suggest that it's ok to not be precise but you must always be accurate.

This is precise but not accurate:

Big(1).div(6).plus(Big(1).div(6)).plus(Big(1).div(6)).eq(0.5) === false

This is accurate with the provided precision:

var ESP = ...; Big(1).div(6).plus(Big(1).div(6)).plus(Big(1).div(6)).eq(0.5, EPS) === true

MikeMcl commented 5 years ago

The assumption seems to be that Big(1).div(6) should equal a sixth exactly, but that assumption is false when using this arbitrary-precision library. The result of the eq method is always accurate.

As shown below, it is easy to add an epsilon parameter to eq if desired:

{
  const eq = Big.prototype.eq;

  Big.prototype.eq = function (n, eps) {
    const x = this;
    const y = new Big(n);
    let r = eq.call(x, y);
    if (!r && eps) {
      const k = new Big(eps);
      r = x.lte(y.plus(k)) && x.gte(y.minus(k));
    }
    return r;
  };
}

Big_eq_eps.zip