thomasokken / free42

Free42 : An HP-42S Calculator Simulator
https://thomasokken.com/free42/
GNU General Public License v2.0
280 stars 54 forks source link

complex LN numerical issue #62

Closed achan001 closed 5 months ago

achan001 commented 7 months ago

complex LN has numerical issue (both binary/decimal version)

Test : x = (pi + pi*I) / 1e10, evaluate ln(y = 1+x) Note: log1p(x) ≈ ln(y) - (y-1-x)/y

Binary version reference:

p2> from gmpy2 import *
p2> x = const_pi() / 1e10
p2> x = mpc(x,x)
p2> y = 1+x
p2> log(y)
mpc('3.1415936518897064e-10+3.141592652602832e-10j')

lua> x = I(pi,pi) / 1e10 lua> I.log(1+x) -- error = -1 ULP (3.141593651889707e-10+3.141592652602832e-10*I)

Plus42Binary> LN(1+x) -- error = +954529 ULP 3.1415936513962265e-10+3.141592652602832e-10i

Same issue with Plus42Decimal

Y = 1+X = 1.000000000314159265358979323846264 + 3.141592653589793238462643383279503e-10i

LN(Y), Wolfram Alpha = -- reference 3.141592653589793238255931488895411E-10 + 3.141592652602832798560416032598054E-10i

LN(Y), Plus42Decimal = -- error = -3,452,423,245 ULP 3.141592653589793238255934941318656e-10 + 3.141592652602832798560416032598054e-10i

achan001 commented 7 months ago

I discovered complex LN numerical issue when testing latest function C.LN1+X

It is funny I had exactly the same experience when adding complex.log1p I've had to re-implement complex.log to fix numerical issue.

see https://www.hpmuseum.org/forum/thread-19784-post-171238.html#pid171238

complex.log lua code was later trranslated in C.

static Complex Clog(Complex z) {
    double x = Real(z), y = Imag(z);
    double h = x*x + y*y;
    if (0.5 < h && h < 3.0) {
        if (fabs(x) < fabs(y)) {h=x; x=y; y=h;}
        h = log1p((x+1)*(x-1) + y*y)/2;
    } else
        h = log(cabs(z));
    return CMPLX(h, carg(z));
}
achan001 commented 7 months ago

The good news, C.E^X-1 function looks good.

X = 3.141592653589793238462643383279503e-10 + 3.141592653589793238462643383279503e-10i

L = LN(Y) - (Y-1-X)/Y, using WA accurate LN(Y) 3.141592653589793238255934872174913e-10 + 3.141592652602832798560416032598053e-10i

C.E^X-1(L) = -- round-trip back to X :-) 3.141592653589793238462643383279503e-10 + 3.141592653589793238462643383279503e-10i

thomasokken commented 7 months ago

I implemented your fix for complex LN, and applied the same code for complex LOG as well (using LOG = LN / LN(10)): dd501ebc4e0cf5dd3b759d82b47a686f5faff8c3

Doing the same for C.LN1+X is going to be a bit more complicated; I haven't looked into that yet.

achan001 commented 7 months ago

Doing the same for C.LN1+X is going to be a bit more complicated

Good LN is all it take. No changes needed for C.LN1+X

y=1+x --> log1p(x) ≈ ln(y) - (y-1-x)/y

thomasokken commented 7 months ago

Ah, OK. So that means I shouldn't need any more code changes for this issue, then? The revised code returns results that match your test cases for (pi+1, pi), binary and decimal. Let me know if you would like to take a look at a test build (and if so, for which OS). I'll try to include the ATANH fix as well; I'm starting on that right now.

achan001 commented 7 months ago
ln(√(x^2+y^2)) = log1p((x+1)*(x-1) + y*y)/2;

I've found a more accurate way, (x+1) (x-1) = 2ε + ε^2, where ε = |x|- 1

Clog version 1 Clog version 2

thomasokken commented 5 months ago

I'm getting back to this issue after having spent a few weeks on other Free42 & Plus42 work. I see you've posted several times at MoHPC. Which version would you recommend I use for LN and LOG? Is "Clog version 2" from your last post here still the best?

achan001 commented 5 months ago

ln(|z|(1+ε)) ≈ ln(|z|) + ε relerr(|z|) = ε      ⇒ relerr(ln(|z|) ≈ ε / ln(|z|)

If |z| is not close to 1, ln(|z|) can't improve much using log1p In that sense, both version 1 and 2 are equally good.

I use version 2 for its elegance without branching for edge cases.

thomasokken commented 5 months ago

The version 2 code appears to use logb() and ldexp() as each other's inverses, but logb() uses base FLT_RADIX and ldexp() uses base 2. Is that correct? It looks like it should use scalbn() instead of ldexp() if it's going to work with decimal floating point, or else it should use log2() instead of logb() if the scaling calculation is supposed to use base 2 even with decimal floating point.

Also, is the shift_ln2() function compatible with decimal floating point?

achan001 commented 5 months ago

You are correct. logb should match with scalbn.

complex.log version 2 code assumed FLT_RADIX = 2. Actually, math library I used also assumed base 2.

https://github.com/JuliaMath/openlibm/blob/master/src/s_logb.c https://github.com/JuliaMath/openlibm/blob/master/src/s_scalbn.c

To make version 2 work with decimal is more involved. We also need accurate shift_ln10(x,b) = x + b*ln(10) Or, do everything in base 2, then convert back to decimal. It may be too much hassle for not much gain in accuracy.

I would just replace (x+1)*(x-1) ≡ (2ε + ε²), where ε = |x| - 1

thomasokken commented 5 months ago

Wouldn't doing everything in base 2 and then converting back to decimal mean getting results with only 16 out of 34 digits being accurate?

achan001 commented 5 months ago

No.

if |z| not close to 1, dec→bin conversion error does not matter. if |z| close to 1, ε = dec→bin ( |x|-1 ) should have almost all sig. digits. https://en.wikipedia.org/wiki/Sterbenz_lemma, ( |x|-1 ) is exact

Intel Decimal128 internally does dec → bin → dec conversion anyway. Example, bid128_log1p.c, log1p after handled edge cases:

   { BIDECIMAL_CALL1(bid128_to_binary128,xd,x);
     __bid_f128_log1p(yd, xd);
     BIDECIMAL_CALL1(binary128_to_bid128,res,yd);
     BID_RETURN(res);
   }
thomasokken commented 5 months ago

Intel Decimal128 internally does dec → bin → dec conversion anyway.

Yes, but Free42 Decimal uses 128-bit decimal numbers, so if you do calculations using 64-bit binary double, you end up taking a 16-digit result and presenting it as if it had 34 actual digits of precision.

achan001 commented 5 months ago

I now understand why you say binary only give 16 out of 34 digits.

When I say do everything in base 2, then convert back to decimal, I meant decimal128 to binary128 (113 bits ≈ 34 decimal digits)

This is messy (2 versions of complex log), without much gain in accuracy. That's why I suggested not use version 2.

I would just replace (x+1)*(x-1) ≡ (2ε + ε²), where ε = |x| - 1

achan001 commented 5 months ago

Looks good.

Same reason mappable_ln_c() used 0x1p61 for binary subnormal case, I would use 1E38 to scale sub-normal decimal case, not 1E37

ln(1E37) ≈ 85.19564844077969030866568382332148 + -0.432 ULP ln(1E38) ≈ 87.49823353377373599268367527800584 + -0.011 ULP

thomasokken commented 5 months ago

Great, thanks!

I also did the changes for ATANH and C.LN1+X. I combined everything in one commit: https://github.com/thomasokken/free42/commit/96bcd842188cd410b95c58adbb0847e11684ec62

I uploaded test builds incorporating everything to https://thomasokken.com/free42/download/test/

Note that these do not include your suggestion about using 1E38 for the decimal subnormal case; I'll add that before building the official release.

thomasokken commented 5 months ago

Fixed.

I uploaded new test builds, they're in the same location as before, https://thomasokken.com/free42/download/test/

thomasokken commented 5 months ago

I released Free42 3.1.6 and Plus42 1.1.8 today, containing these changes.

achan001 commented 5 months ago

I would use 1E38 to scale sub-normal decimal case, not 1E37

Turns out, the constant ULP error really doesn't matter. ln(1E37) ≈ 85, but it is only used when |z|^2 underflow/overflow.

| ln(|z|) | ≥ ln(1E6145) / 2 ≈ 7075

Decimal offset is at least 2 magnitude smaller.


BTW, in decimal128, ln(1E38) ≠ ln(10) * 38

10 LN 38 ×      → 87.49823353377373599268367527800583 1E38 LN           → 87.49823353377373599268367527800584

Numerically, ln(a^b) may not equal ln(a) * b It may be better not to waste the accurate constant.

#ifdef BCD_MATH
phloat m = scalbn(phloat(1), 38);
#else
phloat m = 0x2000000000000000; // 2^61
#endif
phloat b = -log(m);