rzikm / NetQD

.NET implementation of the double-double and quad-double techniques, providing floating-point types with almost 128-bit and 256-bit precision.
MIT License

Simple test shows precision not better than double #2

Open theGuffa opened 3 years ago

theGuffa commented 3 years ago

I was checking out this library, but I find that its results are no more precise than plain double arithmetic.

First, here is an implementation of the conversion from decimal that doesn't truncate the value to double precision:

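public static explicit operator DdReal(decimal value) {
    // High part: the decimal rounded to a double.
    double a = (double)value;
    // Low part: what is left after subtracting the high part converted back to decimal.
    double b = (double)(value - (decimal)a);
    return new DdReal(a, b);
}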

Using that I can successfully convert a decimal value to DdReal and back to decimal without losing any precision.

I put together this to test the correctness of the arithmetic:

private static Random _rnd = new Random();

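// Returns a random decimal in [0, 1) that uses all 28 decimal places.
private static decimal DecimalRnd() {
    decimal sample = 1m;
    while (sample >= 1) {
        // Low and middle 32-bit words of the 96-bit integer: fully random bits.
        byte[] buf = new byte[8];
        _rnd.NextBytes(buf);
        int a = BitConverter.ToInt32(buf, 0);
        int b = BitConverter.ToInt32(buf, 4);
        // High word is capped so the integer can only barely exceed 10^28;
        // the loop rejects the rare sample that is >= 1.
        int c = _rnd.Next(542101087);
        sample = new Decimal(a, b, c, false, 28);
    }
    return sample;
}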

public static void Test() {
    decimal a = DecimalRnd();
    decimal b = DecimalRnd();
    DdReal aa = (DdReal)a;
    DdReal bb = (DdReal)b;
    Console.WriteLine($"Decimal: {a} + {b} = {(a + b)}");
    Console.WriteLine($"DdReal:  {(decimal)aa} + {(decimal)bb} = {(decimal)(aa + bb)}");
    Console.WriteLine($"Decimal: {a} - {b} = {(a - b)}");
    Console.WriteLine($"DdReal:  {(decimal)aa} - {(decimal)bb} = {(decimal)(aa - bb)}");
}

Example output:

Decimal: 0,1814580729426223105261687684 + 0,8262370913565243172775595613 = 1,0076951642991466278037283297
DdReal:  0,1814580729426223105261687684 + 0,8262370913565243172775595613 = 1,0076951642991501004477916328
Decimal: 0,1814580729426223105261687684 - 0,8262370913565243172775595613 = -0,6447790184139020067513907929
DdReal:  0,1814580729426223105261687684 - 0,8262370913565243172775595613 = -0,6447790184139019789958151773

As you can see, the results agree with the decimal reference to only about 14 significant digits, which is no better than what you get with plain double arithmetic.
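One thing that may be worth ruling out in a test like this is the conversion back to decimal. The .NET double-to-decimal conversion is documented as rounding to 15 significant digits, so a result assembled from `(decimal)` casts of the high and low parts could lose digits even if the double-double arithmetic itself were exact. The following is a minimal sketch that checks the cast on its own; it does not use NetQD, and `DecimalCastCheck` and `RoundTripsExactly` are names made up for this illustration.

```csharp
using System;

public static class DecimalCastCheck
{
    // Hypothetical helper: true when a double survives a trip through decimal unchanged.
    public static bool RoundTripsExactly(double d) => (double)(decimal)d == d;

    public static void Main()
    {
        double third = 1.0 / 3.0;               // needs more than 15 significant digits
        Console.WriteLine((decimal)third);      // the cast keeps at most 15 significant digits
        Console.WriteLine(third.ToString("R")); // shortest string that round-trips the double
        Console.WriteLine(RoundTripsExactly(third));
        Console.WriteLine(RoundTripsExactly(0.5)); // 0.5 is short enough to survive the cast
    }
}
```

If the cast turns out to be the limiting factor, comparing results through a path that does not go through `(decimal)double` would separate conversion error from arithmetic error. This only suggests a possible contributor to the discrepancy; it does not by itself show that the library's arithmetic is correct.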


Update:

After trying to port several double-double libraries from other languages, I have come to the conclusion that it can't be done reliably in plain C#. According to the specification, a C# implementation is free to carry out double calculations at a higher precision than 64 bits, such as 80-bit extended precision. Because double-double implementations rely on every operation being rounded to exactly 64-bit precision, their results can't be guaranteed.
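For context, the dependence on exact 64-bit rounding comes from the error-free transformations that double-double arithmetic is built on. Below is a minimal sketch of Knuth's TwoSum, the usual building block for double-double addition. It is not taken from NetQD, and `ErrorFreeSum` is a name made up for this illustration. The error term is only exact if every intermediate result is rounded to 64-bit double. As far as I know, x64 JITs use SSE2 scalar instructions that round this way, so extended precision should mainly be a concern on the legacy x87 path; treat that as an assumption to verify on your target runtime.

```csharp
using System;

public static class ErrorFreeSum
{
    // Knuth's TwoSum: returns s = fl(a + b) and e such that a + b == s + e exactly,
    // assuming each operation below is rounded to 64-bit double.
    public static (double Sum, double Error) TwoSum(double a, double b)
    {
        double s = a + b;
        double bb = s - a;                     // the part of b that made it into s
        double e = (a - (s - bb)) + (b - bb);  // rounding error of a + b
        return (s, e);
    }

    public static void Main()
    {
        // 1e-20 is far below the precision of 1.0, so it should end up in the error term.
        var (s, e) = TwoSum(1.0, 1e-20);
        Console.WriteLine($"{s:R} {e:R}");
    }
}
```

If an intermediate such as `s - a` were kept at 80-bit precision, `e` would no longer equal the rounding error of the 64-bit sum, which is the failure mode described above.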

Ref: https://stackoverflow.com/questions/6683059/is-floating-point-math-consistent-in-c-can-it-be