dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.25k stars 4.73k forks source link

System.Numerics.BigRational and System.Numerics.MathR #71791

Open c-ohle opened 2 years ago

c-ohle commented 2 years ago

Background and motivation

I want to suggest a novel arbitrary base number type for namespace System.Numerics.

I love C#. I miss a base number type for arbitrary precisission that is really usable in practice.
Such a number type should have the properties of a System.String:
a simple binary unique always normalized immutable readonly struct in machine-word size.
This would ensure that it can be passed around internally as quickly as possible and makes it efficient to use it for client types like vectors and matrices.

The general problem with NET and arbitrary number types.

It is so nice and easy to implement custom number types in C# by define numeric operators and implicit or explicit conversions.
For value struct types it works perfectly as long as they have a fixed size.
However, there are major problems for any kind of arbitrary precision type.
No problem if we have a single operation like: a = b + c;
The the + operator function can alloc the necessary memory resources, create and return a final object as needed for a.
But in practice we have expressions like this: a = (b c + d e) / f;
We usually have many intermediate results, much more than final results, but the intermediate results don't require memory allocs and setting up fully qualified, normalized final result objects.
The C# compiler simply generate code that calls the operator functions and the operator function gets no indication that only an intermediate result is needed, which could be managed in buffers.

BigInteger and BigRational.

Both types are closely related.
They need the same arithmetic big-integer calculation core whereby, for efficient rational arithmetics it needs some more bit level functions..
It looks like a logical consequence to setup a BigRational type from two BigInteger.
But that only doubles the problems:
BigInteger has already a 2 machine-word size, a BigRational based on BigInteger gets 4.
A Vector3R with X, Y, Z gets 12 and so on.
For every interimes result we get 12 times allocations and unnecessary normalizations.
This is far from any efficency.

The new approach, a BigRational type for NET that solves the problems

To bypass the mentioned problems I want to propose a general new system with significantly better performance and better memory usage. At first the data structure:

  public unsafe readonly struct BigRational : IComparable<BigRational>, IEquatable<BigRational>, ...
  {
    private readonly uint[] p; // the one and only instance field what gives BigRational a machine word size

    public class CPU // nested for access to BigRational uint[] p
    {
       private int i; private uint[][] p; //only this two fields, necessary to build a stack-machine
    }

    [ThreadStatic]
    private static CPU? cpu; // a static threadlocal shared instance of a CPU object
    public static CPU task_cpu => cpu ??= new CPU(); // accessor as static property  
  }

This is all, BigRational has only one array field and therefore the desired machine-word size and has furthermore the immutable properties like a string object. A nested public class CPU what represents a stack-machine and encapsulates the calculation core. Nested as only class that has access to the private BigRational creator and access to the data pointer - for safety. Everything as small and secure as possible, no additional system resources required. Not even predefined default values, nothing.

The data format used in the BigRational uint[] data pointer is the same as used in the stack-machine stack entries and is also simple as possible:

image

The first uint a sign bit and the number of following uint digits for the numerator, followed by a uint with the number of unit digits for the denominator.
This sequential format has the benefit that the internal integer arithmetics need only one pointer and not ranges, conversions, case handling etc. and this allows the best possible troughput.
A BigRational with null data pointer is defined as the number Zero (0) and therfore the value struct default is a valid number.

The Interfaces

BigRational itself acts like a empty hull to encapsulate the data pointer and provides a save first level user interface for a number type with all the number typical operators:

  public readonly struct BigRational :
    IComparable<BigRational>, IComparable, IEquatable<BigRational>, IFormattable, ISpanFormattable,
    INumber<BigRational>, ISignedNumber<BigRational>, IPowerFunctions<BigRational>, IRootFunctions<BigRational>,
    IExponentialFunctions<BigRational>, ILogarithmicFunctions<BigRational>, ITrigonometricFunctions<BigRational>,
    IHyperbolicFunctions<BigRational>
  {
    public override readonly string ToString();
    public readonly string ToString(string? format, IFormatProvider? provider = default);
    public readonly bool TryFormat(Span<char> destination, out int charsWritten, ReadOnlySpan<char> format = default, IFormatProvider? provider = null); public static BigRational Parse(ReadOnlySpan<char> value, IFormatProvider? provider = null);
    public static BigRational Parse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider? provider);
    public static BigRational Parse(string s, NumberStyles style, IFormatProvider? provider);
    public static BigRational Parse(string s, IFormatProvider? provider);
    public static bool TryParse(ReadOnlySpan<char> s, NumberStyles style, IFormatProvider? provider, out BigRational result);
    public static bool TryParse([NotNullWhen(true)] string? s, NumberStyles style, IFormatProvider? provider, out BigRational result);
    public static bool TryParse(ReadOnlySpan<char> s, IFormatProvider? provider, out BigRational result);
    public static bool TryParse([NotNullWhen(true)] string? s, IFormatProvider? provider, out BigRational result);
    public readonly int WriteToBytes(ref Span<byte> destination);
    public static BigRational ReadFromBytes(ref ReadOnlySpan<byte> value);
    public override readonly int GetHashCode();
    public override readonly bool Equals([NotNullWhen(true)] object? obj);
    public readonly bool Equals(BigRational b);
    public readonly int CompareTo(object? obj);
    public readonly int CompareTo(BigRational b);
    public readonly int CompareTo(long b);
    public BigRational(float value);
    public BigRational(double v);
    public static implicit operator BigRational(byte value);
    public static implicit operator BigRational(sbyte value);
    public static implicit operator BigRational(ushort value);
    public static implicit operator BigRational(char value);
    public static implicit operator BigRational(short value);
    public static implicit operator BigRational(int value);
    public static implicit operator BigRational(uint value);
    public static implicit operator BigRational(long value);
    public static implicit operator BigRational(nint value);
    public static implicit operator BigRational(nuint value);
    public static implicit operator BigRational(ulong value);
    public static implicit operator BigRational(Int128 value);
    public static implicit operator BigRational(UInt128 value);
    public static implicit operator BigRational(BigInteger value);
    public static implicit operator BigRational(Half value);
    public static implicit operator BigRational(float value);
    public static implicit operator BigRational(double value);
    public static implicit operator BigRational(NFloat value);
    public static implicit operator BigRational(decimal value);
    public static explicit operator BigRational(string value);
    public static explicit operator byte(BigRational value);
    public static explicit operator sbyte(BigRational value);
    public static explicit operator short(BigRational value);
    public static explicit operator char(BigRational value);
    public static explicit operator int(BigRational value);
    public static explicit operator ushort(BigRational value);
    public static explicit operator long(BigRational value);
    public static explicit operator ulong(BigRational value);
    public static explicit operator nint(BigRational value);
    public static explicit operator nuint(BigRational value);
    public static explicit operator Int128(BigRational value);
    public static explicit operator UInt128(BigRational value);
    public static explicit operator BigInteger(BigRational value);
    public static explicit operator Half(BigRational value);
    public static explicit operator float(BigRational value);
    public static explicit operator double(BigRational value);
    public static explicit operator NFloat(BigRational value);
    public static explicit operator decimal(BigRational value);
    public static explicit operator ReadOnlySpan<uint>(BigRational value);
    public static BigRational operator +(BigRational a);
    public static BigRational operator -(BigRational a);
    public static BigRational operator ++(BigRational value);
    public static BigRational operator --(BigRational value);
    public static BigRational operator +(BigRational a, BigRational b);
    public static BigRational operator -(BigRational a, BigRational b);
    public static BigRational operator *(BigRational a, BigRational b);
    public static BigRational operator /(BigRational a, BigRational b);
    public static BigRational operator %(BigRational a, BigRational b);
    public static BigRational operator +(BigRational a, long b);
    public static BigRational operator -(BigRational a, long b);
    public static BigRational operator *(BigRational a, long b);
    public static BigRational operator /(BigRational a, long b);
    public static BigRational operator +(long a, BigRational b);
    public static BigRational operator -(long a, BigRational b);
    public static BigRational operator *(long a, BigRational b);
    public static BigRational operator /(long a, BigRational b);
    public static bool operator ==(BigRational a, BigRational b);
    public static bool operator !=(BigRational a, BigRational b);
    public static bool operator <=(BigRational a, BigRational b);
    public static bool operator >=(BigRational a, BigRational b);
    public static bool operator <(BigRational a, BigRational b);
    public static bool operator >(BigRational a, BigRational b);
    public static bool operator ==(BigRational a, long b);
    public static bool operator !=(BigRational a, long b);
    public static bool operator <=(BigRational a, long b);
    public static bool operator >=(BigRational a, long b);
    public static bool operator <(BigRational a, long b);
    public static bool operator >(BigRational a, long b);
    public static int Sign(BigRational a);
    public static bool IsInteger(BigRational a);
    public static bool IsNaN(BigRational a);
    public static BigRational Abs(BigRational a);
    public static BigRational Min(BigRational a, BigRational b);
    public static BigRational Max(BigRational a, BigRational b);
    public static BigRational Truncate(BigRational a);
    public static BigRational Floor(BigRational a);
    public static BigRational Ceiling(BigRational a);
    public static BigRational Factorial(int a);
    public static BigRational Round(BigRational a);
    public static BigRational Round(BigRational a, int digits);
    public static BigRational Round(BigRational a, int digits, MidpointRounding mode);
    public static BigRational Pow(int a, int b);
    public static BigRational Pow(BigRational a, int b);
    public static BigRational Pow(BigRational x, BigRational y, int digits);
    public static BigRational Pow2(BigRational x, int digits);
    public static BigRational Sqrt(BigRational a, int digits);
    public static BigRational Log2(BigRational x, int digits);
    public static BigRational Log10(BigRational x, int digits);
    public static int ILog10(BigRational a);
    public static BigRational Log(BigRational x, int digits);
    public static BigRational Exp(BigRational x, int digits);
    public static BigRational Pi(int digits);
    public static BigRational Pi();
    public static BigRational Tau(int digits);
    public static BigRational Tau();
    public static BigRational Sin(BigRational x, int digits);
    public static BigRational Cos(BigRational x, int digits);
    public static BigRational Tan(BigRational x, int digits);
    public static BigRational Asin(BigRational x, int digits);
    public static BigRational Acos(BigRational x, int digits);
    public static BigRational Atan(BigRational x, int digits);
    public static BigRational Atan2(BigRational y, BigRational x, int digits);
    public static BigRational GreatestCommonDivisor(BigRational a, BigRational b);
    public static BigRational LeastCommonMultiple(BigRational a, BigRational b);
    public static BigRational IDiv(BigRational a, BigRational b);
    public static BigRational IMod(BigRational a, BigRational b);
    public static BigRational DivRem(BigRational a, BigRational b, out BigRational r);
    public static (BigRational Quotient, BigRational Remainder) DivRem(BigRational a, BigRational b);
    public static BigRational NumDen(BigRational a, out BigRational den);
    public static BigRational Clamp(BigRational value, BigRational min, BigRational max);
    public static BigRational Pow(BigRational x, BigRational y);
    public static BigRational Sqrt(BigRational x);
    public static BigRational Cbrt(BigRational x);
    public static BigRational Hypot(BigRational x, BigRational y);
    public static BigRational Root(BigRational x, int n);
    public static BigRational Exp(BigRational x);
    public static BigRational Exp10(BigRational x);
    public static BigRational Exp2(BigRational x);
    public static BigRational Log(BigRational x);
    public static BigRational Log(BigRational x, BigRational newBase);
    public static BigRational Log10(BigRational x);
    public static BigRational Log2(BigRational x);
    public static BigRational Sin(BigRational x);
    public static BigRational Cos(BigRational x);
    public static BigRational Tan(BigRational x);
    public static BigRational Asin(BigRational x);
    public static BigRational Acos(BigRational x);
    public static BigRational Atan(BigRational x);
    public static BigRational Atan2(BigRational y, BigRational x);
    public static (BigRational Sin, BigRational Cos) SinCos(BigRational x);
    public static BigRational AcosPi(BigRational x);
    public static BigRational AsinPi(BigRational x);
    public static BigRational Atan2Pi(BigRational y, BigRational x);
    public static BigRational AtanPi(BigRational x);
    public static BigRational CosPi(BigRational x);
    public static BigRational SinPi(BigRational x);
    public static BigRational TanPi(BigRational x);
    public static BigRational Acosh(BigRational x);
    public static BigRational Asinh(BigRational x);
    public static BigRational Atanh(BigRational x);
    public static BigRational Cosh(BigRational x);
    public static BigRational Sinh(BigRational x);
    public static BigRational Tanh(BigRational x);
    public static CPU task_cpu { get; }
    public static int DefaultDigits { get; set; }
    public static int Radix => 1; //todo: check rational Radix ?
    public static BigRational One => 1u;
    public static BigRational Zero => 0;
    public static BigRational NegativeOne => -1;
    public static BigRational AdditiveIdentity => 0;
    public static BigRational MultiplicativeIdentity => 1u;
    public static bool IsZero(BigRational value) => value.p == null;
    public static bool IsNegative(BigRational value) => Sign(value) < 0;
    public static bool IsPositive(BigRational value) => Sign(value) > 0;
    public static bool IsEvenInteger(BigRational value);
    public static bool IsOddInteger(BigRational value);
    public static bool IsCanonical(BigRational value) => true;
    public static bool IsComplexNumber(BigRational value) => true;
    public static bool IsFinite(BigRational value) => !IsNaN(value);
    public static bool IsImaginaryNumber(BigRational value) => false;
    public static bool IsInfinity(BigRational value) => false;
    public static bool IsNegativeInfinity(BigRational value) => false;
    public static bool IsPositiveInfinity(BigRational value) => false;
    public static bool IsRealNumber(BigRational value) => true;
    public static bool IsNormal(BigRational value) => true;
    public static bool IsSubnormal(BigRational value) => false;
    public static BigRational MaxMagnitude(BigRational x, BigRational y);
    public static BigRational MaxMagnitudeNumber(BigRational x, BigRational y);
    public static BigRational MinMagnitude(BigRational x, BigRational y);
    public static BigRational MinMagnitudeNumber(BigRational x, BigRational y);    
    static bool INumberBase<BigRational>.TryConvertFromChecked<TOther>(TOther value, out BigRational result);
    static bool INumberBase<BigRational>.TryConvertFromSaturating<TOther>(TOther value, out BigRational result);
    static bool INumberBase<BigRational>.TryConvertFromTruncating<TOther>(TOther value, out BigRational result);
    static bool INumberBase<BigRational>.TryConvertToChecked<TOther>(BigRational value, [NotNullWhen(true)] out TOther? result) where TOther : default;
    static bool INumberBase<BigRational>.TryConvertToSaturating<TOther>(BigRational value, [NotNullWhen(true)] out TOther? result) where TOther : default;
    static bool INumberBase<BigRational>.TryConvertToTruncating<TOther>(BigRational value, [NotNullWhen(true)] out TOther? result) where TOther : default;    
  }

Documentation for the particular functions at GitHub

For best possible compatibility the high level functions like several Round, Truncate, Floor, Ceiling, a set of Log and Exp functions up to trigonometric functions Sin, Cos, Tan, Atan,... specials like Factorial etc.
These all in a static extra class MathR as mirror to System.Math and System.Numercs.MathF and similiar as possible.
This is to reflect the old known standards and to make it easy to change algorythms from for example double to BigRational or back, to check robbustness problems.

A look at the BigRational and MathR functions always shows the same scheme.
Gets the task_cpu and execute instructions on it:

    public static BigRational operator *(BigRational a, long b)
    {
      var cpu = task_cpu; cpu.push(b); cpu.mul(a); return cpu.popr();
    }

    public static BigRational Round(BigRational a, int digits, MidpointRounding mode)
    {
      int f = 1;
      switch (mode)
      {
        case MidpointRounding.ToZero: f = 0; break;
        case MidpointRounding.ToPositiveInfinity: if (rat.Sign(a) < 0) f = 0; else f = 4; break;
        case MidpointRounding.ToNegativeInfinity: if (rat.Sign(a) > 0) f = 0; else f = 4; break;
      }
      var cpu = rat.task_cpu; cpu.push(a);
      cpu.rnd(digits, f); return cpu.popr();
    }    

    public static Vector3R Cross(in Vector3R a, in Vector3R b)
    {
      // return new VectorR3(a.Y * b.Z - a.Z * b.Y, a.Z * b.X - a.X * b.Z, a.X * b.Y - a.Y * b.X);
      var cpu = BigRational.task_cpu;
      cpu.mul(a.X, b.Y); cpu.mul(a.Y, b.X); cpu.sub();
      cpu.mul(a.Z, b.X); cpu.mul(a.X, b.Z); cpu.sub();
      cpu.mul(a.Y, b.Z); cpu.mul(a.Z, b.Y); cpu.sub();
      return new Vector3R(cpu.popr(), cpu.popr(), cpu.popr());
    }

All calculation happens in the CPU object. The high-level API functions are all implemented as short or longer sequences of CPU instructions. It all breaks down in such sequences and this is in general short and fast code and allows a rich API with less memory usage. Some example client types using this system: ComplexR, Vector2R, Vector3R, Matrix3x4R, PlaneR, VectorR. A special is VectorR, This struct has also only one pointer containing a index and a sequence of the rational data structures and can therefore store lot of numbers very efficently. It is an example for client types taht it is easy using BigRational.CPU but independent from BigRational, using a private data format.

The CPU - a stack-machine

Like every stack-machine the CPU has instructions like push,push(double), push(int),... and pop's like pop(), BigRational popr(), double popd(),... self explaining instructions like add(), sub(), mul(), div(), mod(), neg(), abs(), inv(), dup(). But there is a difference:
Since we have numbers with arbitrary size mov() instructions are not efficient because it would imply many memory copies.
Instead of this it fast as possible to swap the stack entry pointer only since the stack is a uint[][] array. This explains the versions of swp() instructions.
A pop command frees nothing. It only decrements the index of the stack top and the buffer there keeps alive for next operations. Allocs are only necessary if the current stack top buffer is not big enough for the next push op but this is a successive rar case. Not only for the high-level API, the stack-machine itself uses the stack excessively for all internal interimes results and therefore dont need stackalloc or shared buffers. It forms a consistent system and enables fast and compact code without special handling.

The CPU class reduced to the public instruction set:

   public sealed class CPU
    {
      public CPU(uint capacity = 32)
      public uint mark()
      public void push()
      public void push(BigRational v)
      public void push(int v)
      public void push(int v, int n)
      public void push(uint v)
      public void push(long v)
      public void push(ulong v)
      public void push(float v)
      public void push(double v)
      public void push(double v, bool exact)
      public void push(decimal v)
      public void push(BigInteger v)
      public void push(ReadOnlySpan<uint> v)
      public void get(uint i, out BigRational v)
      public void get(uint i, out double v)
      public void get(uint i, out float v)
      public void get(uint i, out ReadOnlySpan<uint> v)
      public BigRational popr()
      public double popd()
      public int popi()
      public void pop()
      public void pop(int n)
      public void swp()
      public void swp(int a = 1, int b = 0)
      public void swp(uint a, uint b)
      public void swp(uint a)
      public void dup(int a = 0)
      public void dup(uint a)
      public void neg(int a = 0)
      public void neg(uint a)
      public void abs(int a = 0)
      public void add()
      public void add(int a, int b)
      public void add(uint a, uint b)
      public void add(uint a)
      public void add(BigRational a)
      public void add(BigRational a, BigRational b)
      public void sub()
      public void sub(int a, int b)
      public void sub(uint a, uint b)
      public void sub(BigRational a, BigRational b)
      public void mul()
      public void mul(int a, int b)
      public void mul(uint a, uint b)
      public void mul(uint a)
      public void mul(BigRational a)
      public void mul(BigRational a, uint b)
      public void mul(BigRational a, BigRational b)
      public void div()
      public void div(int a, int b)
      public void div(uint a, uint b)
      public void div(BigRational a, BigRational b)
      public void div(BigRational a, int b)
      public void sqr(int a = 0)
      public void sqr(uint a)
      public void inv(int i = 0)
      public void shl(int c, int i = 0)
      public void shr(int c, int i = 0)
      public void and()
      public void or()
      public void xor()
      public void mod(int f = 0)
      public void rnd(int c, int f = 1)
      public void pow(int x, int y)
      public void pow(int y)
      public void fac(uint c)
      public int bdi()
      public void lim(uint c, int i = 0)
      public int sign(int a = 0)
      public int cmpa(int a = 0, int b = 1)
      public int cmpi(int a, int b)
      public int cmp(int a = 0, int b = 1)
      public int cmp(uint a, uint b)
      public int cmp(uint b)
      public int cmp(BigRational a, BigRational b)
      public int cmp(BigRational b)
      public bool equ(BigRational a, BigRational b)
      public bool equ(uint a, uint b)
      public void norm(int i = 0)
      public uint hash(uint i)
      public uint msb()
      public uint lsb()
      public bool isi()
      public void gcd()
      public void tos(Span<char> sp, out int ns, out int exp, out int rep, bool reps)
      public void tor(ReadOnlySpan<char> sp, int bas = 10, char sep = default)
      public void sqrt(uint c)
      public void log2(uint c)
      public void log(uint c)
      public void exp(uint c) //todo: catch cases fast exp(1), ...
      public void pi(uint c) //todo: c = f(c), cache
      public void sin(uint c, bool cos)
      public void atan(uint c)
}

It is noticeable that there are many different versions of the same instruction. Since we have number values with arbitrary size it is all to avoid unnecessary memory copies. For example, to add two BigRational numbers it is possible to push both values to the stack, and call a simple add(). But since the data structure on the stack is the same as in BigRational pointers the CPU can access and read the data pointer directly and push the result only on the stack what saves two mem copies and is therfore faster. Other operation versions working with absolute stack indices what is more efficent in many situations. For example if one function pushes a vector structure and another function takes this as input.

The Efficiency

On an example. The CPU implementation of the sine function, which can also calculate cosines.
This function uses several identities and a final Taylor-series approximation.
The code looks ugly, but it reduces the computation to a sequence of just 36 short CPU instructions. It means that providing a rich set of such common math functions already at this level doesn't involve much overhead.

 public void sin(uint c, bool cos)
  {
    var e = bdi() << 2; c += 16; if (e > c) { } // bdi x
    var m = mark(); pi(e > c ? unchecked((uint)e) : c); shr(1); // push PI2
    if (cos) add(1, 0); // x += PI2 
    div(m - 1, m); mod(); swp(); pop(); // push divmod(x, PI2)
    var s = this.p[this.i - 1][1]; // dup(); var s = pop_int(); // seg
    mul(0, 1); neg(); add(2, 0); pop(); // x - trunc(x / PI2) * PI2
    if ((s & 1) != 0) { swp(); sub(); } else pop(); // x = PI2 - x, pop PI2
    if ((s & 2) != 0) neg(); lim(c); // x = -x, lim x
    push(6u); dup(1); dup(); sqr(); lim(c); swp(); // 3!, x^2, x
    for (uint i = 1, k = 4; ; i++, k += 2)
    {
      mul(1, 0); lim(c, 1); dup(1); div(0, 3); // x +/-= x^n / n!, n = 3, 5, 7, ...
      var b = -bdi(); if (b > c) break; lim(c);
      if ((i & 1) != 0) neg(); add(4, 0); pop(); lim(c, 3); // x += 
      push(k * (k + 1)); mul(3, 0); pop(); mul(1, 0); // n! *=, x^n *=
    }
    pop(4);
  }

To come back to the begin, the problem with BigInteger and a Rational type based on BigInteger. Imagin the same sine calculation with such type. Where every instruction involves not only two new full allocated BigIntegers for a result, Rational itself needs as example, for a simple addition 3mul+1div+1*add and this are all new allocated BigIntegers. But not enough, Rational has to normalize or round the fractions - additional many new BigIntegers. It should be obvious that this is inefficient, slow and leads to an impractical solution.

To show the differences I made the Mandelbrot example in BigRationalTest app. It does exactly this. A conventional BigRational class based on a BigInteger for numerator and denominator is part of the Test app and called OldRational. At the begin BigInteger profits as it can store small numbers in the sign field but then, deeper in the Mandelbrot set, it slows down exponentially before it stops working in a short break between endless GC loops.

image

The bottle-nack for rational arbitrary arithmetic is the integer multiplication and for the normalization the GCD function and the integer division. The first benchmarks showing that the BigRational calculation core is aproximatly 15% faster then the equivalent functions in System.Numerics.BigInteger. At least on my machine. With other words, using BigRagtional for pure integer arithmetics can improve the performance already. For integer algorithms using the CPU the effect is much bigger of course..

BigInteger in NET 7 will use Spans more stackalloc and shared buffers. This benchmarks i made with a NET 7 preview versions are showing that this reduces a little bit the memory pressure but further degreads the performance, especialliy for big numbers. It is probably the use of System.Buffers.ArrayPool.Shared what is not the fastest by a threshold of 256 for stackalloc. These are problems that are not existing for a stack-machine solution.

Latest benchmark results can be found Latest-BigRational-Benchmarks; (Same benchmarks with NET 7 preview, latest mater-branch a lost of around 10% performance) It is currently under construction, but I will be adding more tests daily.

Further development

There is great potential for further improvements. In the calculation core itself, in the algorithms. Furthermore, frequently used values ​​like Pow10, PI etc. should efficiently cached, numbers that are currently always new calculated for functions like Exp and Log; and this would give another big performance boost.

I'm looking for a solution using static AsyncLocal<CPU>? _async_cpu; instead of only [ThreadStatic] CPU _task:cpu; This would be more user friendly and save for await functions etc. but access to _async_cpu is 3x times slower, I need to find a trick to detect if a _async_cpu request is necessary, maybe a quick stack baseadress check or something.

The set of MathR functions should be extended and handle the DefaultPrecision property as it seems to work with a hypothetical floating point type that allows for such precision.

Especially geometric algorithms are very sensitive to epsilon and robustness problems. A vector extension, compatible with the current System.Numercis.Vector types would be useful. No one with BigInteger experiences would believe that it makes sense, that it's even possible, especially not in NET. But the Polyhedron 3D example shows: It works for thousands of BigRational Vertex, Points and Planes, for Polygon Tessellation, Boolean Intersections of Polyhedra (CSG) and this in real time at animation time:

BigRationalAnimation

Remarks

The current implementation is optimized for X64, tuned for RyuJIT code generation. For better performance on X86 it would be necessary to put special X86 code in critical pathes. Same goes for ARM, but I don't currently have a development environment for it.

I have created a C++ COM implementation of the CPU where the CPU instructions are COM interface calls. Unfortunately, for C# such a solution is about 30% slower due to the many platform calls. The same for a C++/CLI solution, which is also about 30% slower. But the COM solution inproc in a C++ app is almost 2x faster because of the powerful C++ optimization.

I checked the possibility of using AVX, SSE, Bmi intrinsics for optimizations. Unfortunately none of the test implementations were faster than the simple scalar arithmetic currently in use. Same experiences as published by many other programmers. Only Bmi2.X64.MultiplyNoFlags (MULX r64a, r64b, reg/m64) could really improve the performance but unfortunatlely, the corresponding ADCX and ADOX instructions are currently not implemented in Bmi2.X64.

From my point of view the best choice for an System arbitrary-precision number type is BigRational, not BigInteger. For BigRational all other types are only special cases. This applies for BigInteger, BigFloat, BigDecimal, BigFixed, in general floating-point types in any fixed or arbitrary size. Finally these are all rational number types, only with restricted denumerators, BigInteger 1, BigFLoat or in general floating-point 2^n , BigDecimal 10^n and so on. It is easy and efficent to implement such types very compact based on the BigRational CPU. The CPU has all the instructions that are necessary to calculate for the specific number types, otherwise it's easy to add. No problem to use own data formats as shown in example: VectorR. Only one core is necessary and the resources can shared for several client types. Not necessary to implement every time again a new calculation core. The focus should be on highly optimizing this CPU core for all possible platform- and processor configurations, and all client types would benefit.


More technical details, example codes etc. can be found at:

c-ohle commented 2 years ago

may require multi-threading

There are so many multi-threading examples in the test app alone.. Starting with Mandelbrot, The 3D animations, most of the realtime (at animation time) mesh calculations ranning multithread, model building from tesselation extrusions, text objects etc up to the really expensive CSG ops. This is all BigRational, testing the VectorR - Vector interaction becouse I use your Vectors in the 3D engine.

c-ohle commented 2 years ago

(And the 3D engine is also self written as you can check.)

c-ohle commented 2 years ago

And would we get only little support from the compiler devs. That it is not more necessary to have such "or" expression. All problems would be solved and also BigInteger cuold easily 10 times faster in expressions with only some lines code as I can show in my implementation. This support is not a big logical impact in the system. It must not even any bublic new future. It could be a simplle call at start and end of a parse sequence to special name marked notify function and it would be done. This would help for so many other solutions too, from string building up to classes that can reset her caches in a determinated way. It works also over function call, I found ways that it works withoiut problem together with debug visualizer and and and

c-ohle commented 2 years ago

And it is super safe in the meaning, you saw it byself, the upcoming issues with such builder in expressions.

c-ohle commented 2 years ago

[SpecialNameAttribute]
private static void op_Help(bool begin) { }
tannergooding commented 2 years ago

I'm not saying that the CPU setup doesn't work as intended nor am I saying that it isn't performant. I'm saying that it doesn't look like a good fit for the BCL. There are many reasons iterated for this above including the complexity, the new pattern, the naming conventions used, concerns around usability and limitations enforced due to its use of AsyncLocal or ThreadLocal data, etc.

Types, especially number types, provided by the BCL should generally be basic building blocks on which other devs can build their own algorithms and functionality. They should be simple/easy to use, they should meet existing expectations of numeric types, they should play well with existing algorithms, they should play well with the new generic math feature, etc. Such types ideally expose properties and fields that users will expect on the type. For example, a Rational like number is typically expected to directly expose a Numerator and Denominator property and since they are properties accessing the data should typically be cheap. A BigFloat or BigDecimal like type however has no such expectation, its only expected that the former behaves like other IEEE 754 types and the latter like an arbitrary length base-10 type.

After all the "basic points" around usability and meeting user expectations are considered, then we can take a look at perf and how we can internally optimize or provide an additional set of types/APIs to accelerate the base type. For BigInteger, the biggest cost today is that around long chains of expressions resulting in many temporary throwaway buffers. This is essentially the same "cost" that exists for other "large immutable" data, such as ImmutableArray<T>, or even "dynamically sized" data, such as List<T>. One way to help solve that is improved pooling usage another is to provide a mutable version of the type that can be incrementally built on before it is "frozen" (i.e. the Builder pattern). There are of course other options as well such as your CPU based approach or building expression trees and having the "execution" of the expression tree do optimizations to minimize the total work done.

Each of these approaches has differing upsides and downsides and its up to the BCL to pick one that can be exposed to users such that they have the highest chance of success and where success is defined using multiple metrics, not just having the code be as fast as it can be. Having the code be understandable and maintainable over time tend to be more important overall, particularly in large scale applications and services.

I am more than happy to look at providing a Rational like type to help cover the user needs here, but the backing implementation for it needs to consider all the things detailed above as well as other considerations dictated by the Framework Design Guidelines or API review. That's what I'm ultimately trying to push and drive towards here, is reaching a common ground where we can define a type that API review will approve and that can be reasonably implemented and then used by customers so they can achieve success.

c-ohle commented 2 years ago

I agree, and I've been working all day (we have 18:00) to find a solution for exactly this. This is maybe more important than fine tuning in the algorithms what I would prefer todo.

c-ohle commented 2 years ago

You are maybe scaptical and you have to be skeptical. But all the criteria we spoke about, security, conventions, styling rules up to details like T op<T, T>... (change naming conventions and improve protection no problem) I know that the current approach completely fullifies this. Epecially also in the expressions... If I could find a solution to avoid the "or" trick and also this with a clean and save solution.

c-ohle commented 2 years ago

I assume you have something like a test environment for INumber. There is a NuGet package: "BigRationalNET7preview" (link) You could check and just for fun (the or operator is still in) a same calculation with x = 0 | .... The effect is amaizing.

tannergooding commented 2 years ago

Relying on tricks like that is a non-starter. It will never pass API review.

c-ohle commented 2 years ago

it's a trial

c-ohle commented 2 years ago

@tannergooding With request to check. String formatting - does it fulfills the requirements?

Standard numeric format strings (link used as spec)

Custom numeric format strings (link used as spec)

Default ToString()

Of course, all this can be changed quickly. Full custom format support could be included. The BigRational.Parse versions read all specials without extra settings back. (repetition form, "Q" format). All in detail + examples

c-ohle commented 2 years ago

@tannergooding update

c-ohle commented 2 years ago

@tannergooding please - with request to check. A new Builder solution for BigRational and BigInteger - but does it fulfills the requirements?
(It's also a perfect substitute for the "OR" trick.)

How it looks to the user:

  BigRational a, b, c, d, x;
  x = a * b + c * d; // standard notation, 3x alloc/normalization
  x = (BigRational.Builder)a * b + c * d; // one final alloc/normalization, up to 3x faster
  x = (BigRational.Builder)a * b + (c - 1) * BigRational.Sin(d); // parentheses and function calls inclusive.

To test and show I added a small BigInteger class called BigInt (code) where it works exactly the same:

  BigInt a, b, c, d, x;
  x = a * b + c * d;
  x = (BigInt.Builder)a * b + c * d;
  x = (BigInt.Builder)a * b + (c - 1) * BigInt.Pow(d, 100); 

So far it should conform to NET standards, except that it also works for parenthesized expressions and function calls.
However, this is easy to change, but would require writing more builder casts explicitly, which quickly becomes confusing.

How it works:

The implementation: (complete code)

    public readonly ref struct Builder
    {
      readonly BigRational p; // the one and only field

      public static implicit operator Builder(BigRational v); // set of implicit's
      public static implicit operator Builder(long v);
      public static implicit operator Builder(double v);
      //...

      public static Builder operator +(Builder a, Builder b); // set of operator's
      public static Builder operator -(Builder a, Builder b);
      public static Builder operator *(Builder a, Builder b);
      //...

      public static implicit operator BigRational(Builder v); // finally, "ToResult()", as implicit cast
   }

Almost identical, the code for a BigInteger solution.

Remarks

tannergooding commented 2 years ago

An expression x = a * b + c * d is really:

BigInteger t1 = a * b;
BigInteger t2 = c * d;

x = t1 + t2;

Inserting a cast such as the following is problematic for a few reasons:

x = (BigInt.Builder)a * b + c * d;

To start, this only effectively changes the expression to be:

BigInteger.Builder t1 = new BigInteger.Builder(a);
t1 *= b;

BigInteger t3 = c * d;
x = t2 + t3;

It cannot optimize or handle t3 because there is no way to access the originally created builder. You cannot statically or globally cache a backing "stack" because there is no guarantee that some expression t1 = a * b and some expression t2 = c * d are sub-expressions of the same higher expression. They could be completely unrelated expressions with the user interleaving operations.

Additionally, casting from a T to a T.Builder is not really a desirable operation. This conversion has cost and will allocate/copy because a Builder needs to assume it can mutate the backing array, so it can't just take the data "as is". We typically want to have such costs more readily visible and having constructors or other patterns for creating the builder from a T tend to be better. Having operations that do T.Builder op T is fine however because the RHS is only used as readonly data.

I strongly believe that the stack based approach is not the right approach here. If publicly exposed, it forces users into a more complex pattern and while there is indeed some benefit to such an approach, the complexity cost for something that is meant to be a "simple to use number type" is too high. If used as an internal implementation detail, it becomes far too restrictive on what developers can do. Both of these lead the common usage scenarios into being pits of failure.

Because of this, such functionality would be better suited to a third-party package so that the more advanced/complex path is opt-in and only for when the user really needs it.

c-ohle commented 2 years ago

@tannergooding thanks for response.

No big discussion is necessary at this point. If it's not good enough, I'll have to find a better solution. More interesting, the other general questions - string formatting - the other Builder type options. What do you think?

Just as note:

BigInteger.Builder t1 = new BigInteger.Builder(a);

It is simply impossible to construct such cases because there are neither constructors nor functions. Operators only, but using this we're back to standard notation and everything is fine and the behavior is clear defined.

It cannot optimize or handle t3...

Regardless of parentheses and function calls, the compiler always generates a flat sequence of operator calls when parsing. The explicit cast from number to number.builder and implicitly back is always the first and the last event in the sequence. The problem is rather the opposite - the operator has no reference to parentheses - difficult to switch the optimizaton of.

no guarantee that some expression t1 = a b and some expression t2 = c d...

Therefore not possible without a cast: t1 = (Builder)a b; The case: builder t1 = (builder)a b; is something that looks dangerous, but improper use can be detected at runtime.

This conversion has cost and will allocate/copy...

Since it is a ref structure, it costs nothing, in fact only the operator calls are needed as an event. But of course it's helpful to host a reference to the value for quick access.

tannergooding commented 2 years ago

It is simply impossible to construct such cases because there are neither constructors nor functions

I called out that this isn't the type of scenario where a cast is desirable due to the hidden cost. The standard pattern is for a conversion via constructors/functions.

Regardless of parentheses and function calls, the compiler always generates a flat sequence of operator calls when parsing

That is an implementation detail, not a guarantee. You have an expression that results in at least two intermediates needing to exist (one for a * b and one for c * d) before the addition can occur. Whether these intermediates exist in a register, in a local, or in some stack (concrete or virtual) is an implementation detail and the IL happening to itself use a "stack based" approach to represent these temporaries is often not how it is compiled down in practice.

Therefore not possible without a cast: t1 = (Builder)a b; The case: builder t1 = (builder)a b; is something that looks dangerous, but improper use can be detected at runtime.

Detecting improper use at runtime, rather than compile time, is often viewed as a "pit of failure". It means that it is much easier to miss an issue or introduce a bug.

Since it is a ref structure, it costs nothing, in fact only the operator calls are needed as an event. But of course it's helpful to host a reference to the value for quick access.

The ref struct doesn't have overhead but it must track the "storage space" for the computed values. Due to this being "big data" and the builder needing to grow to fit the size of the result, this must be either some GC or Native memory allocation (with it likely being a GC allocation due to the potential pit of failures around handling Native allocation lifetimes).

Every builder is, therefore, at least 1 pointer (4/8 bytes) in length. Realistically, it is likely 12-16 bytes instead, as we typically need to track the backing data, the length, and some flags. You can of course do tricks like tracking just a byte[] and having the length, flags, and data all be "inline" (much like string/array do internally), but that also makes things quite a bit more complicated to implement/maintain.

Every builder likewise needs to support "growing" the backing data for the array. This means that you cannot (without flags tracking that the data cannot be mutated) do a "zero cost" conversion of a T to a T.Builder. Likewise, since a * b is likely going to produce something that is bigger than either a or b you often need another allocation anyways. Unlike the T, the T.Builder however can be allowed to track a "too big" allocation and therefore you can avoid this additional allocation by presizing the builder to be "big enough" (much as you can do with a List<T>).

There are many considerations that have to be taken into account around usability, where allocations come into play, where those allocations actually have cost vs where they won't show up, etc.

c-ohle commented 2 years ago

@tannergooding

The standard pattern is for a conversion via constructors/functions

Yes for real Builder solution and yes the name Builder for this thing is misleading. I had no better idea how to call it. "BuilderActivator", "BuilderDelegator", "TempValue", ...

That is an implementation detail, not a guarantee...

The approach is completely independent from this since all depends on the initial- and the final cast and this must happen first and last. It's actually only a safe workaround to simulate the missing op_Assign operator. Also the operators and conversions, there is no science behind, it is only to provide a NET conform and safe interface. Something better than the "OR" trick.

The ref struct doesn't have overhead but it must track the "storage space"

Not really, as ref struct the life time is short, the buffer or builder is there anyway and only needed for back cast's. There is no need for further information like flags etc. And as you mentioned, we can not relay on this and this is exactly the reason why I did not such things in the example implementations.

Every builder likewise needs to support "growing" the backing ...

Behind it remains all the same. It is all only to avoid unnecessary mem allocs, copy's and normalizations.

tannergooding commented 2 years ago

The approach is completely independent from this since all depends on the initial- and the final cast and this must happen first and last.

The point is that it is functionally equivalent to the example I gave and so there aren't "enough" casts/conversions. A second builder may be needed for c * d if you want to reduce the allocations as much as possible.

Not really, as ref struct the life time is short, the buffer or builder is there anyway and only needed for back cast's.

You must have a storage space to track the result. Given a * b you must have a destination buffer that is at least a.Length + b.Length in width. In order to reduce allocations across many operations you must be able to use something as both a source and destination, therefore mutating it. This generally also necessitates "overallocating" (that is allocating a buffer that is larger than the stored data is big). This in turn needs you to track the amount of the allocated buffer that is actually in use. If you want creating a T.Builder from an existing T that also necessitates tracking a flag that means "I can't mutate this and must allocate".

Behind it remains all the same. It is all only to avoid unnecessary mem allocs, copy's and normalizations.

I was trying to elaborate that the ref struct only removes an unimportant and unexpensive short lived allocation. While removing such allocations can sometimes pay off, it will not meaningfully impact most operations and comes with a disadvantage around utilizing the builder in more complex scenarios (including around async code). The cost around things like BigInteger today are around repeated allocations and copying for complex expressions. Its the same cost that comes from allocating a List<T> with a capacity of 4 and then adding 1000 individual items to it. You end up resizing the array and copying the data around 9-10 times, where presizing the capacity to 1000 means you remove all that overhead.

c-ohle commented 2 years ago

@tannergooding (the movs between calls removed)

x = (BigInt.Builder)a b + c d; 00007FFD7B16B566 call System.Numerics.Rational.BigInt+Builder.op_Implicit(System.Numerics.Rational.BigInt) (07FFD7B62C030h)
00007FFD7B16B580 call System.Numerics.Rational.BigInt+Builder.op_Multiply(Builder, System.Numerics.Rational.BigInt) (07FFD7B62C0D8h)
00007FFD7B16B59A call System.Numerics.Rational.BigInt.op_Multiply(System.Numerics.Rational.BigInt, System.Numerics.Rational.BigInt) (07FFD7B620AC8h)
00007FFD7B16B5B4 call System.Numerics.Rational.BigInt+Builder.op_Addition(Builder, System.Numerics.Rational.BigInt) (07FFD7B62C0A8h)
00007FFD7B16B5C7 call System.Numerics.Rational.BigInt+Builder.op_Implicit(Builder) (07FFD7B62C138h)

c-ohle commented 2 years ago

@tannergooding Please, this is only a option to provide more performance. The solution itself does not depend on it. The other 3 Builder types - the questions regarding string formatting, also questions regarding INumber, this would be helpful.

c-ohle commented 2 years ago

@tannergooding To make sure I can give verified statements I checked everything again, latest NET 7 code BigInteger, ArrayPool<>.Shared etc. even latest Roslyn code.

The ref struct Builder is gone. For realistic tests and benchmarks I started add a new external type BigInt (sourcecode) that will get exactly the same functions and interfaces then System.Numerics.BigInteger.

In principle there is no difference. static class BigIntegerCalculator uses ArrayPool<>.Shared memory - SaveCPU, should been seen as a black box or VM and is a shared object by self. The CPU internal buffers are private, used as work memory, allocated on demand. BigInteger and BigRational, both systems using ThreadLocal static root(s), however ArrayPool<>.Shared.Rent() is measured min 5x slower then a simple ThreadStatic field access and BigIntegerCalculator need many rent's where the CPU has the private buffers for this.

Where is the problem with BigRational memory management?

That and the efficiency of the systems already make a difference in average performance of over 20%. This for simple integer arithmetic - for rational arithmetic the effect is much bigger. With 0 | (a * b + a * b - a + a * a) / a; the same benchmark shows 70% but this only as info about what would be possible.

Formula to have all bottleneck functions (incl. a sqr) in: (a * b + a * b - a + a * a) / a; (all latest benchmarks) image

acaly commented 2 years ago

Maybe not related to this issue, but BigRational is not the best choice in calculating Mandelbrot. We need a BigFloat, which is just extending from double further to have more significands.

Thaina commented 2 years ago

I don't think it's possible to have constant number of variable for irrational number. For rational it possible for just 2 number so we can make a struct out of it. But irrational realm can be chained and so I think it not worth to support them in language

c-ohle commented 2 years ago

@acaly yes of course. mainly intended to get a test case with massive calculations and a width range of number complexity and to test thread safety. But there is another aspect. BigRational acts like a rational number. The CPU however is a general solution with very special bit level operations and can emulate integers and floating point arithmetic. There are for example fast binary round ops to emulate floating-point behavior: forget the last bits. Tools needed for example in Taylor series where interime results quickly would get much more precision then then necessary for the result precisission. For rational all other number types are only special cases, Integer: denumertor=1, for FP den = 2^n, decimal 10^n etc. The CPU however can handle all cases, also and, or, xor, etc. can also emulate non existing FP types with short seqences of cpu op's.

c-ohle commented 2 years ago

@Thaina The day someone invents an irrational type of number, I will delete the repository. Yes of course, it's more to have a arbitrary number type for at first, precision in algorithms sensitive for epsilon and robustness problems, geometrics cut of lines, planes etc, intersections always perfect exact - fp can only estimate and wrong estimation and the algorythm fails. Such things. Other case is, when it strarts with irratonals, to have a solution to calculate in any desired precision range to emulate FP with much higer precision. So much possebillities. The cpu has already support for Int256, Int512, Int1024,... one fast op for all int/uint conversions that have the same cheme. So much possebillities with minimal code.

theuserbl commented 1 year ago

I want a BigDecimal like this one https://github.com/AdamWhiteHat/BigDecimal

But with more functions than pow(). It needs also a BigDecimal of Pi, e, Sin(), Cos, Tan(), etc.

c-ohle commented 1 year ago

@theuserbl Hi, Meanwhile, BigRational supports all math functions. Results rounded to appropriate decimal places and it matches BigDecimal numbers and their behavior with better performance and memory efficiency.

An IEEE-754 value struct template Type: Decimal<T> is under development. Like the already implemented Float<T>, Int<T>, UInt<T> ... Decimal<T> will support fixed sizes eg. 64, 128, 256,... bytes with appropriate precision. The advantage of these types is that no GC allocations are necessary at runtime. The BigRational calculation core is used and so the effort for the templates is minimal.

theuserbl commented 1 year ago

@c-ohle Nice. But why haven't you added a pull request here? Isn't it better to have BigRational integrated in .NET and possibly later in PowerShell ?

Possibly the .NET maintainers have some terms, like that "Copyright (c) 2022 Christian Ohle" will be changed to "Copyright (c) .NET Foundation and Contributors". And possibly a rename to BigDecimal. Because that makes more sense. There existing Integer and BigInteger. But with BigRational, there dont existing a normal Rational. But .NET have a normal Decimal.

tannergooding commented 1 year ago

As per our API review process, all new API surface area must first go through and be approved by the API Review team: https://github.com/dotnet/runtime/blob/main/docs/project/api-review-process.md

As per the discussion above, the area owners do not feel the proposed surface area/implementation are the right fit for inclusion in the BCL itself.

There is some interest in providing some type of BigDecimal/BigRational, but it needs to fit in with the .NET Framework Design Guidelines and other types exposed by the BCL in the numerics area.

c-ohle commented 1 year ago

@tannergooding Hi, something completely new is coming soon. If you want we could work together.

c-ohle commented 1 year ago

@tannergooding a new BigRational class - in short:

BigRat.cs (single file solution, just copy and try)

**Would that meet the requirements?***

New approach:

This as a reference version. Major improvements/optimizations are possible. Of course, the stackallocs will be secured. And a cache would bring other big performance improvements, especially for the irrational functions.

theuserbl commented 1 year ago

@c-ohle Nice. But I still think, that it is the wrong place, to publish the code.

The right place is, to add the code per "Pull requests" and not posting it only under "Issues".

tannergooding commented 1 year ago

Putting up a PR is not the right approach. All new APIs must first have their open proposal which is accepted by the area owners and then taken to and approved by API review. If all that happens, then a PR is allowed to be put up.

The entire process required is seen here: https://github.com/dotnet/runtime/blob/main/docs/project/api-review-process.md

I'm still not a fan of the overall approach being used and don't believe it fits or considers the needs/scenarios at large. It has taken into account some of the more fundamental issues that were raised initially. However, the code as is not super readable, which makes it difficult to review and determine if it is acceptable or not (lots of one liners and other things that violate the general coding style/rules for the repo): https://github.com/dotnet/runtime/blob/main/docs/coding-guidelines/coding-style.md

If it were in box, it would most likely not support downlevel frameworks. It would be .NET 8+ exclusive and the names of APIs would need to follow the general Framework Design Guidelines around naming: https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/naming-guidelines

Not supporting concepts such as Infinity immediately cuts out a number of core scenarios. The same could be said for NaN, but that is somewhat more debatable as the concept that NaN supports could be handled a bit differently instead. The handling for irrational values also exposes various complications and makes the use of the APIs potentially non-deterministic or lossy depending on the scenario in question.