fract4d / gnofract4d

A fractal generation program for linux

Arbitrary Precision math #77

edyoung opened this issue 4 years ago

edyoung commented 4 years ago

Allow zooming further than limits of double-precision math. Requires using a bignum package (likely GMP mp_z) throughout.

mindhells commented 4 years ago

Hey guys, I've added this example/experiment about how to tackle arbitrary precision support: https://github.com/HyveInnovate/gnofract4d/commit/edeea3874bda5929598b37cf8bbaaf1e782c5b69 What do you think? @edyoung @DX-MON I received a lot of useful feedback from you on my last PR, so if you're curious and have the time, I'd highly appreciate your feedback on this. We have some additional discussion here, in case you need some context: https://github.com/HyveInnovate/gnofract4d/issues/7

dragonmux commented 4 years ago

Certainly - I've left some comments against that commit's diff with my standardese hat on. Most of it looks really good though with some nice attention to detail.

The comment about destructors, by way of further explanation: if a destructor is not marked virtual and someone derives from the type, the standard says you have invoked lifetime UB, because only the destructor of the type used to refer to the instance will run. Marking the destructor virtual fixes this, as does marking the type final.
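
A minimal sketch of the rule being described (the type names are made up for illustration and are not from the gnofract4d codebase):

```cpp
#include <memory>

// Deleting a derived object through a base pointer is undefined behaviour
// unless the base class declares its destructor virtual.
struct BadBase
{
    ~BadBase() = default; // non-virtual: deriving and deleting via BadBase* is UB
};

struct GoodBase
{
    virtual ~GoodBase() = default; // virtual: the most-derived destructor runs
};

struct Derived final : GoodBase {};

// Alternatively, forbid derivation entirely so the problem cannot arise.
struct Sealed final
{
    ~Sealed() = default;
};

int main()
{
    std::unique_ptr<GoodBase> p = std::make_unique<Derived>();
    // ~Derived() then ~GoodBase() run here - well-defined because ~GoodBase() is virtual.
}
```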

mindhells commented 4 years ago

Thanks a lot for your comments. I can only agree on all of them.

About making the class final or not: my idea for this experiment is to lay out a kind of framework or interface for switching between different arbitrary precision libraries (MPFR in this case) while keeping the formula code the same.

About the comments on the formula loop/interface: to be honest I didn't pay much attention to the original code generated by the compiler... I just replaced the double type with MpDouble. Still, your feedback is very useful and can be applied to the fract4d_compiler package. (Actually the initial idea is not to have to change the compiler... since it's like a black box for me right now.)

What do you think about this strategy (new type with operator overloading)? Do you have another approach in mind?
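
For reference, a stripped-down sketch of the kind of operator-overloading wrapper being discussed, assuming MPFR as the backing library and a fixed precision; the real MpDouble in the linked experiment differs in detail:

```cpp
#include <mpfr.h>

// Minimal RAII wrapper around mpfr_t with operator overloading, so generated
// formula code can keep using expressions like a * b + c unchanged.
class MpDouble final
{
    mpfr_t value;
    static constexpr mpfr_prec_t precision = 256; // bits; fixed here for simplicity

public:
    MpDouble(double x) { mpfr_init2(value, precision); mpfr_set_d(value, x, MPFR_RNDN); }
    MpDouble(const MpDouble &other) { mpfr_init2(value, precision); mpfr_set(value, other.value, MPFR_RNDN); }
    ~MpDouble() { mpfr_clear(value); }

    MpDouble &operator=(const MpDouble &other)
    {
        mpfr_set(value, other.value, MPFR_RNDN);
        return *this;
    }

    friend MpDouble operator+(const MpDouble &a, const MpDouble &b)
    {
        MpDouble result{0.0};
        mpfr_add(result.value, a.value, b.value, MPFR_RNDN);
        return result;
    }

    friend MpDouble operator*(const MpDouble &a, const MpDouble &b)
    {
        MpDouble result{0.0};
        mpfr_mul(result.value, a.value, b.value, MPFR_RNDN);
        return result;
    }

    double toDouble() const { return mpfr_get_d(value, MPFR_RNDN); }
};

int main()
{
    const MpDouble a{1.5}, b{2.25};
    const MpDouble c = a * b + MpDouble{0.5};
    return c.toDouble() > 0.0 ? 0 : 1;
}
```

Note that every value and temporary carries an mpfr_init2/mpfr_clear pair, which relates to the allocation concern raised later in this thread.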

dragonmux commented 4 years ago

I think it's the right thing to do as it turns the C API into a neatly encapsulated C++ type with proper RAII semantics for managing the lifetime of the underlying C multi-precision type, resulting in good, correct, fast code - which is really all anyone can ask for :)

Additionally, if you'd like to be able to hot-swap the underlying implementation, that invites a pure-virtual type that defines the API, with final-marked implementations we can then swap between. This keeps the code both flexible and performant: any use of a concrete type that is also marked final won't go through the vtable, so you don't pay the cost of the flexibility.
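
A sketch of that pattern with hypothetical names (not an interface that exists in gnofract4d today):

```cpp
#include <memory>

// Pure-virtual type defining the arbitrary-precision API.
struct BigFloat
{
    virtual ~BigFloat() = default;
    virtual void setDouble(double value) = 0;
    virtual void addInPlace(const BigFloat &other) = 0;
    virtual double toDouble() const = 0;
};

// One hot-swappable backend; marked final so calls made through a MockFloat
// (rather than through a BigFloat&) can be devirtualised by the compiler.
struct MockFloat final : BigFloat
{
    double storage{}; // stand-in; a real backend would hold an mpfr_t or mpf_t here

    void setDouble(double value) override { storage = value; }
    void addInPlace(const BigFloat &other) override { storage += other.toDouble(); }
    double toDouble() const override { return storage; }
};

int main()
{
    // Flexible path: pick the backend at runtime through the abstract type...
    std::unique_ptr<BigFloat> x = std::make_unique<MockFloat>();
    x->setDouble(1.5);

    // ...or use the concrete final-marked type directly and skip the vtable.
    MockFloat y;
    y.setDouble(2.5);
    y.addInPlace(*x);
    return y.toDouble() == 4.0 ? 0 : 1;
}
```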

josecelano commented 4 years ago

Today @mindhells, @guanchor, and I were talking about different strategies to implement this feature. One of them is the one @mindhells has explained and implemented in the example above.

The main idea is: move the formula to C++ and introduce a new abstraction layer with a "smart" double type.

Pros and cons for that approach could be: pros:

cons:

Another idea could be: keep the formula code simple and move the hard work to the compiler. In this case, the compiler has to generate two versions: the regular double-precision one and the arbitrary-precision one. The formula will continue to look like "compiled" code; right now the output reads like a stack-oriented programming language.

pros:

cons:

Anyway, regardless of the approach, I think one of the first things to do is to find a good C library for arbitrary precision, or at least one that can be used to implement all the complex operations defined by the fractal language. I have also implemented another example in C using a higher-level library which also uses the mpfr package under the hood:

Arb dependencies (http://arblib.org/):

This is the example I wrote: https://github.com/josecelano/c-mandelbrot-arbitrary-precision Please, don't be cruel to me, I'm a newbie in C.
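
For flavour, a minimal sketch of one z = z*z + c step using Arb's complex type acb_t (this assumes Arb is installed; it is not taken from the repository above, and with newer FLINT releases the header path may differ):

```cpp
#include "acb.h" // Arb's complex interval type

// One iteration of z = z*z + c at 256 bits of precision. This is Arb's plain
// C API; it compiles as C or C++.
int main()
{
    const slong prec = 256;

    acb_t z, c;
    acb_init(z);
    acb_init(c);

    acb_set_d_d(z, 0.0, 0.0);   // z = 0
    acb_set_d_d(c, -0.75, 0.1); // c = -0.75 + 0.1i

    acb_mul(z, z, z, prec); // z = z*z
    acb_add(z, z, c, prec); // z = z + c

    acb_printd(z, 20); // print about 20 significant digits

    acb_clear(z);
    acb_clear(c);
    return 0;
}
```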

dragonmux commented 4 years ago

I have used GMP and found it to be Not Bad to work with in C++ with a properly written wrapper. It's available by default on all major Linux distros because of programs like GCC.

I would suggest that, as long as you don't mind it being a little more low-level, MPFR is a good choice too for the same reasons - it's already required for compilers and various other code bases to work.

This should reduce the effort required to keep a Mac OSX and Windows build in working order while making it mostly 0-effort for users on Linux.

edyoung commented 4 years ago

A couple of historical notes.

The formula is compiled to C instead of C++ because it didn't really require any C++ functionality, using C just as a 'portable assembly'. I had in mind that at some point it would be interesting to target something else (like a GPU), so the idea was to have the backend produce code that required minimal cleverness to compile. If I were to work on it myself I would probably take the approach of having the Python compiler backend generate different code for arbitrary-precision math; but if you want to tackle this in any direction, I'm fine with it :-)

It's also worth noting that plain C compiles much faster than clever templatized C++, which becomes relevant because this is a JIT compiler; we recompile every time a user changes a function parameter, for example. I'd recommend measuring the compilation time for a largish formula (like some of the more elaborate coloring algorithms in standard.ucl) to check whether compile time becomes an issue.

Also on the C vs C++ front, note that the interface between the dynamically-loaded code and the fract4d lib was deliberately extern "C" to avoid issues with users having a different C++ compiler version than the one I compiled with. ABI standardization may now make this a moot point.

From a perf point of view, one nice thing about the current generated code is that it does no memory allocations or function calls. This will be trickier to do with an MP math library.

I have not tried the different libraries. Doesn't look like MPIR provides a floating-point type though. Also FLINT appears to use http://arblib.org/ for floating point.

mindhells commented 4 years ago

I'm still betting on the C++ approach, maybe because I see modifying the compiler as too complicated. The idea behind the experiment I did is to keep the code generated by the compiler exactly the same, only replacing the double type.

I think @josecelano is betting against me :D

Thanks to your comment @edyoung I now see some points to work on to prove or discard this approach:

One last thought: if you look at the experiment I did, there's a new type MpDouble, which is a wrapper for the library that provides arbitrary precision. This wrapper doesn't need to be compiled along with the formula; it could be compiled beforehand (in the setup), like fract_stdlib, instead.

dragonmux commented 4 years ago

With regards to ABI standardisation:

Clang and GCC have an agreement that allows a C++ library compiled by GCC to be used in a Clang-built library, which is then used in a GCC-compiled executable; GCC likewise understands and abides by the MSVC ABI on Windows.

It is not strictly standardised, however, and you will not be able to use LTO objects in the process, as GCC doesn't understand LLVM IR and Clang doesn't understand GIMPLE. But we can depend on this non-LTO behaviour, as it is an explicit compatibility goal of the compiler projects.

That said, LTO objects are fine if you are building a .so with one compiler, as the link phase transforms all objects into machine-specific object code.

mindhells commented 4 years ago

@DX-MON I think I get it, but just to be sure: how is gnofract4d currently achieving that compatibility using extern "C"? Is that because the ABI for C is platform-specific but stable? On the other hand, I understand that if we make the fract4d_compiler package use the same compiler as the setup does (through distutils), it shouldn't be a problem. Which "flags" can be "dangerous" in this case?

dragonmux commented 4 years ago

Any symbol in C++ is allowed to be marked extern "C" to give it C linkage (no name mangling, which is what encodes type information in the symbol) as long as the symbol remains unique and does not violate the ODR (One Definition Rule).

This currently affords gnofract4d the ability not to worry about which compilers were used, because the C ABI is strongly defined - printf(), for example, will simply be printf in the symbol table, regardless of parameters or return type.

However, this comes with some notable downsides: it is now on us to ensure symbol uniqueness, as even a symbol inside a namespace will have that namespacing stripped to make it a C symbol, and the compiler and linker are no longer able to properly handle overloads on C-exported symbols or provide us end-to-end type safety.
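
To make that trade-off concrete, a small hypothetical example (the pf_calc name and signature are made up for illustration, not gnofract4d's actual interface):

```cpp
#include <cstdio>

namespace fract4d
{
    // Normal C++ inside the library: namespaces, overloads, mangled names.
    void calc(double x) { std::printf("calc(%f)\n", x); }
    void calc(double x, double y) { std::printf("calc(%f, %f)\n", x, y); }
}

// The exported entry point gets C linkage: the symbol is just "pf_calc", with
// no namespace and no parameter types encoded, so any compiler (or dlsym())
// can find it - but the linker can no longer distinguish overloads of it or
// check that the caller's idea of the signature matches this one.
extern "C" void pf_calc(const double *params, int paramCount)
{
    if (paramCount >= 2)
        fract4d::calc(params[0], params[1]);
}

int main()
{
    const double params[2] = {1.0, 2.0};
    pf_calc(params, 2);
}
```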

da2ce7 commented 4 years ago

Kalles Fraktaler 2 + has good support for Arbitrary Precision Math, maybe worth looking into.

edyoung commented 4 years ago

Thanks! You can see which libraries they are using here: https://code.mathr.co.uk/kalles-fraktaler-2/blobdiff/1916d62e2efa875370c7cdd79e105846b80e228e..c8b44ac7b2a19d778ab9359b2730fce8b718c4d4:/prepare.sh

josecelano commented 4 years ago

The XaoS project has some interesting ideas about "arbitrary precision" on https://github.com/xaos-project/XaoS/issues/24