fredrik-johansson / fungrim

Fungrim: the Mathematical Functions Grimoire
MIT License
113 stars 15 forks source link

Verification of formulas #13

Open certik opened 5 years ago

certik commented 5 years ago

Right now each formula has to be manually tested, and it's actually very easy to make a typo, and not easy to quickly see if the formula is at least roughly correct.

It would be nice to have a module in Fungrim or SymPy that will test the formula. For a start, let's say we want to test the formula implemented in #11.

One issue is that such tests might be time consuming, and so they probably should not be run by default on a CI, but one should be able to manually run them on the command line.

fredrik-johansson commented 5 years ago

This is exactly what I want to do.

SymPy could be used, but I suspect it's not hard to do it from scratch. All that is needed after plugging in the random values is to simplify a constant symbolic expression, which is a lot easier than simplifying with variables. Numerical comparisons could be done with Arb (better when possible) or mpmath.

The most elegant solution would be to start with the formula, plug in random values, and have an algorithm that automatically finds other Fungrim formulas to apply until the predicate in the formula has been simplified to a boolean value (hopefully True!). But that's definitely a challenge (probably a hard research problem) to do well.

I agree that such tests will be time consuming, and testing entries manually will be necessary. With a compiled version of the database and a fast C formula backend, running a small number of test iterations for each entry will perhaps be feasible.

asmeurer commented 5 years ago

Some of the SymPy tests would be testing SymPy just as much as fungrim (not necessarily a bad thing), because SymPy would just implement the formula directly. In other cases, such as integrals, SymPy would compute the formula from some lower level algorithm, so it could be trusted better. One can also use heuristics like computing power series of both sides, or differentiating both sides, etc. (the exact applicable heuristics would depend on the formula), and seeing if you get the same result after simplification. Since it generally takes different code paths to compute the series expansion, say, of two different types of expressions, if you get the same result, that's a good indication that it is correct, since a bug in SymPy to produce that would have to be coincidental. It's not a proof, obviously. If you want a proof, you have to set up all the formulas in a theorem prover, which would be a lot of work.

For the third idea, getting it to do it automatically is very hard problem (you are basically building an automated theorem prover). But if you can provide it with the substitution steps manually, and then have it verify the proof, it is much more tractable.

A good first step would be to have SymPy code for each formula. The C testing can use SymPy's code generation. SymPy may do some unwanted automatic simplification on some formulas, so that would need to be handled somehow.