data-apis / array-api-tests

Test suite for Python array API standard compliance
https://data-apis.org/array-api-tests/
MIT License

Compare numerical functions against an arbitrary precision library #7

Open asmeurer opened 4 years ago

asmeurer commented 4 years ago

See https://github.com/data-apis/array-api/pull/29.

We could compare numerical functions against an arbitrary precision library. The open question is how far off a result has to be before it counts as a failure, but at the very least we can report the largest deviation found (hypothesis makes this straightforward).
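A minimal sketch of what such a comparison could look like. To stay dependency-free it uses the standard library's `decimal` module as the high-precision reference instead of mpmath, and `math.exp` as the function under test; both choices are assumptions for illustration, and the helper names `ulp_error` and `largest_deviation` are hypothetical. It reports the worst error seen over a random sample rather than passing or failing.

```python
import math
import random
from decimal import Decimal, localcontext


def ulp_error(x: float) -> float:
    """Error of math.exp(x) against a 50-digit reference, in ULPs of the result."""
    got = math.exp(x)
    with localcontext() as ctx:
        ctx.prec = 50                      # far more digits than a double holds
        reference = Decimal(x).exp()       # Decimal(float) converts exactly
        diff = abs(Decimal(got) - reference)
        return float(diff / Decimal(math.ulp(got)))


def largest_deviation(samples: int = 1000, seed: int = 0):
    """Return (x, error_in_ulps) for the worst input found in a random sample."""
    rng = random.Random(seed)
    worst_x, worst_err = 0.0, 0.0
    for _ in range(samples):
        x = rng.uniform(-20.0, 20.0)
        err = ulp_error(x)
        if err > worst_err:
            worst_x, worst_err = x, err
    return worst_x, worst_err
```

Calling `largest_deviation()` gives a single number to put in a report; what threshold, if any, should count as a failure is left open.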

mpmath is a good option: it is pure Python and well tested, since it is used inside SymPy.

One technical issue is that mpmath's arbitrary-precision floats have an effectively unbounded exponent range, unlike machine floats, which overflow and underflow. As far as I can remember, that is the main difference between an mpmath.mpf with dps=15 and a machine float, but there may be other differences that I'm not remembering.
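To make the range issue concrete, here is a small sketch that uses `fractions.Fraction` from the standard library as a stand-in for a reference with unbounded range. This is an assumption for illustration only: it is not mpmath, and `to_machine_float` is a hypothetical helper. The point is that a reference value can be perfectly finite while the matching machine float is `inf` or `0.0`, so a comparison has to map the reference into float range before deciding what the expected result is.

```python
import math
from fractions import Fraction


def to_machine_float(exact: Fraction) -> float:
    """Map an exact reference value to the double a machine float could hold."""
    try:
        # Values too small in magnitude quietly round to zero or a subnormal.
        return float(exact)
    except OverflowError:
        # Too large for a double even after rounding: treat as overflow.
        return math.inf if exact > 0 else -math.inf
```

For example, `to_machine_float(Fraction(10) ** 400)` overflows to infinity even though the exact value is an ordinary rational number.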

kgryte commented 4 years ago

As discussed in the proposal, except for base cases or operations whose accuracy is explicitly mandated for IEEE 754 compliance, we cannot fail an implementation based on how far off it is from, e.g., an arbitrarily precise result. We can only report deviations and let array libraries compare their results against those of other array library implementations. Such reporting tooling could be made independent of the test framework.

asmeurer commented 4 years ago

Does this mean that transcendental functions shouldn't be tested at all, aside from the specific input/output pairs listed in the spec? If sin(pi) returns 2.0, that's accurate to within 1e2.

The test suite is going to need some reporting tooling anyway, so I think this can be useful. The report would need more than just "pass" or "fail": for numerical functions, it should list how accurate they are. Also, hypothesis has a nice trick that Zac showed me on our call that lets it search for the largest deviation. Instead of just saying "the tests found an x where sin(x) is off from the best value by 1e-15", it can say "the tests found the value x that maximizes the difference of sin(x) from the best value".
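One Hypothesis feature that fits this description is `hypothesis.target()`, which steers example generation toward inputs that maximize a reported score. Whether that is the exact trick meant above is an assumption. The sketch below also assumes `math.exp` as the function under test and the standard library's `decimal` module as the reference (rather than mpmath and `sin`), and the `worst` record is a hypothetical stand-in for real reporting tooling.

```python
import math
from decimal import Decimal, localcontext

from hypothesis import given, settings, target, strategies as st

# Hypothetical report record: the worst input seen and its error in ULPs.
worst = {"x": None, "ulps": 0.0}


def ulp_error(x: float) -> float:
    """Error of math.exp(x) against a 50-digit reference, in ULPs of the result."""
    got = math.exp(x)
    with localcontext() as ctx:
        ctx.prec = 50
        reference = Decimal(x).exp()       # Decimal(float) converts exactly
        diff = abs(Decimal(got) - reference)
        return float(diff / Decimal(math.ulp(got)))


@settings(max_examples=200, database=None, deadline=None)
@given(st.floats(min_value=-20.0, max_value=20.0))
def test_exp_accuracy(x):
    err = ulp_error(x)
    # Ask Hypothesis to search for inputs that make this score larger.
    target(err, label="ULP error of math.exp")
    # Record instead of asserting: the test reports accuracy, it never fails.
    if err > worst["ulps"]:
        worst.update(x=x, ulps=err)
```

Running `test_exp_accuracy()` and then reading `worst` gives the input with the largest observed error, which is the kind of figure a report could list next to each numerical function.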