cmpute / dashu

A library set of arbitrary precision numbers implemented in Rust.
Apache License 2.0
74 stars 9 forks source link

List of operations that produce NaN? #29

Open kameko opened 1 year ago

kameko commented 1 year ago

The documentation for dashu_float says that operations that produce NaN will panic. I'd like to be able to check for such conditions before performing those operations, so I can catch them and return Err instead. However, I don't know all possible operations that can panic, and the source code doesn't make it obvious. I'm checking for the obvious right now in my code, checking for any operations that work with infinity, etc.. But I think the documentation should specifically list every operation that can panic.

Additionally, it would be nice if the standard library actually had some kind of would_panic or results_in_nan function to explicitly check that for you, or stuff like try_div or try_mul, etc., but I'm just thinking out loud.

cmpute commented 1 year ago

Thanks for raising this up. I will think about adding the NaN cases to the documentation (as I'm writing a guide to be published with the release v0.4). Currently as not many operations are implemented for the FBig type, the potential sources of NaNs are:

Common approach to make user aware of NaNs includes:

I think providing a good documentation and let the user handle corner cases is the way to go.

Shoeboxam commented 6 months ago

Another source of panics are overflow/underflow. For example, the following, which underflow panics instead of rounding down:

FBig::<Down>::try_from(-6871947604880523000.)?.exp()
Shoeboxam commented 6 months ago

It's also not possible to anticipate settings where the calculation might overflow. Take for example, the following, which panics:

let lhs = f64::MAX;
let rhs = f64::MIN_POSITIVE;
FBig::<Up>::try_from(lhs)? + FBig::<Up>::try_from(rhs)?

However, the naive check you might run beforehand doesn't anticipate the issue:

if (lhs + rhs).is_finite() {
    // dashu code here still runs and panics
    return Ok(FBig::<Up>::try_from(lhs)? + FBig::<Up>::try_from(rhs)?)
}

For my own use, I've wrapped the dashu API with catch unwind and panic handlers.

cmpute commented 6 months ago

Well the underflowing/overflowing cases are indeed hard to catch them all beforehand, I'm open to provide a fallible API for this case. My main concern is to add unwrap for every operations, even for the built-in ops.

I have several viable options in mind, although none of them is perfect:

  1. Let every operation return a Result: this leads to adding unwrap everywhere
  2. Let every operation return a special struct (e.g. FpResult), this struct should implement the same API as the normal floats, and it should carry all the error information including NaN/underflow/overflow/etc: this leads to a lot of redundant code and extra maintenance cost.
  3. Only let the ops implemented by the Context object return a Result: the ops implemented by Context has some overhead compared to normal float ops, because it has to do the precision check before each operation. However this overhead is pretty small when base 2 (or a power of 2) is used.

Please let me know which approach looks best to you, or if you have better ideas! Thanks! @kameko @Shoeboxam

cmpute commented 6 months ago

My favorite over the three is the last one, and when implemented as methods of FBig, the unusual output should be converted to corresponding value or panic: overflow -> inf, underflow -> 0, nan -> panic

Shoeboxam commented 6 months ago

All of these options strike me as viable solutions. Comments on each approach, respectively:

  1. I don't have any issue with handling results, because all my call sites already return results. Just impl From for your error type into my error type, and sprinkle in the try operator ?.
  2. I'm using dashu as a replacement for rug/GMP/MPFR, and they carry extra bits around to represent nan and infinities. I think this results in the best user API, although it sounds like it would be a greater burden on you.
  3. Do you think this might result in more work for you, as a maintainer, because it would double the number of public APIs to document/maintain?

I think all three approaches are somewhat similar- instead of pushing infinity/nan into a panic, you are packing it into a struct- either within a Result or in an FpResult.

Which approach is better depends on what you want your library to support: Right now the library doesn't allow nan/inf inputs or outputs. If you want the library to support only nan/inf outputs, then I'd pack those states into a result. If you want the library to support nan/inf inputs and outputs, like MPFR, then those states should be in your data type.

cmpute commented 6 months ago

Regarding nan/inf, I don't plan to support them as input. Operations on nan/inf are mathematically ill-defined, and they exists in programming because the precision is limited in computers. In my opinion, they are mostly useless when you want to do arbitrary precision operations. It's better to stick to f32/f64 when you find that you encounter nan/inf pretty often.

cmpute commented 6 months ago

Do you think this might result in more work for you, as a maintainer, because it would double the number of public APIs to document/maintain?

Making the ops associated with Context returning different types is okay, because they already differ with operations between FBig instances. It won't require much additional effort.