dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Provide public API for non-throwing division of Decimal #29058

Open znakeeye opened 5 years ago

znakeeye commented 5 years ago

Decimal division operator throws when the result overflows. Please provide a public API to allow safe division without non-performant throw/catch. E.g.:

var numerator = 100m;
var denominator = 0m;

if (!numerator.TryDivide(denominator, out var result))
{
    // Overflow or division by zero. Handle it!
}

Just for the record: in .NET Framework the division operator calls into a non-public native FCallDivideOverflowed API, where this piece of information was - unfortunately - also hidden from the public APIs:

private static extern void FCallDivideOverflowed(
     ref Decimal d1, ref Decimal d2, ref bool overflowed);
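A minimal managed sketch of the proposed shape, for illustration. `TryDivide` is the name from the proposal, not an existing framework method, and internally it still pays the throw/catch cost because decimal exposes no non-throwing division path today:

```csharp
using System;

public static class DecimalExtensions
{
    // Hypothetical TryDivide following the shape proposed above: returns
    // false instead of throwing on divide-by-zero or overflow. Until a
    // real non-throwing API exists, try/catch is the only managed option.
    public static bool TryDivide(this decimal numerator, decimal denominator, out decimal result)
    {
        try
        {
            result = numerator / denominator;
            return true;
        }
        catch (DivideByZeroException)
        {
            result = default;
            return false;
        }
        catch (OverflowException)
        {
            result = default;
            return false;
        }
    }
}
```

This keeps call sites exception-free, but it does not address the performance complaint: the exceptions are still thrown and caught under the covers.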
MarcoRossignoli commented 5 years ago

/cc @tannergooding

Clockwork-Muse commented 5 years ago

Er, division by zero is (almost? always) going to be programmer error, and fairly trivial to detect beforehand. It's something you want your program to be screaming about, so it gets fixed.

Overflow is also (usually) going to be programmer error (although much more difficult to detect ahead of time in the general case); this is because the range of the domain is, in almost all cases, far smaller than the range of the space allocated.

tannergooding commented 5 years ago

CC @pentp, as he generally has good opinions around the System.Decimal type.

My initial thought, based on prior conversations, is that division by zero doesn't make sense for the System.Decimal type due to its existing use-cases.

It would be more appropriate for a decimal type (such as the IEEE 754 decimal types) that supports NaN, +/-Infinity and +/-Zero.
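The distinction can be seen by contrasting decimal with double, which follows IEEE 754 semantics (a small sketch; note that a variable is used for the decimal case, since `1m / 0m` with constants is a compile-time error):

```csharp
using System;

class Program
{
    static void Main()
    {
        // double follows IEEE 754: division by zero yields Infinity, no throw.
        double zero = 0.0;
        double d = 1.0 / zero;
        Console.WriteLine(double.IsPositiveInfinity(d)); // True

        // decimal has no NaN, Infinity, or signed zero, so it can only throw.
        decimal decZero = 0m;
        try
        {
            decimal m = 1m / decZero;
        }
        catch (DivideByZeroException)
        {
            Console.WriteLine("decimal throws");
        }
    }
}
```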

pentp commented 5 years ago

decimal division will overflow only if you try to divide a really big number by something less than 1 (effectively trying to multiply). What would be the use case where such divisions are part of the normal control flow?
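This condition can be demonstrated directly (a sketch; decimal.MaxValue is roughly 7.9e28):

```csharp
using System;

class Program
{
    static void Main()
    {
        decimal big = decimal.MaxValue;   // 79228162514264337593543950335

        decimal fine = big / 2m;          // quotient shrinks: no overflow
        Console.WriteLine(fine);

        try
        {
            // Dividing by 0.5 is effectively multiplying by 2, so the
            // quotient (~1.58e29) no longer fits in decimal's range.
            decimal overflowed = big / 0.5m;
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow");
        }
    }
}
```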

znakeeye commented 5 years ago

You go for the decimal type when you need to accurately store and calculate on very large numbers, e.g. in banking or similar. In our 10000-user system, these overflows have proven to be a significant problem. Our calculation engine (run on a tree structure) crashes deep inside the formulas used, and it's actually expected that the values sometimes overflow. In our case, there could be a hundred thousand similar operations running in parallel.

These overflows usually happen after a number of operations within advanced formulas; i.e. addition, subtraction, multiplication and division. Add to that some currency conversion, manual input, and you soon realize that it's virtually impossible to handle this on the formula level. Hence, a try/catch.

We would raise a top-priority Microsoft issue if we believed this could be solved in .NET Framework, without going unsafe/native. Now we are looking at .NET Core as an alternative. Meanwhile, in order to avoid an extreme performance hit from the thrown exceptions, we came up with a hacky solution where we avoid dividing with "very low" values. Not very robust, but it keeps our system alive.
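The "avoid very low denominators" workaround can be made slightly more principled: division can only overflow when |denominator| is smaller than |numerator| / decimal.MaxValue. The sketch below illustrates this (the helper name is invented, and because the threshold itself is a rounded decimal, exact boundary cases should still fall back to try/catch):

```csharp
using System;

public static class SafeDivision
{
    // Hypothetical guard in the spirit of the workaround described above.
    // Returns false when dividing would throw (zero denominator) or could
    // overflow (quotient would exceed decimal's range).
    public static bool CanDivideSafely(decimal numerator, decimal denominator)
    {
        if (denominator == 0m)
            return false;

        decimal absNum = Math.Abs(numerator);
        decimal absDen = Math.Abs(denominator);

        // With |denominator| >= 1 the quotient can never exceed |numerator|.
        if (absDen >= 1m)
            return true;

        // Otherwise overflow is possible only below this threshold. The
        // threshold is itself rounded, so boundary cases should still be
        // guarded with try/catch.
        return absDen > absNum / decimal.MaxValue;
    }
}
```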

Please do consider exception-friendly decimal APIs.

Clockwork-Muse commented 5 years ago

Let's think about this a bit.

Consider that the range of decimal allows for 28 digits; the Gross World Product was ~$75.59 trillion, or 15 digits. Your "very small values" would need to have their most significant digit on the order of 13 places after the decimal point to overflow in such a situation.
I find it unlikely that whatever your business is doing involves tracking more "wealth" than the planet generates in a year.

In most cases of currency conversion, at worst the most significant digit is going to be in the hundreds/thousands range (2 or 3 digits).
If you're handling the extreme case of, for example, older Zimbabwean dollars, the overflows you'd experience are NOT telling you "we should switch to non-throwing methods", but instead "we need to use a type with an increased range". Because you'd still need to correctly calculate and output the results, and overflowing wouldn't let you do so, regardless of whether it was an explicit Exception subclass or returning an overflow code. However, given that sort of edge case is extremely niche, I find it unlikely that the overflows from such data would represent a significant portion of exceptions (and hence should not represent a large portion of "valid" overflow exceptions).

If you accept manual input, you need to validate it before you use it in calculations. Numbers above a trillion or below a thousandth are probably suspicious (although I'm not fully up on the domain).

Note that, if there's a non-throwing version of a method, you should expect somebody to use it without checking for success. Hopefully your code reviews would catch such a thing.

znakeeye commented 5 years ago

No, let's not think about this a bit. Instead, let's consider a real-life scenario and add to that some common sense.

The fact alone, that the native APIs provide the "overflow" state as a boolean, should be enough to motivate a similar managed API. Then, let's consider a real-life scenario (I expect at least two perfunctory thumbs-up now):

  • A price is set to 1234.56789
  • User sets a discount of 1234.5678 to get a price of 0.
  • Oops. The result is 0.00009.
  • This value is then propagated through hundreds of additional calculations. This is where we currently need to catch overflows.

And no, we cannot round the value. Nor can we know if subsequent operations will overflow or not. Super low values are valid and should be valid. For most cases they will work, but some edge cases give us exceptions when we really don't need that. We are talking about extremely hot paths here. "Skeet" optimized hot paths.

Clockwork-Muse commented 5 years ago
  • A price is set to 1234.56789
  • User sets a discount of 1234.5678 to get a price of 0.
  • Oops. The result is 0.00009.
  • This value is then propagated through hundreds of additional calculations. This is where we currently need to catch overflows.
  1. 0.00009 isn't small enough to trigger overflows in any reasonable range (see previous calculation) during division. You'd have to be charging for individual atoms or something.
  2. If it was multiplied by a sufficiently large number to cause overflow, the prior value (1234.56789) would also cause overflow - and much sooner.
  3. Addition or division would have to push it so far away from the original value, 0.00009 isn't even going to be noise.

... what are your "additional calculations"? You'd have to be way outside the bounds of any reasonable domain.

And no, we cannot round the value.

When to round can often be mandated by government fiat, but I'll give you this for now.

Nor can we know if subsequent operations will overflow or not.

It can be difficult to tell, true.

Super low values are valid and should be valid.

Define "super low". 0.00009 isn't "super low" by any stretch of the imagination. It's reasonably in-range. I find it unlikely anything with more than 10 decimal places would be in-range in financial calculations.

For most cases they will work, but some edge cases give us exceptions when we really don't need that.

I reiterate that my assumption is that your system has a bug of some sort, and is valiantly trying to alert you to this. Psychic debugging suggests to me that perhaps you're trying to recursively multiply a discount percentage or tax rate, which are unlikely to be valid below hundredths.

MgSam commented 5 years ago

@Clockwork-Muse I always find it strange when people report real-life problems they're encountering and then someone else comes along and essentially tells them they're wrong without having any real knowledge of the system in question. You're fooling yourself if you believe you're so smart you know this person's business domain better than he knows it himself.

As a trivial example, take your example of $75.59 trillion and represent that in Zimbabwe dollars. (Several years ago, 100 trillion Zimbabwe dollars were worth about 40 US cents.) Try using that exchange rate with decimal and see what happens.

I work with an application where users define formulas themselves. They can write literally whatever they want, and we then need to run said calculations in a high performance scenario. All values are in decimal. Anything that we can do to prevent exceptions for the arbitrary formulas being processed is useful.

Though even more critically, the functionality being asked for is already exposed in the native function. You haven't made a single argument for why exposing it in .NET would somehow be detrimental.

pentp commented 5 years ago

I'm not convinced that these real-world scenarios are actually very realistic, but let's say there was a flag instead of an exception - what would you do then with that flag (bearing in mind that the decimal value itself is garbage at that point)?

znakeeye commented 5 years ago

Please come and join our project (seriously). You'll be amazed how real these scenarios are.

If an overflow occurs, we would clamp values in either direction and likely maintain a state that calculations may be inaccurate.

The fact that the boolean flag was removed for performance reasons certainly adds value to the discussion! For sure, we don't want a flag if it would reduce performance in the standard case.

Clockwork-Muse commented 5 years ago

If an overflow occurs, we would clamp values in either direction and likely maintain a state that calculations may be inaccurate.

likely... "may"!? The moment you have an overflow, the calculations are inaccurate, period.

@znakeeye - Regardless of the possible merit of the idea - and @MgSam is right that I've been speaking out of ignorance - it seems that you're trying to optimize for a common case (because that's what one does). That is, that exceptions are commonly thrown and that your data runs are ending in error. If a significant enough portion of the data moving through a system I maintained were ending in error, I'd be trying to stop the errors, not make the errors more efficient. And though this data is old, and there are potential problems with the methodology, exceptions aren't necessarily that slow in the first place.

I agree that being able to tell whether a subsequent calculation will overflow is difficult. But getting there requires values far outside the normal range for financial calculations. From your description so far, it seems unlikely that you should be anywhere close to them. What specifically causes the "hundreds of calculations" to get close to the limits? If exceptions are common, it feels like some individual calculation may not be properly spec'd. (Hopefully you have a way to correlate the exception with the calculation chain.)

@MgSam -

As a trivial example, take your example of $75.59 trillion and represent that in Zimbabwe dollars. (Several years ago 100 trillion Zimbabwe dollars were worth $.40 US cents). Try using that exchange rate with decimal and see what happens.

...An exception seems entirely appropriate? Either the system is spec'd to handle numbers greater than the range of decimal (via some equivalent of Java's BigDecimal), or you're into an error flow and performance isn't as big a concern.

I work with an application where users define formulas themselves. They can write literally whatever they want, and we then need to run said calculations in a high performance scenario. All values are in decimal. Anything that we can do to prevent exceptions for the arbitrary formulas being processed is useful.

...it seems like you're focused on the wrong thing - preventing exceptions via a TryXXX flow. If your users are mostly trustworthy, this seems like an unlikely need: since an exception means their calculations ended in error, users will naturally work to stop the errors in the first place. If your users are mostly untrusted, there are far easier ways to DoS the system, if that's what you're worried about (like, are infinite loops possible?). The try/catch blocks themselves should not represent a significant performance cost, so it feels like you're trying to optimize away the threat of the exception. Which I would attempt to do by things like feeding in test cases, like all those online coding test sites you see.

pentp commented 5 years ago

If an overflow occurs, we would clamp values in either direction and likely maintain a state that calculations may be inaccurate.

It doesn't seem that useful to continue any calculations from that point on (the result is meaningless), but regardless, I'd propose something like this: if an overflow exception occurs, re-run the calculation through a slower code path that tracks input parameters (or just their sign) for each decimal add/sub/mul/div operation and then deduce the clamped value from the XOR of signs (either decimal.MinValue or decimal.MaxValue). Or even better - the slower code path could just switch to an alternative implementation (e.g., something based on BigInteger) and get some actual results.

This way you don't hurt your common case performance and can still handle the overflow cases as well. And this would work today on any .NET version.
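A sketch of that fallback idea, simplified to a single division (the method name is invented; a fuller version would, as suggested, re-run the whole calculation chain through a slower tracking path or a BigInteger-based implementation instead of clamping):

```csharp
using System;

public static class ClampedMath
{
    // Fast path: plain decimal division. Only when an overflow actually
    // occurs do we pay for the handler, clamping to MinValue/MaxValue with
    // the sign derived from the operands (XOR of the input signs).
    // Divide-by-zero is deliberately left to throw: per the discussion
    // above, that case is a programmer error.
    public static decimal DivideClamped(decimal numerator, decimal denominator)
    {
        try
        {
            return numerator / denominator;
        }
        catch (OverflowException)
        {
            bool negative = (numerator < 0) ^ (denominator < 0);
            return negative ? decimal.MinValue : decimal.MaxValue;
        }
    }
}
```

The common case stays at full speed; only the rare overflow pays for exception handling, which matches the "don't hurt standard-case performance" constraint raised earlier in the thread.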