dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.37k stars 4.75k forks source link

Provide support for alternative rounding modes for division and remainder of division. #93568

Open tannergooding opened 1 year ago

tannergooding commented 1 year ago

Background and Motivation

Most computer hardware implements division as a truncated division. Correspondingly, many languages implement their division (x / y) and remainder (x % y) support following this. However, some languages default to floored division and have their remainder be the counterpart to this instead. This ultimately has the same behavior as truncated when the signs of x and y match, but a different result when the signs do not. Some other languages still may use other types of rounding in their division operations which may include Euclidean, Ceiling, or Rounded.

There are ultimately benefits and drawbacks to each of the approaches and what is correct for one scenario may not be correct for all. Likewise, while computing any of them is fairly trivial, the most efficient implementation is often non-obvious. As such, it is proposed we expose a set of new APIs that allow developers to pick the one that is right for them.

Proposed API

namespace System.Numerics;

public enum DivisionRounding
{
    Truncate = 0,        // Towards Zero
    Floor = 1,           // Towards -Infinity
    Ceiling = 2,         // Towards +Infinity
    AwayFromZero = 3,    // Away from Zero
    Euclidean = 4,       // floor(x / abs(n)) * sign(n)
}

public partial interface IBinaryInteger<T>
{
    // Existing:
    static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right);

    // Proposed:
    static virtual TSelf Divide(TSelf left, TSelf right, DivisionRounding mode);
    static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right, DivisionRounding mode);
    static virtual TSelf Remainder(TSelf left, TSelf right, DivisionRounding mode);
}

-- These APIs would have DIMs and would be publicly implemented on the built-in types that implement IBinaryInteger, such as Int32, Int64, and Byte.

Additional Considerations

The following gives an overview of the most common variations of the remainder (or modulo) operation: https://en.wikipedia.org/wiki/Modulo#Variants_of_the_definition

The following gives some examples of what is used for various languages: https://en.wikipedia.org/wiki/Modulo#In_programming_languages

Rust is experimenting with its own similar APIs via the int_roundings feature, which expands on some of the existing support like rem_euclid: https://doc.rust-lang.org/std/primitive.i32.html#method.div_floor and https://github.com/rust-lang/rust/issues/88581

It may be desirable to also support this functionality for floating-point, but it can be quite a bit more expensive to support there.

The general terminology in this space is incredibly overloaded. Several languages use competing terminology, but a general consistency is that rem is often referring to remainder of truncated division and mod is referring to the remainder of floored division. We may want to consider or incorporate this into the naming to help users identify the one they want.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.

Issue Details
## Background and Motivation Most computer hardware implements division as a `truncated division`. Correspondingly, many languages implement their division (`x / y`) and remainder (`x % y`) support following this. However, some languages default to `floored division` and have their remainder be the counterpart to this instead. This ultimately has the same behavior as `truncated` when the signs of `x` and `y` match, but a different result when the signs do not. Some other languages still may use other types of rounding in their division operations which may include `Euclidean`, `Ceiling`, or `Rounded`. There are ultimately benefits and drawbacks to each of the approaches and what is correct for one scenario may not be correct for all. Likewise, while computing any of them is fairly trivial, the most efficient implementation is often non-obvious. As such, it is proposed we expose a set of new APIs that allow developers to pick the one that is right for them. ## Proposed API ```csharp namespace System.Numerics; public enum DivisionRounding { Truncate = 0, // Towards Zero Floor = 1, // Towards -Infinity Ceiling = 2, // Towards +Infinity AwayFromZero = 3, // Away from Zero Euclidean = 4, // floor(x / abs(n)) * sign(n) } public partial interface IBinaryInteger { // Existing: static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right); // Proposed: static virtual TSelf Divide(TSelf left, TSelf right, DivisionRounding mode); static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right, DivisionRounding mode); static virtual TSelf Remainder(TSelf left, TSelf right, DivisionRounding mode); } ``` -- These APIs would have DIMs and would be publicly implemented on the built-in types that implement `IBinaryInteger`, such as `Int32`, `Int64`, and `Byte`. ## Additional Considerations The following gives an overview of the most common variations of the remainder (or modulo) operation: https://en.wikipedia.org/wiki/Modulo#Variants_of_the_definition The following gives some examples of what is used for various languages: https://en.wikipedia.org/wiki/Modulo#In_programming_languages Rust is experimenting with its own similar APIs via the `int_roundings` feature, which expands on some of the existing support like `rem_euclid`: https://doc.rust-lang.org/std/primitive.i32.html#method.div_floor and https://github.com/rust-lang/rust/issues/88581 It may be desirable to also support this functionality for floating-point, but it can be quite a bit more expensive to support there. The general terminology in this space is incredibly overloaded. Several languages use competing terminology, but a general consistency is that `rem` is often referring to remainder of `truncated division` and `mod` is referring to the remainder of `floored division`. We may want to consider or incorporate this into the naming to help users identify the one they want.
Author: tannergooding
Assignees: -
Labels: `api-suggestion`, `area-System.Numerics`
Milestone: 9.0.0
tannergooding commented 1 year ago

This would be a replacement for https://github.com/dotnet/csharplang/issues/7599, which was rejected by the language in https://github.com/dotnet/csharplang/blob/main/meetings/2023/LDM-2023-10-16.md#new-operator--for-canonical-modulus-operations

MichalPetryka commented 1 year ago

Can #65408 be closed as a duplicate with this?

rickbrew commented 1 year ago

There was mention of floating-point support and whether/when it would be added. I for one would definitely make use of it!

hamarb123 commented 1 year ago

Ideally, this would be on INumber, since that's where IModulusOperators is provided. The issue is that it's not possible to implement as a static virtual for all INumbers with a default implementation (with proper rounding). It would be better if it was split into a separate interface, which was then implemented by default by IBinaryInteger, and IFloatingPoint (I would think it should be possible to implement correctly here, since it could be done similar to Math.IEEERemainder). Ideally all the integral types, floating point types, and decimal would implement this interface.

When the operation is needed, it could be used like so, without losing the ability to overload by other types which provide it (like floats):

static T SomeCalculation<T>(T a, T b) where T : INumber<T>, IRoundableDivisionFunctions<T>
{
    return T.Divide(T.CreateChecked(2) * a, b, DivisionRounding.Floor);
}

The changes should be updated like so imo:

namespace System.Numerics;

public enum DivisionRounding
{
    Truncate = 0,        // Towards Zero
    Floor = 1,           // Towards -Infinity
    Ceiling = 2,         // Towards +Infinity
    AwayFromZero = 3,    // Away from Zero
    Euclidean = 4,       // floor(x / abs(n)) * sign(n)
}

public partial interface IRoundableDivisionFunctions<TSelf> where TSelf : IRoundableDivisionFunctions<TSelf>
{
    static abstract TSelf Divide(TSelf left, TSelf right, DivisionRounding mode);
    static abstract TSelf Remainder(TSelf left, TSelf right, DivisionRounding mode);
    static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right, DivisionRounding mode);
}

public partial interface IBinaryInteger<TSelf>
    : IRoundableDivisionFunctions<TSelf> // added
{
    // Existing:
    static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right);

    // Proposed: (default implementations)
    static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
    static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
    static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}

public partial interface IFloatingPoint<TSelf>
    : IRoundableDivisionFunctions<TSelf> // added
{
    // Proposed: (default implementations)
    static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
    static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
    static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}

//and also provide a default implementation in IFloatingPointIeee754, IBinaryFloatingPointIeee754 - this allows us to optimise those cases in the future without breaking changes if we add them now
public partial interface IFloatingPointIeee754<TSelf>
    : IRoundableDivisionFunctions<TSelf> // added
{
    // Proposed: (default implementations)
    static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
    static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
    static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
public partial interface IBinaryFloatingPointIeee754<TSelf>
    : IRoundableDivisionFunctions<TSelf> // added
{
    // Proposed: (default implementations)
    static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
    static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
    static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}

We should implement & expose public APIs it on all the integral types: Byte, SByte, Int16, UInt16, Int32, UInt32, IntPtr, UIntPtr, Int64, UInt64, Int128, UInt128; all floating point types Half, Single, NFloat, Double; and Decimal and BigInteger. (Am I missing any others?) (It may actually be possible to implement for Complex also, since the operation would be clear with the DivisionRounding mode set)

Also, should this be called just Divide, that may be a bit unclear for floats, but the DivisionRounding parameter may make it obvious enough?

Also, should we have IEEERounding as a division rounding mode? I don't really see why not, assuming it's useful to some people.

tannergooding commented 1 year ago

Ideally, this would be on INumber, since that's where IModulusOperators is provided.

Providing DivRem on general numbers is difficult due to how it would have to behave on floating-point types. That is, it doesn't perform the normal division operator, but rather a rounded division operation. It then can't provide a DIM implementation, as indicated.

It would be better if it was split into a separate interface

Providing a separate interface here is likewise a bit problematic. It likely couldn't provided DivRem due to the issue mentioned previously with floating-point. Rather, it could only provide Divide and Remainder operations.

The DIMs on floating-point are themselves somewhat problematic due to double rounding that can occur (leading to incorrect results) and the implication of what a rounding mode on a division operation means in that context. -- Rounding on floating-point operations may be believed to change it from "to nearest, ties to even" to the indicated mode instead and not to be returning an integral result rounded in that direction.

I think this is one of the cases where providing a uniform API surface between integral and floating-point isn't possible and that floating-point needs to be considered separately.

tannergooding commented 1 year ago

I'm marking this as ready-for-review as the floating-point questions and even whether an interface exists won't be blocked by this. We can minimally discuss it as part of API review and we can separately continue the discussion on how to appropriately expose this functionality for floating-point in a way that meaningfully works.

AaronRobinsonMSFT commented 1 year ago

Doesn't using an enumeration make this less trim friendly? As opposed to something akin to DivRem_Truncate.

vitek-karas commented 1 year ago

+1 to Aarons comment. Ultimately the implementation of the DivRem will have a switch somewhere in it, and implement 5 different ways based on the enum value. But if I only use DivRem(.., .., Truncate) in my app, the trimmer will still have to keep all the other 4 algorithms as well, since it can't figure out how to remove cases from that switch. Given that it seems to be expected that some of the division modes are rarely used, ideally we would design the API in such a way those could be trimmed away even if the app does use the DivRem on one of the rounding modes.

This can be achieved by either a new method for each mode, or a new interface for each mode.

tannergooding commented 1 year ago

Doesn't using an enumeration make this less trim friendly?

It should realistically depend on how it's implemented.

At the end of the day, .NET uses enums for many reasons/scenarios, including cases like this. We should not be forcing our APIs into less flexible or less ideal solutions to accommodate the tooling and should rather be updating the tooling to reasonably accommodate regular .NET APIs.

Notably, in this case, coming up with n different public API names, that are logically consistent and make sense is difficult. We don't use underscores and won't start here. It also makes it incredibly difficult for users who need to pass down dynamic modes, support source generation, and other considerations to make the right thing happen.

There may still be a good middle ground and we might find n different public names that fit the bill here. But, the trimmer should itself really be able to do "basic things" like see DivRem(x, y, DivisionRounding.Name) and then see the implementation is switch(mode) { case Name: InternalMethodFoo(x, y); } and trim it down to calling InternalMethodFoo(x, y) directly. So that it isn't negatively impacting size in the common case.

AaronRobinsonMSFT commented 1 year ago

At the end of the day, .NET uses enums for many reasons/scenarios, including cases like this.

This is true for heavy use and/or common APIs. The proposed API is arguably not going to be common or see a lot of heavy use. and will not have anywhere near the visibility of others where the priority is UX.

We should not be forcing our APIs into less flexible or less ideal solutions to accommodate the tooling and should rather be updating the tooling to reasonably accommodate regular .NET APIs.

This happens a lot, especially in lower level or uncommon APIs. The numerics works has introduced a lot of added pressure on trimmer scenarios and where possible it seems prudent to understand the trades off and have the discussion.

tannergooding commented 1 year ago

Yes, and I'm not saying we shouldn't have that discussion.

I'm simply saying that the overall view of "what can the trimmer currently handle" should not be the main impact of how we design APIs. If a reasonably smart trimmer could handle the scenario and that scenario is itself reasonably common in existing .NET code; then we should ensure a relevant issue exists to track that being handled so that it gets handled. We should then do what the best thing for the API is.

We should only be taking limitations of the tooling into account for scenarios that can never be reasonably handled or which go against our existing common design practices. I would, in no way, view simplistic inlining and dead code elimination based on a constant input to be "unreasonable", particularly when the main job of a trimmer is in fact "dead code elimination".

The proposed API is arguably not going to be common or see a lot of heavy use. and will not have anywhere near the visibility of others where the priority is UX.

That's essentially impossible to determine until after the API has shipped. This feature has, for several years now, been one of the more requested scenarios in dotnet/csharplang and which has garnered repeated interest in dotnet/runtime as well.

Across the various issues that have led up to this proposal, there has been a significantly higher than average number of upvotes and interest, showing that it is clearly desired and wanted by end users; particularly remainder of floored division (commonly called mod).

There will be a non-trivial number of users that adopt this as soon as its available. There will then be many other places that are able to simplify their existing hand coded algorithms (in the BCL, in ML.NET, and in other repos in the broader community) where the other common division/remainder operations are also used.

terrajobst commented 1 year ago

Video

namespace System.Numerics;

public enum DivisionRounding
{
    Truncate = 0,
    Floor = 1,
    Ceiling = 2,
    AwayFromZero = 3,
    Euclidean = 4,
}

public partial interface IBinaryInteger<T>
{
    // Existing:
    // static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right);

    static virtual TSelf Divide(TSelf left, TSelf right, DivisionRounding mode);
    static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right, DivisionRounding mode);
    static virtual TSelf Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
tannergooding commented 1 year ago

Noting that we did discuss the point on trimming support and the pros/cons.

The overall minimal size of the APIs in question, the cost of exposing 18 new public methods per primitive type, and that we have prior art in the Math APIs for taking rounding mode as an enum won out.