Open tannergooding opened 1 year ago
Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.
Author: | tannergooding |
---|---|
Assignees: | - |
Labels: | `api-suggestion`, `area-System.Numerics` |
Milestone: | 9.0.0 |
This would be a replacement for https://github.com/dotnet/csharplang/issues/7599, which was rejected by the language in https://github.com/dotnet/csharplang/blob/main/meetings/2023/LDM-2023-10-16.md#new-operator--for-canonical-modulus-operations
Can #65408 be closed as a duplicate with this?
There was mention of floating-point support and whether/when it would be added. I for one would definitely make use of it!
Ideally, this would be on INumber
, since that's where IModulusOperators
is provided. The issue is that it's not possible to implement as a static virtual
for all INumbers
with a default implementation (with proper rounding). It would be better if it was split into a separate interface, which was then implemented by default by IBinaryInteger
, and IFloatingPoint
(I would think it should be possible to implement correctly here, since it could be done similar to Math.IEEERemainder
). Ideally all the integral types, floating point types, and decimal
would implement this interface.
When the operation is needed, it could be used like so, without losing the ability to overload by other types which provide it (like floats):
static T SomeCalculation<T>(T a, T b) where T : INumber<T>, IRoundableDivisionFunctions<T>
{
return T.Divide(T.CreateChecked(2) * a, b, DivisionRounding.Floor);
}
The changes should be updated like so imo:
namespace System.Numerics;
public enum DivisionRounding
{
Truncate = 0, // Towards Zero
Floor = 1, // Towards -Infinity
Ceiling = 2, // Towards +Infinity
AwayFromZero = 3, // Away from Zero
Euclidean = 4, // floor(x / abs(n)) * sign(n)
}
public partial interface IRoundableDivisionFunctions<TSelf> where TSelf : IRoundableDivisionFunctions<TSelf>
{
static abstract TSelf Divide(TSelf left, TSelf right, DivisionRounding mode);
static abstract TSelf Remainder(TSelf left, TSelf right, DivisionRounding mode);
static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right, DivisionRounding mode);
}
public partial interface IBinaryInteger<TSelf>
: IRoundableDivisionFunctions<TSelf> // added
{
// Existing:
static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right);
// Proposed: (default implementations)
static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
public partial interface IFloatingPoint<TSelf>
: IRoundableDivisionFunctions<TSelf> // added
{
// Proposed: (default implementations)
static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
//and also provide a default implementation in IFloatingPointIeee754, IBinaryFloatingPointIeee754 - this allows us to optimise those cases in the future without breaking changes if we add them now
public partial interface IFloatingPointIeee754<TSelf>
: IRoundableDivisionFunctions<TSelf> // added
{
// Proposed: (default implementations)
static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
public partial interface IBinaryFloatingPointIeee754<TSelf>
: IRoundableDivisionFunctions<TSelf> // added
{
// Proposed: (default implementations)
static TSelf IRoundableDivisionFunctions<TSelf>.Divide(TSelf left, TSelf right, DivisionRounding mode);
static (TSelf Quotient, TSelf Remainder) IRoundableDivisionFunctions<TSelf>.DivRem(TSelf left, TSelf right, DivisionRounding mode);
static TSelf IRoundableDivisionFunctions<TSelf>.Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
We should implement & expose public APIs it on all the integral types: Byte
, SByte
, Int16
, UInt16
, Int32
, UInt32
, IntPtr
, UIntPtr
, Int64
, UInt64
, Int128
, UInt128
; all floating point types Half
, Single
, NFloat
, Double
; and Decimal
and BigInteger
. (Am I missing any others?) (It may actually be possible to implement for Complex
also, since the operation would be clear with the DivisionRounding
mode set)
Also, should this be called just Divide
, that may be a bit unclear for floats, but the DivisionRounding
parameter may make it obvious enough?
Also, should we have IEEERounding as a division rounding mode? I don't really see why not, assuming it's useful to some people.
Ideally, this would be on INumber, since that's where IModulusOperators is provided.
Providing DivRem
on general numbers is difficult due to how it would have to behave on floating-point types. That is, it doesn't perform the normal division operator, but rather a rounded division operation. It then can't provide a DIM implementation, as indicated.
It would be better if it was split into a separate interface
Providing a separate interface here is likewise a bit problematic. It likely couldn't provided DivRem
due to the issue mentioned previously with floating-point. Rather, it could only provide Divide
and Remainder
operations.
The DIMs on floating-point are themselves somewhat problematic due to double rounding that can occur (leading to incorrect results) and the implication of what a rounding mode on a division operation means in that context. -- Rounding on floating-point operations may be believed to change it from "to nearest, ties to even" to the indicated mode instead and not to be returning an integral result rounded in that direction.
I think this is one of the cases where providing a uniform API surface between integral and floating-point isn't possible and that floating-point needs to be considered separately.
I'm marking this as ready-for-review
as the floating-point questions and even whether an interface exists won't be blocked by this. We can minimally discuss it as part of API review and we can separately continue the discussion on how to appropriately expose this functionality for floating-point in a way that meaningfully works.
Doesn't using an enumeration make this less trim friendly? As opposed to something akin to DivRem_Truncate
.
+1 to Aarons comment. Ultimately the implementation of the DivRem
will have a switch somewhere in it, and implement 5 different ways based on the enum value. But if I only use DivRem(.., .., Truncate)
in my app, the trimmer will still have to keep all the other 4 algorithms as well, since it can't figure out how to remove cases from that switch. Given that it seems to be expected that some of the division modes are rarely used, ideally we would design the API in such a way those could be trimmed away even if the app does use the DivRem
on one of the rounding modes.
This can be achieved by either a new method for each mode, or a new interface for each mode.
Doesn't using an enumeration make this less trim friendly?
It should realistically depend on how it's implemented.
At the end of the day, .NET uses enums for many reasons/scenarios, including cases like this. We should not be forcing our APIs into less flexible or less ideal solutions to accommodate the tooling and should rather be updating the tooling to reasonably accommodate regular .NET APIs.
Notably, in this case, coming up with n
different public API names, that are logically consistent and make sense is difficult. We don't use underscores and won't start here. It also makes it incredibly difficult for users who need to pass down dynamic modes, support source generation, and other considerations to make the right thing happen.
There may still be a good middle ground and we might find n
different public names that fit the bill here. But, the trimmer should itself really be able to do "basic things" like see DivRem(x, y, DivisionRounding.Name)
and then see the implementation is switch(mode) { case Name: InternalMethodFoo(x, y); }
and trim it down to calling InternalMethodFoo(x, y)
directly. So that it isn't negatively impacting size in the common case.
At the end of the day, .NET uses enums for many reasons/scenarios, including cases like this.
This is true for heavy use and/or common APIs. The proposed API is arguably not going to be common or see a lot of heavy use. and will not have anywhere near the visibility of others where the priority is UX.
We should not be forcing our APIs into less flexible or less ideal solutions to accommodate the tooling and should rather be updating the tooling to reasonably accommodate regular .NET APIs.
This happens a lot, especially in lower level or uncommon APIs. The numerics works has introduced a lot of added pressure on trimmer scenarios and where possible it seems prudent to understand the trades off and have the discussion.
Yes, and I'm not saying we shouldn't have that discussion.
I'm simply saying that the overall view of "what can the trimmer currently handle" should not be the main impact of how we design APIs. If a reasonably smart trimmer could handle the scenario and that scenario is itself reasonably common in existing .NET code; then we should ensure a relevant issue exists to track that being handled so that it gets handled. We should then do what the best thing for the API is.
We should only be taking limitations of the tooling into account for scenarios that can never be reasonably handled or which go against our existing common design practices. I would, in no way, view simplistic inlining and dead code elimination based on a constant input to be "unreasonable", particularly when the main job of a trimmer is in fact "dead code elimination".
The proposed API is arguably not going to be common or see a lot of heavy use. and will not have anywhere near the visibility of others where the priority is UX.
That's essentially impossible to determine until after the API has shipped. This feature has, for several years now, been one of the more requested scenarios in dotnet/csharplang and which has garnered repeated interest in dotnet/runtime as well.
Across the various issues that have led up to this proposal, there has been a significantly higher than average number of upvotes and interest, showing that it is clearly desired and wanted by end users; particularly remainder of floored division
(commonly called mod
).
There will be a non-trivial number of users that adopt this as soon as its available. There will then be many other places that are able to simplify their existing hand coded algorithms (in the BCL, in ML.NET, and in other repos in the broader community) where the other common division/remainder operations are also used.
int
namespace System.Numerics;
public enum DivisionRounding
{
Truncate = 0,
Floor = 1,
Ceiling = 2,
AwayFromZero = 3,
Euclidean = 4,
}
public partial interface IBinaryInteger<T>
{
// Existing:
// static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right);
static virtual TSelf Divide(TSelf left, TSelf right, DivisionRounding mode);
static virtual (TSelf Quotient, TSelf Remainder) DivRem(TSelf left, TSelf right, DivisionRounding mode);
static virtual TSelf Remainder(TSelf left, TSelf right, DivisionRounding mode);
}
Noting that we did discuss the point on trimming support and the pros/cons.
The overall minimal size of the APIs in question, the cost of exposing 18 new public methods per primitive type, and that we have prior art in the Math APIs for taking rounding mode as an enum won out.
Background and Motivation
Most computer hardware implements division as a
truncated division
. Correspondingly, many languages implement their division (x / y
) and remainder (x % y
) support following this. However, some languages default tofloored division
and have their remainder be the counterpart to this instead. This ultimately has the same behavior astruncated
when the signs ofx
andy
match, but a different result when the signs do not. Some other languages still may use other types of rounding in their division operations which may includeEuclidean
,Ceiling
, orRounded
.There are ultimately benefits and drawbacks to each of the approaches and what is correct for one scenario may not be correct for all. Likewise, while computing any of them is fairly trivial, the most efficient implementation is often non-obvious. As such, it is proposed we expose a set of new APIs that allow developers to pick the one that is right for them.
Proposed API
-- These APIs would have DIMs and would be publicly implemented on the built-in types that implement
IBinaryInteger
, such asInt32
,Int64
, andByte
.Additional Considerations
The following gives an overview of the most common variations of the remainder (or modulo) operation: https://en.wikipedia.org/wiki/Modulo#Variants_of_the_definition
The following gives some examples of what is used for various languages: https://en.wikipedia.org/wiki/Modulo#In_programming_languages
Rust is experimenting with its own similar APIs via the
int_roundings
feature, which expands on some of the existing support likerem_euclid
: https://doc.rust-lang.org/std/primitive.i32.html#method.div_floor and https://github.com/rust-lang/rust/issues/88581It may be desirable to also support this functionality for floating-point, but it can be quite a bit more expensive to support there.
The general terminology in this space is incredibly overloaded. Several languages use competing terminology, but a general consistency is that
rem
is often referring to remainder oftruncated division
andmod
is referring to the remainder offloored division
. We may want to consider or incorporate this into the naming to help users identify the one they want.