Open CyrusNajmabadi opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.
| | |
|---|---|
| Author | CyrusNajmabadi |
| Assignees | tannergooding |
| Labels | `area-System.Numerics`, `untriaged` |
| Milestone | - |
While it's still fresh in my head: we internally implement this basically as a wrapper over `fmod`, which the C spec defines as:

> The `fmod` functions return the value x − ny, for some integer n such that, if y is nonzero, the result has the same sign as x and magnitude less than the magnitude of y. If y is zero, whether a domain error occurs or the `fmod` functions return zero is implementation-defined.
It additionally calls out that this is functionally similar to:

```csharp
y = Math.Abs(y);
var result = Math.IEEERemainder(Math.Abs(x), y);
if (double.IsNegative(result))
{
    result += y;
}
return Math.CopySign(result, x);
```
If you use that algorithm for `1.0 % 0.1` you do indeed get `0.09999999999999995`.
Edit: The below analysis is potentially incorrect and may be an error on my part. Namely, the C# spec is ambiguous as to whether the following definition implies that `floor(x / y)` is a fused operation or not:

> In the table, x and y are positive finite values. z is the result of x % y and is computed as x – n * y, where n is the largest possible integer that is less than or equal to x / y

I've added an additional note elaborating on this further down.
This is ultimately a bug in our implementation, since we define `x % y` to effectively be similar to:

```csharp
var n = Math.Truncate(x / y);
return x - n * y;
```
We can't simply use the above algorithm, as it may introduce rounding issues: `x - n * y` is two operations, each rounded separately. However, we could implement it as the following:

```csharp
var n = Math.Truncate(x / y);
return Math.FusedMultiplyAdd(-n, y, x);
```
On ARM64, and on x86/x64 from ~2013 onward, this will be effectively three instructions. For x86/x64 this is (noting we should update `Math.Truncate` to be intrinsic as part of this):

```asm
vdivsd       xmm0, xmm1, xmm2
vroundsd     xmm0, xmm0, xmm0, 3
vfnmadd213sd xmm0, xmm2, xmm1
```
For ARM32 we don't currently accelerate `FusedMultiplyAdd`, but we could, and it's available wherever `AdvSimd` is supported (which is baseline for ARM64).
For older x86/x64, this would end up being a call to the CRT `fma` function, which is what `Math.FusedMultiplyAdd` falls back to. It's known to be IEEE-compliant and is likely (though I haven't measured yet) not significantly slower than the current `fmod` call, which results in an `fprem` loop. The non-accelerated software implementation is typically something like: https://github.com/ucb-bar/berkeley-softfloat-3/blob/master/source/s_mulAddF64.c. MSVCRT and libc each have their own implementations, but both are known to be IEEE-compliant and aren't terribly slow.
There is a potentially interesting corner case to consider with the Truncate/FMA algorithm above: if the result of `Math.Truncate(x / y)` is a whole integer greater than `2^53`, it may not be exactly representable (between `2^53` and `2^54` only even integers are representable, then only every fourth integer from `2^54` through `2^55`, and so on). If `x % y` is meant to be treated as a "single operation", then technically we'd need to calculate the fully-precise integer quotient and use that in the subsequent `x - n * y`, which would be significantly more complex to do (but it is detectable, and the normal path could be kept "small").
> If you use that algorithm for `1.0 % 0.1` you do indeed get `0.09999999999999995`.
Given that this is our behavior, should we then update our docs to call this out? It feels like, if this is what we do, we should just document it as-is; preserving what we do feels more important than changing things just to satisfy docs that people couldn't have fully depended on anyway.
.... especially since we - and probably just about every other language out there - are wrapping `fmod`...
> .... especially since we - and probably just about every other language out there - are wrapping fmod...
There are problems with wrapping `fmod`, including that the result is non-deterministic. C# is likely going to require this to be deterministic, and that means either rolling our own implementation of `fmod` or moving to the long-spec'd implementation. Either change is going to be "breaking" in some fashion (breaking either the spec or the implementation). The benefit of breaking the implementation is that it's already broken and non-deterministic.
@CyrusNajmabadi I've actually gone over this a bit more, and there was an issue in what I stated above.
Namely, the issue is that the C# spec has an ambiguity here. In particular it defines:

> In the table, x and y are positive finite values. z is the result of x % y and is computed as x – n * y, where n is the largest possible integer that is less than or equal to x / y
The ambiguity comes from how to interpret:

> where n is the largest possible integer that is less than or equal to x / y
In particular, consider `1.0 % 0.1` again. `1.0` is exactly representable as a `double`, while `0.1` is not; it is in fact `0.1000000000000000055511151231257827021181583404541015625`.
Now, if you do this using the infinitely precise represented values, then `1.0 / 0.1000000000000000055511151231257827021181583404541015625` is `9.999999999999999444888487687421760603063276150361782076232622354...`. For `double`, the nearest representable integer to this is `10`, and so `1.0 / 0.1 == 10`. Therefore, if C# intends this to be:

```csharp
var n = Math.Truncate(x / y);
return x - n * y;
```

then the correct answer is `0`. However, this potentially introduces rounding error and other issues.
Whereas, if it is intended that `x % y` matches the normal IEEE 754 requirements for operations, namely that `x op y` is done using the exact represented values, computed as if to infinite precision and unbounded range, and then rounded to the nearest representable result, then `x % y` effectively has to do a `FloorDiv(x, y)`, which would instead round the infinitely precise answer (`9.9999999999999994...`) down to `9`. `fma(9, -0.1, 1)` is then `0.0999999999999999500399638918679556809365749359130859375`, as expected and as returned by `fmod`.
We should likely determine what the intent of the C# spec is here, and I should finish doing the math to confirm that `fmod(x, y)` is equivalent to `fma(fdiv(x, y), -y, x)` (I believe it is). If the intent is that `%` is equivalent to `fmod`, updating the spec wording would likely be desirable and would avoid a breaking change.
Description
When given two `double`s `x` and `y` and the C# operation `x % y`, C# (Roslyn) will push the values to the stack and emit the `rem` instruction. C# defines this via the spec wording quoted earlier (`z` is computed as `x – n * y`, where `n` is the largest possible integer that is less than or equal to `x / y`), and the runtime's definition of `rem` is similar.
Talking to @tannergooding, we may want to audit this to ensure the implementation we have is either following this (and fix it if not), or to better document what's going on. For example, if `x = 1.0` and `y = 0.1`, then the instructions above allow for the following interpretation:

- `1.0 % 0.1` is computed as `1.0 - n * 0.1`
- the largest possible integer that is less than or equal to `1.0 / 0.1` is `10.0`, which can be validated with `Console.WriteLine(10.0 <= (1.0 / 0.1))`

As such, one interpretation is `1.0 - 10.0 * 0.1`, which itself evaluates to `0.0`. This does not match the current runtime result of `0.09999999999999995`, which likely means the algorithm is more like the `fmod`-based one described above. That can also be considered fine; however, if so, the docs should likely call this out so that the discrepancy can be better understood.
Reproduction Steps
```csharp
Console.WriteLine(1.0 % 0.1)
```
Expected behavior
Less clear. Based strictly on the provided docs, either `0` or `0.09999999999999995`. This ambiguity isn't great, though, and we should beef up the docs.
Actual behavior
`0.09999999999999995`. If desired, we should doc this more precisely.
Regression?
No response
Known Workarounds
No response
Configuration
No response
Other information
No response