Open CyrusNajmabadi opened 2 years ago
Tagging subscribers to this area: @dotnet/area-system-numerics See info in area-owners.md if you want to be subscribed.
| | |
|---|---|
| Author | CyrusNajmabadi |
| Assignees | tannergooding |
| Labels | `area-System.Numerics`, `untriaged` |
| Milestone | - |
While it's still fresh in my head: we internally implement this basically as a wrapper over `fmod`, which the C spec defines as:

> The `fmod` functions return the value x − ny, for some integer n such that, if y is nonzero, the result has the same sign as x and magnitude less than the magnitude of y. If y is zero, whether a domain error occurs or the `fmod` functions return zero is implementation-defined.
It additionally calls out that this is functionally similar to:

```csharp
y = Math.Abs(y);
var result = Math.IEEERemainder(Math.Abs(x), y);
if (double.IsNegative(result))
{
    result += y;
}
return Math.CopySign(result, x);
```
If you use that algorithm for `1.0 % 0.1` you do indeed get `0.09999999999999995`.
Edit: The below analysis is potentially incorrect and may be an error on my part. Namely, the C# spec is ambiguous as to whether the following definition implies that `floor(x / y)` is a fused operation or not:

> In the table, x and y are positive finite values. z is the result of x % y and is computed as x – n * y, where n is the largest possible integer that is less than or equal to x / y

I've added an additional note elaborating on this further down.
This is ultimately a bug in our implementation, since we define `x % y` to effectively be similar to:

```csharp
var n = Math.Truncate(x / y);
return x - n * y;
```
We can't simply use the above algorithm, as it may introduce rounding issues: `x - n * y` is two operations, each rounded separately. However, we could implement it as the following:

```csharp
var n = Math.Truncate(x / y);
return Math.FusedMultiplyAdd(-n, y, x);
```
On ARM64, and on x86/x64 from ~2013 onward, this will be effectively three instructions. For x86/x64 this is (noting we should update `Math.Truncate` to be intrinsic as part of this):

```asm
vdivsd       xmm0, xmm1, xmm2
vroundsd     xmm0, xmm0, xmm0, 3
vfnmadd213sd xmm0, xmm2, xmm1
```
For ARM32 we don't currently accelerate `FusedMultiplyAdd`, but we could, and it's available wherever `AdvSimd` is supported (which is baseline for ARM64).
For older x86/x64, this would end up being a call to the CRT `fma` function, which is what `Math.FusedMultiplyAdd` falls back to. It's known to be IEEE-compliant and is likely (though I haven't measured yet) not significantly slower than the current `fmod` call, which results in an `fprem` loop. The non-accelerated software implementation is typically something like: https://github.com/ucb-bar/berkeley-softfloat-3/blob/master/source/s_mulAddF64.c. MSVCRT and libc each have their own implementations, but both are known to be IEEE-compliant and aren't terribly slow.
There is a potentially interesting corner case to consider with the Truncate/FMA algorithm above: if the result of `Math.Truncate(x / y)` is a whole integer greater than `2^53`, it may not be exactly representable (between `2^53` and `2^54` only even integers are representable, then only every fourth integer from `2^54` through `2^55`, and so on). If `x % y` is meant to be treated as a "single operation", then technically we'd need to calculate the fully-precise integer quotient and use that in the subsequent `x - n * y`, which would be significantly more complex to do (but it is detectable, and the normal path could be kept "small").
> If you use that algorithm for `1.0 % 0.1` you do indeed get `0.09999999999999995`.
Given that this is our behavior, should we then update our docs to call this out? It feels like, if this is what we do, we should just document it as-is; preserving what we do feels more important than changing things just to satisfy docs that people couldn't have fully depended on anyway.
.... especially since we - and probably just about every other language out there - are wrapping `fmod`...
> .... especially since we - and probably just about every other language out there - are wrapping fmod...
There are problems with wrapping `fmod`, including that the result is non-deterministic. C# is likely going to require this to be deterministic, and that means either rolling our own implementation of `fmod` or moving to the long-spec'd implementation. Either change is going to be "breaking" in some fashion (breaking either the spec or the implementation). The benefit of breaking the implementation is that it's already broken and non-deterministic.
@CyrusNajmabadi I've actually gone over this a bit more, and there was an issue in what I stated above.
Namely, the issue is that the C# spec has an ambiguity here. In particular it defines:

> In the table, x and y are positive finite values. z is the result of x % y and is computed as x – n * y, where n is the largest possible integer that is less than or equal to x / y
The ambiguity comes from how to interpret:

> where n is the largest possible integer that is less than or equal to x / y
In particular, consider `1.0 % 0.1` again. `1.0` is exactly representable as a `double`, while `0.1` is not; it is in fact `0.1000000000000000055511151231257827021181583404541015625`.
Now, if you do this using the infinitely precise represented values, then `1.0 / 0.1000000000000000055511151231257827021181583404541015625` is `9.999999999999999444888487687421760603063276150361782076232622354...`. For `double`, the nearest representable integer to this is `10`, and so `1.0 / 0.1 == 10`. Therefore, if C# intends this to be:

```csharp
var n = Math.Truncate(x / y);
return x - n * y;
```

then the correct answer is `0`. However, this potentially introduces rounding error and other issues.
Whereas, if it is intended that `x % y` matches the normal IEEE 754 requirements for operations, namely that `x op y` is done using the exact represented values, computed as if to infinite precision and unbounded range, and then rounded to the nearest representable result, then `x % y` effectively has to do a `FloorDiv(x, y)`, which would instead round the infinitely precise answer (`9.9999999999999994...`) down to `9`. `fma(9, -0.1, 1)` is then `0.0999999999999999500399638918679556809365749359130859375`, as expected and as returned by `fmod`.
We should likely determine what the intent of the C# spec is here, and I should finish doing the math to confirm that `fmod(x, y)` is equivalent to `fma(fdiv(x, y), -y, x)` (I believe it is). If the intent is that `%` is equivalent to `fmod`, updating the spec wording would likely be desirable and would avoid a breaking change.
Description
When given two `double`s `x` and `y` and the C# operation `x % y`, C# (Roslyn) will push the values to the stack and emit the `rem` instruction. C# defines this via the spec wording quoted earlier (`z` is computed as `x – n * y`, where `n` is the largest possible integer that is less than or equal to `x / y`), and the runtime's definition of `rem` is similar.
Talking to @tannergooding, we may want to audit this to ensure the implementation we have is either following this (and fix it if not), or to better document what's going on. For example, if `x = 1.0` and `y = 0.1`, then the instructions above allow for the following interpretation:

- `1.0 % 0.1` is computed as `1.0 - n * 0.1`
- the largest possible integer that is less than or equal to `1.0 / 0.1` is `10.0`, which can be validated with `Console.WriteLine(10.0 <= (1.0 / 0.1))`

As such, one interpretation is `1.0 - 10.0 * 0.1`, which itself evaluates to `0.0`. This does not match the current runtime result of `0.09999999999999995`, which likely means the algorithm is more like the `fmod`-based one described above. That can also be considered fine; however, if so, the docs should likely call this out so that the discrepancy can be better understood.
Reproduction Steps
```csharp
Console.WriteLine(1.0 % 0.1)
```
Expected behavior
Less clear. Based strictly on the provided docs, either `0` or `0.09999999999999995`. This ambiguity isn't great, though, and we should beef up the docs.
Actual behavior
`0.09999999999999995`. If desired, we should doc this more precisely.
Regression?
No response
Known Workarounds
No response
Configuration
No response
Other information
No response