data-apis / array-api

RFC document, tooling and other content related to the array API standard
https://data-apis.github.io/array-api/latest/
MIT License

The `round` function specification doesn't say what should happen to values exactly halfway between integers #726

Closed jpivarski closed 6 months ago

jpivarski commented 6 months ago

Here is the current draft of the `round` function:

https://github.com/data-apis/array-api/blob/5c2423a96c9fd5113e405df88f2422f14cd978b9/src/array_api_stubs/_draft/elementwise_functions.py#L2191-L2231

It doesn't say whether numbers like 1.5 and 2.5 should be rounded to 1.0 and 2.0 or 2.0 and 3.0. Sometimes, the rule is to always round up (toward positive infinity? away from zero?), but that can bias distributions to higher values. Sometimes, the rule depends on the evenness or oddness of the whole part, which would eliminate this bias if the distribution is not correlated with evenness/oddness (e.g. if it's broad/smooth on the scale of 1 unit).

In particular, NumPy uses the evenness/oddness rule, in both its traditional API and its experimental implementation of the Array API:

>>> import numpy as np
>>> import numpy.array_api as xp
<stdin>:1: UserWarning: The numpy.array_api submodule is still experimental. See NEP 47.

>>> np.round(np.asarray([-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]))
array([-4., -2., -2., -0.,  0.,  2.,  2.,  4.])

>>> xp.round(xp.asarray([-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5]))
Array([-4., -2., -2., -0.,  0.,  2.,  2.,  4.], dtype=float64)

In my opinion, `round` should be specified to do what NumPy does: use the evenness/oddness rule. This could be a "note" box, similar to the box describing what happens for complex numbers.
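The bias argument above can be checked numerically. This is only a minimal sketch: `round_half_up` is a hypothetical helper written for this illustration (it is not part of NumPy or the array API standard), and the inputs are restricted to exact halves, where `floor(x + 0.5)` is reliable.

```python
import numpy as np


def round_half_up(x):
    """Hypothetical helper: send exact ties toward positive infinity."""
    return np.floor(x + 0.5)


# Exact halves 0.5, 1.5, ..., 99.5: an equal mix of even and odd whole parts.
ties = np.arange(100) + 0.5

# Mean signed error introduced by each tie-breaking rule.
bias_half_up = np.mean(round_half_up(ties) - ties)
bias_half_even = np.mean(np.round(ties) - ties)

print("half-up bias:  ", bias_half_up)
print("half-even bias:", bias_half_even)
```

With this input, every tie moves up by 0.5 under the half-up rule, while under the even rule the upward and downward moves alternate and cancel.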

kgryte commented 6 months ago

@jpivarski Thanks for filing this issue. If I am reading the specification correctly, the specification does, in fact, indicate what should happen in the event of ties. Namely, in the special cases section for floating-point operands, the specification states:

> If two integers are equally close to x_i, the result is the even integer closest to x_i.

Hence, for 1.5 and 2.5, in both cases, the answer would be 2.0.

This behavior matches NumPy:

In [3]: import numpy as np

In [4]: np.round(1.5)
Out[4]: 2.0

In [5]: np.round(2.5)
Out[5]: 2.0
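For readers who want to see the tie rule spelled out, here is a rough sketch of it in terms of `floor`. The name `round_half_even` is hypothetical and only for illustration; it is not how NumPy or any conforming library implements `round`, and it assumes finite float64 inputs.

```python
import numpy as np


def round_half_even(x):
    """Hypothetical sketch: nearest integer, exact ties go to the even one."""
    x = np.asarray(x, dtype=np.float64)
    lower = np.floor(x)      # nearest integer at or below x
    upper = lower + 1.0      # next integer above
    frac = x - lower         # fractional part, in [0, 1)
    lower_is_even = np.mod(lower, 2.0) == 0.0
    # On an exact tie, pick whichever neighbour is even.
    tie_choice = np.where(lower_is_even, lower, upper)
    return np.where(frac < 0.5, lower, np.where(frac > 0.5, upper, tie_choice))


values = np.asarray([-3.5, -2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5])
print(round_half_even(values))
print(np.array_equal(round_half_even(values), np.round(values)))
```

One difference from `np.round`: this sketch returns `0.0` rather than `-0.0` for `-0.5`. The two compare equal, so the `array_equal` check does not notice it.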

jpivarski commented 6 months ago

Ah! I missed that under "special cases."

https://github.com/data-apis/array-api/blob/5c2423a96c9fd5113e405df88f2422f14cd978b9/src/array_api_stubs/_draft/elementwise_functions.py#L2227

My request here was to get a line like that added, but now I see that it's already there, so I'm closing the issue. Thanks for pointing it out!