Closed: kgryte closed this 5 months ago
I've updated this PR based on feedback from the 30 November 2023 workgroup meeting. Namely,

- renamed `clamp` to `clip`. Given the widespread usage of this API, it was considered undesirable to rename and deprecate `clip` in NumPy et al. Even though this PR introduces behavior which differs from NumPy (e.g., kwargs, output data type, etc.), it was considered better if NumPy simply moved to adopt the specified behavior.
- specified that the output data type must match that of `x` and not be the result of type promotion. An argument could be made either way; however, as discussed during the workgroup meeting, user expectation is most likely to be that the output data type matches `x`, and the specification does not have precedent (TMK) for array kwargs influencing the output data type. As such, the specification guidance was updated accordingly. Note, however, that this differs from current behavior in, e.g., NumPy.

@rgommers I've updated `min` and `max` to be both positional and keyword arguments, and I added a note stating that arguments having different data type kinds will result in implementation-dependent behavior.
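For illustration, here's a minimal pure-Python sketch of the updated calling convention (the `clip` helper below is a hypothetical stand-in for the specified API, not any library's implementation):

```python
def clip(x, min=None, max=None):
    # Hypothetical stand-in for the specified API: `min` and `max`
    # accept both positional and keyword usage.
    def _clamp(v):
        if min is not None and v < min:
            return min
        if max is not None and v > max:
            return max
        return v
    return [_clamp(v) for v in x]

# Both calling conventions are valid under the updated signature:
assert clip([1, 5, 9], 2, 8) == [2, 5, 8]
assert clip([1, 5, 9], min=2, max=8) == [2, 5, 8]
```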
As this PR has received the OK and has not received any further comments, I will go ahead and merge. Thanks all!
This PR

- adds `clip` to the Array API specification for clamping each element of an input array to a specified range.
- specifies that the output array must have the same data type as the input array `x`. One could argue that type promotion rules should apply to (`x`, `min`, `max`). This would be in the spirit of earlier efforts to ensure that type promotion rules are applied consistently throughout the specification. However, in contrast to https://github.com/data-apis/array-api/pull/201, `min` and `max` are kwargs, and we do not have, TMK, any precedent for array kwargs influencing the data type of the output array. Furthermore, when clamping, users are more likely to want an output array having the same data type as the input array (this was also raised on the NumPy issue tracker: https://github.com/numpy/numpy/issues/24976).
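To make the data type guidance concrete, here is a hedged sketch (the `spec_clip` helper is hypothetical and uses NumPy only as a convenient container; it is not the NumPy implementation):

```python
import numpy as np

def spec_clip(x, min=None, max=None):
    # Hypothetical helper illustrating the dtype rule: the result is
    # cast back to x's dtype, so min/max never influence the output
    # data type (in contrast to promotion-based behavior).
    out = np.asarray(x)
    if min is not None:
        out = np.where(out < min, min, out)
    if max is not None:
        out = np.where(out > max, max, out)
    return out.astype(x.dtype)

x = np.array([1, 5, 9], dtype=np.int32)
res = spec_clip(x, min=2, max=8)
assert res.dtype == np.int32   # same data type as x, not a promoted one
assert res.tolist() == [2, 5, 8]
```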
- allows `x` to be broadcast, thus allowing the output array to have a rank greater than that of the input array. This differs from TensorFlow, which requires that the output array shape be the same as the input array shape. NumPy, however, supports such broadcasting behavior. Note that allowing `x` to be broadcast is somewhat at odds with not allowing type promotion: I argued that `min` and `max` should not affect the output data type, but, in allowing `x` to be broadcast, `min` and `max` do affect the output array shape. This is likely fine and consistent with the rest of the specification, where we have plenty of kwargs which affect the output array shape, although this would be the first, TMK, involving broadcasting.
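A sketch of the broadcasting behavior (again, `spec_clip` is a hypothetical stand-in that simply leans on NumPy broadcasting):

```python
import numpy as np

def spec_clip(x, min=None, max=None):
    # Hypothetical helper: relies on NumPy broadcasting, so the output
    # rank can exceed that of x when a bound has a higher rank.
    out = np.asarray(x)
    if min is not None:
        out = np.where(out < min, min, out)
    if max is not None:
        out = np.where(out > max, max, out)
    return out.astype(x.dtype)

x = np.array([0, 5, 10])          # shape (3,)
lower = np.array([[1], [2]])      # shape (2, 1)
res = spec_clip(x, min=lower)     # broadcasts to shape (2, 3)
assert res.shape == (2, 3)
assert res.tolist() == [[1, 5, 10], [2, 5, 10]]
```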
- specifies that, when `min > max`, behavior is unspecified. NumPy et al set output values to `max`; however, other implementations should be free to raise an exception or support alternative behavior.
- allows `min` and `max` to be optional. When both `min` and `max` are `None`, the function is essentially a no-op. This follows PyTorch, but differs from NumPy, which allows either `min` or `max` to be `None`, but not both at the same time.
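A sketch of the optional-bounds behavior (`spec_clip` remains a hypothetical stand-in, not an actual library function):

```python
import numpy as np

def spec_clip(x, min=None, max=None):
    # Hypothetical helper: both bounds are optional; when both are
    # None, the function is essentially a no-op, per this PR.
    out = np.asarray(x)
    if min is not None:
        out = np.where(out < min, min, out)
    if max is not None:
        out = np.where(out > max, max, out)
    return out.astype(x.dtype)

x = np.array([3, 1, 4])
assert np.array_equal(spec_clip(x), x)            # no bounds: identity
assert spec_clip(x, min=2).tolist() == [3, 2, 4]  # one-sided clamp is fine
```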
- leaves behavior unspecified when arguments have mixed data type kinds (e.g., when `x` is an integer data type and `min` or `max` is a floating-point data type), which is consistent with unspecified behavior elsewhere in the specification. TensorFlow raises an exception in such a scenario.
- makes `min` and `max` positional and keyword arguments.
- uses the name `clip`. TensorFlow uses the name `clip_by_value`. PyTorch also includes `clip`, but this aliases to `clamp`.
Note that this PR would introduce changes to existing `clip` functionality in NumPy et al. Namely, `min` and `max` are positional and keyword arguments, whereas, in NumPy, `a_min` and `a_max` are positional; this PR renames `a_min` and `a_max`; and, in contrast to NumPy, both `min` and `max` are allowed to be `None` at the same time.
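To summarize those differences, here is an illustrative sketch contrasting a legacy-style signature with the specified one (both helpers are hypothetical stand-ins, not the actual implementations):

```python
def spec_clip(x, min=None, max=None):
    # Specified style: renamed bounds, positional or keyword, both
    # optional (both None is a no-op). Illustrative stand-in only.
    out = list(x)
    if min is not None:
        out = [min if v < min else v for v in out]
    if max is not None:
        out = [max if v > max else v for v in out]
    return out

def legacy_clip(a, a_min, a_max):
    # Legacy NumPy-style signature described above: a_min/a_max are
    # positional, and at least one bound must be provided.
    if a_min is None and a_max is None:
        raise ValueError("one of a_min or a_max must be given")
    return spec_clip(a, a_min, a_max)

assert spec_clip([1, 5, 9]) == [1, 5, 9]        # both bounds omitted: no-op
assert spec_clip([1, 5, 9], max=4) == [1, 4, 4]
try:
    legacy_clip([1, 5, 9], None, None)
except ValueError:
    pass  # legacy behavior: at least one bound required
```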