This appears to be an oversight in the implementation of fmax() and fmin() for non-Metal targets.
The same issue affects fmin(); it is described below in terms of fmax().
On Metal, fmax is implemented with an intrinsic, so it presumably meets Apple's definition of fmax(), which is:
Returns y if x < y, otherwise returns x. If one
argument is a NaN, fmax() returns the other
argument. If both arguments are NaNs, fmax()
returns a NaN. If x and y are denormals and the
GPU doesn’t support denormals, either value
may be returned.
It looks like the fmax implementation for the other targets tries to match that definition, but it only checks one of the special conditions. Notably, it does not explicitly return x when isnan(y), so that case depends on how each target's max() treats NaN, and it does not explicitly handle the case where x and y are both NaN.
__generic<T : __BuiltinFloatingPointType>
[__readNone]
[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_4_0_version)]
T fmax(T x, T y)
{
    __target_switch
    {
    case metal: __intrinsic_asm "fmax";
    default:
        if (isnan(x)) return y;
        return max(x, y);
    }
}
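For comparison, here is a minimal C++ sketch of the rules in the quoted definition. It is a reference model only, not Slang's code: the name fmax_reference is hypothetical, and it assumes plain IEEE comparison semantics, where any comparison involving NaN is false. Denormal handling is left out, since the definition allows either value there.

```cpp
#include <cmath>
#include <limits>

// Hypothetical reference model of the quoted fmax() rules (not Slang's code).
template <typename T>
T fmax_reference(T x, T y)
{
    // If x is NaN, return the other argument. When y is also NaN,
    // this returns NaN, which covers the both-NaN rule.
    if (std::isnan(x)) return y;

    // If only y is NaN, return x. This is the check that the
    // default branch quoted above does not perform explicitly.
    if (std::isnan(y)) return x;

    // Neither argument is NaN: returns y if x < y, otherwise returns x.
    return (x < y) ? y : x;
}
```

Under these assumptions, the matching change to the default branch above would probably be a second check, returning x when isnan(y), placed before the call to max(). fmin() would presumably need the mirrored check.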