shader-slang / slang

Making it easier to work with shaders
http://shader-slang.com
MIT License
2.19k stars 187 forks source link

fmax and fmin not implemented to metal spec for non-metal targets #5580

Open cheneym2 opened 2 hours ago

cheneym2 commented 2 hours ago

It appears to be an oversight in the implementation of fmax() and fmin() for non-metal targets.

Same issue with fmin is present as described below with fmax().

In the implementation of fmax, metal implements using an intrinsic, so presumably meets the Apple definition of fmax() which is:

Returns y if x < y, otherwise returns x. If one
argument is a NaN, fmax() returns the other
argument. If both arguments are NaNs, fmax()
returns a NaN. If x and y are denormals and the
GPU doesn’t support denormals, either value
may be returned.

It looks like there was an attempt to match that in the fmax implementation for other targets, but for some reason it only checks one of the special conditions, notably, it doesn't return x if isnan(y), and it doesn't return NaN if x and y are both NaN.

__generic<T : __BuiltinFloatingPointType>
[__readNone]
[require(cpp_cuda_glsl_hlsl_metal_spirv, sm_4_0_version)]
T fmax(T x, T y)
{
    __target_switch
    {
    case metal: __intrinsic_asm "fmax";
    default:
        if (isnan(x)) return y;
        return max(x, y);
    }
}
cheneym2 commented 1 hour ago

Also, the logic of fmax3() can be simplified when fmax() is fixed