Open SRHMorris opened 1 year ago
In pure math, that expression should never be negative. But perhaps it's possible that the limitations of floating-point math could on rare occasions result in a slightly negative value. It does seem most likely with values that are almost equal, as you suggest.
The Academy's CTL reference code in aces-dev also does not have a max() before the sqrt. Interesting that no one has ever reported this before.
But I think you're correct that a max() should probably be done before the square-root. This seems like it could potentially happen on any platform (CPU or GPU), so it should be inserted everywhere.
I've just realised I can also work around this by turning on DXC_ARG_IEEE_STRICTNESS
(-Gis
). So I'm assuming it must be the compiler reordering something. That makes me doubt it would happen on the CPU unless OCIO was compiled with unsafe math options.
I think I might also be in a special case because the HLSL OCIO spits out is for D3D11 and I'm basically running find/replace on it to convert it to something D3D12 can use (and using DXC and shader model 6).
I'm using the "studio-config-v1.0.0_aces-v1.3_ocio-v2.1" config and OCIO v2.2.0. Performing any DisplayView transform produces nans (showing up as black on the screen).
OCIO produces this line of HLSL code for the ACES_Glow10 (Forward) section:
I can work around this issue by replace that line with
sqrt(max(0.0, ...))
. I've not yet found the specific ACES2065-1 values that produce this issue (but by eye they seem to all be close to gray). My source is a 8 bit Rec.709 video converted to ACES2065-1 and then the rgb components scaled by some value > 1 (e.g. 3). I'm doing the scale with a MatrixTransform.It seems like the value in the sqrt can become negative in some circumstances. Possibly due to fp issues? Interestingly this only appears when the shader is compiled with
DXC_ARG_SKIP_OPTIMIZATIONS
(-Od
).This shader is running on an RTX 3080Ti, I've not tested on another machine on windows. I don't get this issue with Metal on mac. But it seems safest to me to just always add that
max
function.