fix: prevent NaN from cropping up in testshade for icx (or anyone else)

fpsunflower commented 1 month ago

I'm not following how icx could lead to a NaN here -- cosf() has to return a number in [0,1] so sqrt(1 - z * z) should be well defined, no?

Was this to address the icx CI failure (which still seems to be there) or another case?

AlexMWells commented 1 month ago

technically, just 1 ulp over 1.0 or below -1.0 could cause 1-z*z to be negative. This was just a "possible" source of nan from a code review, so not indicative that is was triggering a nan, just that theoretically it could.

fpsunflower commented 1 month ago

I understand, but if cosine returned a value above 1.0 for small inputs I would consider that a bug with the math library.

lgritz commented 1 month ago

Good guess, but we still get the problem even with this fix (which I suppose I can revert, maybe it was premature to merge?).

I think that the NaNs -- which didn't necessarily come from this statement, though it seemed like a good guess -- are the symptoms rather than the cause of the underlying issue.

I think most or all of the "render" tests fail, but the failure modes are different -- some hit that assertion (which didn't involve the NaN, it was only when added another print at the time of the assertion did I see other things being NaN), others crash outright, others do neither but timeout after over 10 minutes for that one test (maybe stuck in an infinite loop while constructing the BVH?). All only on icx.

I think somebody with icx handy is going to need to look at this in a debugger.

steenax86 commented 1 month ago

I will investigate. To confirm, all these failures are in testsuite/render-*?

lgritz commented 1 month ago

Yeah, if you look at the CI logs for any of the icx failure, you should be able to see which tests are failing. Seems to happen every time, so any of the ones that either crash or hit an assertion ought to be trivial to pinpoint with a debug build and any debugger.

AcademySoftwareFoundation / OpenShadingLanguage

fix: prevent NaN from cropping up in testshade for icx (or anyone else) #1874