mmp / pbrt-v4

Source code to pbrt, the ray tracer described in the forthcoming 4th edition of the "Physically Based Rendering: From Theory to Implementation" book.
https://pbrt.org
Apache License 2.0
2.89k stars 457 forks source link

NaNs from BxDF::rho() #313

Closed shadeops closed 1 year ago

shadeops commented 1 year ago

NaNs

When doing a few renders of the lovely Kroken scene I noticed a handful of NaNs being produced -

pbrt --displacement-edge-scale 5  --force-diffuse  camera-1.pbrt 

[ tid 1663970 @    29.957s util/image.h:429 ] ERROR NaN at pixel 1153,63 comp 3
[ tid 1663970 @    29.957s util/image.h:429 ] ERROR NaN at pixel 1153,63 comp 4
[ tid 1663970 @    29.957s util/image.h:429 ] ERROR NaN at pixel 1153,63 comp 5
[ tid 1663971 @    29.971s util/image.h:429 ] ERROR NaN at pixel 1124,963 comp 3
[ tid 1663971 @    29.971s util/image.h:429 ] ERROR NaN at pixel 1124,963 comp 4
[ tid 1663971 @    29.971s util/image.h:429 ] ERROR NaN at pixel 1124,963 comp 5

NaN Sources

The source of the NaNs comes from SampledSpectrum BxDF::rho(...) where bs->pdf is equal to 0.

https://github.com/mmp/pbrt-v4/blob/f8f260fe3126acabaa58caeffed96dc901f6e55c/src/pbrt/bxdfs.cpp#L1135-L1140

bs-pdf becomes 0 via different BxDF definitions stemming from random samples, u, being very close to 1.0. For example in DiffuseBxDF::Sample_f() a very close to 1.0 sample can cause wi.z to be 0 through the SampleCosineHemisphere().

https://github.com/mmp/pbrt-v4/blob/f8f260fe3126acabaa58caeffed96dc901f6e55c/src/pbrt/bxdfs.h#L51-L54

Similarly in ConductorBxDF::Sample_f() you can get a pdf equal to 0 through the https://github.com/mmp/pbrt-v4/blob/f8f260fe3126acabaa58caeffed96dc901f6e55c/src/pbrt/bxdfs.h#L316-L317

Possible Fixes?

Have Rho() ignore pdf =0

I'm not sure what the correct fix is. The simplest way to avoid the NaNs would be to update the conditional in SampledSpectrum BxDF::rho(...) to

        if (bs && bs->pdf != 0)
            r += bs->f * AbsCosTheta(bs->wi) / bs->pdf;

Return empty optionals

Alternatively is it better to go to each of the various sources that are generating the pdf == 0 and fix the problem there? Something like -

        Float pdf = CosineHemispherePDF(AbsCosTheta(wi));
        if (pdf == 0)
            return {};

        return BSDFSample(R * InvPi, wi, pdf, BxDFFlags::DiffuseReflection);

But this would require updates to a handful of the BxDFs.

Happy to make a PR or which ever approach or an alternative solution.

Cheers!

pbrt4bounty commented 1 year ago

Hi.. I'm curious to know why I can't reproduce this error here, using my binaries obtained with the Clang 14 compiler. I only get this error when using the MSVC build

pbrt4bounty commented 1 year ago

After a few test, seems that the culprit are bs->wi , not bs->pdf

// Compute estimate of $\rho_\roman{hd}$
pstd::optional<BSDFSample> bs = Sample_f(wo, uc[i], u2[i]);
if (bs) {
    if (AbsCosTheta(bs->wi) == 0.0f)
        printf("Really AbsCosTheta(bs->wi) is 0\n");
    r += bs->f * AbsCosTheta(bs->wi) / bs->pdf;
}

This meant that this operation are really is r += bs->f * 0.0 / bs->pdf which directly causes a NAN error message

F:\devs\pbrt-v4-scenes\kroken>c:/apps/pbrt4/pbrt_gpu.exe --displacement-edge-scale 5  --force-diffuse  camera-1.pbrt
pbrt version 4 (built Dec 25 2022 at 19:59:02)
Copyright (c)1998-2021 Matt Pharr, Wenzel Jakob, and Greg Humphreys.
The source code to pbrt (but *not* the book contents) is covered by the Apache 2.0 License.
See the file LICENSE.txt for the conditions of the license.
Warning: lights.pbrt:7:4: Converting non-RGB "L" parameter to RGB so that a portal light can be used.
Rendering: [++++++                                                                                            ]  (8.3s|120.5s)
AbsCosTheta(bs->wi) and bs->pdf is 0
[ tid 4372 @     0.000s F:\REPOS\pbrt4blender\src\pbrt\cpu\integrators.cpp:266 ] ERROR Not-a-number radiance value returned for pixel (1153, 63), sample 1. Setting to black.
Rendering: [++++++++++++++++++++++++                                                                          ]  (37.0s|112.7s)
AbsCosTheta(bs->wi) and bs->pdf is 0
[ tid 7452 @    29.579s F:\REPOS\pbrt4blender\src\pbrt\cpu\integrators.cpp:266 ] ERROR Not-a-number radiance value returned for pixel (1384, 7), sample 7. Setting to black.
Rendering: [+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++                 ]  (126.3s|26.9s)
bs->pdf is 0
[ tid 16676 @   118.281s F:\REPOS\pbrt4blender\src\pbrt\cpu\integrators.cpp:266 ] ERROR Not-a-number radiance value returned for pixel(1124, 963), sample 14. Setting to black.
Rendering: [+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++             ]  (133.6s|19.7s)
AbsCosTheta(bs->wi) and bs->pdf is 0
[ tid 20176 @   125.781s F:\REPOS\pbrt4blender\src\pbrt\cpu\integrators.cpp:266 ] ERROR Not-a-number radiance value returned for pixel(395, 1136), sample 15. Setting to black.
Rendering: [++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++]  (159.0s)
[ tid 4560 @   151.475s util/image.h:429 ] ERROR NaN at pixel 1153,63 comp 3
[ tid 4560 @   151.478s util/image.h:429 ] ERROR NaN at pixel 1153,63 comp 4
[ tid 4560 @   151.478s util/image.h:429 ] ERROR NaN at pixel 1153,63 comp 5
[ tid 11620 @   151.526s util/image.h:429 ] ERROR NaN at pixel 1124,963 comp 3
[ tid 11620 @   151.526s util/image.h:429 ] ERROR NaN at pixel 1124,963 comp 4
[ tid 11620 @   151.526s util/image.h:429 ] ERROR NaN at pixel 1124,963 comp 5

Only my two cents.. Cheers Edited: I added the complete test for the both possible zero values

mmp commented 1 year ago

Nice find! As far as the right fix, I think the right place to handle that is in the code that uses the sample, i.e., in rho(). Reviewing the lights and BxDFs, they're a bit inconsistent (e.g. ProjectionLight performs the courtesy of returning an unset optional value if the radiance in a direction is zero, but other lights don't.) So I think it makes sense for things that sample themselves to be allowed to return samples that aren't usable and let the caller decide if a non-unset sample is worthwhile. That also fits with e.g. the caller figuring out if a light sample is on the back side of a non-transmissive surface, etc...

A PR would be much appreciated, @shadeops !