Issue with numerical consistency of renderings

Jovp commented 3 years ago

Hello, First thank you for this great piece of work.

We are working on some graphics pipeline and while testing the path tracer it seems like we found some inconsistencies in the renderings. More specifically when rendering from inside a big sphere lit from the inside by a point light source located at (0,0,0) and the sphere centered at (0,0,0) with diffuse (1,1,1) material, we found that the intensity of the rendering is not decreasing with the square of the sphere radius. We place a camera within the sphere and render an image, which as expected has constant values. Here are the values found for a light source of intensity 100:

sphere radius 5: rendered value 22.95
sphere radius 10: rendered value 6.066
sphere radius 100: rendered value 0.0702

Of course, the mesh is not perfectly spherical, but I feel we should not observe such variations, the 0.0702 should be 100 times smaller than 6.066, meaning almost 15% error.

Thanks for your help, Julien

natevm commented 3 years ago

Hi Julien,

Yes, light intensity is an ongoing challenge for us.

We do not use an exact r^2 falloff due to the singularity at r=0, which causes unwanted NaNs and Infinities. Instead, we clamp r to be at least 1, which loses energy for distances less than 1, but is numerically stable and more artist directable. See the relevant code here: https://github.com/owl-project/NVISII/blob/54bd3d61a176d8b531bdcecdd0c7757aed22c5fa/src/nvisii/devicecode/path_tracer.cu#L1302

Additionally, the size of a triangle normally effects the solid angle of the light source, but this behavior is disabled by default, as a user study we conducted internally suggested this makes scene randomization for training NNs prohibitively difficult. The current behavior is made to follow Renderman light sources, but with area normalization on. The feature to account for surface area can be turned on here though: https://nvisii.com/light.html#nvisii.light.use_surface_area

If you are able to, please look over the rest of the other tracer, and see if you can spot any errors in the logic employed there. Unfortunately I’m the only one who has looked at the code, and I could use an additional pair of eyes. The code for the path tracer should be relatively self contained, where we do a simple NEE + MIS strategy.

natevm commented 3 years ago

Also, if you can, would you be willing to share your code to reproduce this energy test? I’d imagine that would be quite useful for integration testing down the line.

Jovp commented 3 years ago

I see, very interesting to get your feedback so quickly!

I looked at the code you mentioned and I get it, but in our test case, everything is at more than 1 of distance from the light source, which is a point light, so this should not affect the falloff isn't it?

Unfortunately, we cannot share our code at this stage, but will come back to you when we can! Thanks again for helping!

Jovp commented 3 years ago

I will try to take a look at the logic of the path tracer and see if I can spot any issues.

natevm commented 3 years ago

Yup, I think that logic holds.

I would not be surprised if we have some quirks still with energy that need to be worked out. Thank you for your efforts. 🙏

(accidentally closed issue, on mobile. should be reopened now)

Jovp commented 3 years ago

I took a quick look, and I am probably not as expert as you are but here are a few questions/hints:

In the case of a point light source, I don't understand the pdf: code As it is a point, I am not sure it can be treated with a pdf other than a Dirac of 1. Especially that means that the distance to the point light source also influences the factor w here: code This could be the reason for the issue I mentioned.

Other question: Why is the pdf divided by two when there is a single light source: code

natevm commented 3 years ago

I am mostly self-taught, so I'm not much of an expert either. haha

In the case of a point light source, I don't understand the pdf: code

This doesn't make much sense to me either, but my objective with point lights is to ultimately match the energy of the equivalent mesh light, assuming that surface area does not affect light intensity. I found that I needed to add this "PdfAtoW" code to get my point light sources to match the intensity of a spherical mesh light of unit size. Without that PDF, point light sources end up being several orders of magnitude darker than triangle mesh lights. Perhaps this PDF value should instead be set to 1 if light.enable_surface_area is used?

Fwiw, this PdfAtoW function is defined here: https://github.com/owl-project/NVISII/blob/54bd3d61a176d8b531bdcecdd0c7757aed22c5fa/src/nvisii/devicecode/lights.h#L7 I believe this code was taken from a shadertoy somewhere. The pdf of an emissive triangle is also located in that file around line 28

I think one issue with this code is that it assumes that the solid angle of a light source decreases in size with a falloff of r^2, when A) really this function should accept an arbitrary falloff (we aim for artist directability over physical correctness, and arbitrary falloff values are important for matching raster images to ray traced ones.) and B) we account for this falloff again in the accumulated light intensity here https://github.com/owl-project/NVISII/blob/54bd3d61a176d8b531bdcecdd0c7757aed22c5fa/src/nvisii/devicecode/path_tracer.cu#L1302.

So it feels a bit like we're doing a falloff twice, which is bad...

The 1.f/(4.f M_PI) there for point lights is basically stating, make this point light match a unit-sized sphere light, whose area is 1.f/(4.f M_PI), since r of a unit sphere is 1.

Why is the pdf divided by two when there is a single light source: code

Good question. Just talking through it, numLights is equal to the number of light entities instanced in the scene, but does not account for the additional dome light that surrounds the scene that is directly sampled. If the dome light sampling is enabled, I believe this code makes sense. However, if dome light sampling is disabled, I think that +1 there should be conditionally disabled.

natevm commented 3 years ago

For starters, what I could do is set the PDF of a point light to be 1 when surface area is enabled, and use that as a sort of "ground truth" for the intended energy behavior of mesh lights. From there, I can try getting a single emissive triangle with all vertices at the same location to produce the same energy as a point light (also assuming surface area is enabled).

This raises a strange question though. If the area of a triangle is 0, then the area of that triangle would go to 0, and so pdfA would become infinity (1 / area). PdfAtoW would take that infinite value and return an infinite probability. Dividing the intensity of the mesh light by an infinite probability would result in an intensity of 0, which would fundamentally disagree with the energy of a point light. Its almost like pdfA should be 1 / (area + 1) to prevent that singularity issue...

Jovp commented 3 years ago

Here are some thoughts: I assume you need some form of a reference for when surface_area is disabled. i.e. When you set the intensity of a sphere light source of radius r to 1, you know that the surface is 4pir^2 and you have a reference of the amount of energy E emitted in that case, that is independent of r. What you want, if you want to replicate this amount of energy for a point light, is that the Intensity I of the point light is premultiplied by a factor f such that 4pif=E, ie f=E/(4pi). And in that case you should be able to drop the weird pdf.

owl-project / NVISII

Issue with numerical consistency of renderings #109