TheThief opened 1 year ago
From (now closed #1272):
In 6.2 Using a Uniform PDF Instead of a Perfect Match, the scattered ray is sampled from the Lambertian (cosine) distribution in rec.mat->scatter(), but the PDF value is set to 1/(2*pi). I think this goes against Monte Carlo theory. Can it still converge to the correct answer? The book says "This will still converge on the correct answer", but I'm confused by this.
FWIW, I got different results between 6.1 and 6.2.
6.1
6.2
This seems to make sense, actually. In 6.1 we multiply and divide by the same PDF value, so the two factors cancel each other. In 6.2, brighter rays are penalized less relative to darker ones, since all are divided by the same constant.
I think this sentence in 6.2:
We will sample using a uniform PDF about the hemisphere, so we'll set the denominator to 1/2𝜋. This will still converge on the correct answer ...
goes against what was said earlier in the book. For example, when talking about directing rays towards the light in 3.3:
We could lessen this problem if we steered both of these rays toward the light, but this would cause the scene to be inaccurately bright. ... We can remove this inaccuracy by downweighting those samples to adjust for the over-sampling.
To make sure the problem was not with my implementation, I took the code from the TheNextWeek folder and applied the changes introduced in 6.1 and 6.2:
diff --git a/src/TheNextWeek/camera.h b/src/TheNextWeek/camera.h
index c8580d20..cfb8de7a 100644
--- a/src/TheNextWeek/camera.h
+++ b/src/TheNextWeek/camera.h
@@ -151,7 +151,12 @@ class camera {
if (!rec.mat->scatter(r, rec, attenuation, scattered))
return color_from_emission;
- color color_from_scatter = attenuation * ray_color(scattered, depth-1, world);
+ double scattering_pdf = rec.mat->scattering_pdf(r, rec, scattered);
+ //double pdf = scattering_pdf;
+ double pdf = 1 / (2*pi);
+
+ color color_from_scatter =
+ (attenuation * scattering_pdf * ray_color(scattered, depth-1, world)) / pdf;
return color_from_emission + color_from_scatter;
}
diff --git a/src/TheNextWeek/main.cc b/src/TheNextWeek/main.cc
index b0191da8..a0b1a108 100644
--- a/src/TheNextWeek/main.cc
+++ b/src/TheNextWeek/main.cc
@@ -401,7 +401,7 @@ void final_scene(int image_width, int samples_per_pixel, int max_depth) {
int main() {
- switch (0) {
+ switch (7) {
case 1: bouncing_spheres(); break;
case 2: checkered_spheres(); break;
case 3: earth(); break;
diff --git a/src/TheNextWeek/material.h b/src/TheNextWeek/material.h
index 1b168879..8b6cac84 100644
--- a/src/TheNextWeek/material.h
+++ b/src/TheNextWeek/material.h
@@ -31,6 +31,11 @@ class material {
) const {
return false;
}
+
+ virtual double scattering_pdf(const ray& r_in, const hit_record& rec, const ray& scattered)
+ const {
+ return 0;
+ }
};
@@ -52,6 +57,12 @@ class lambertian : public material {
return true;
}
+ double scattering_pdf(const ray& r_in, const hit_record& rec, const ray& scattered)
+ const override {
+ auto cos_theta = dot(rec.normal, unit_vector(scattered.direction()));
+ return cos_theta < 0 ? 0 : cos_theta/pi;
+ }
+
private:
shared_ptr<texture> tex;
};
And I got results similar to mine:
Unmodified code from TheNextWeek:
Applied changes from 6.1, which should result in an identical image:
Applied change from 6.2:
According to 6.2:
You should get a very similar result to before, only with slightly more noise, it may be hard to see.
But we don't... We get a brighter image. 😕 😖
TLDR: The code from section 6.2 is not valid; it causes the program to converge to a different solution, one that does not respect the equation we are trying to solve. The scatter function needs to change alongside the change to the pdf value. If you enjoy the math down below, I would be happy to write a PR adding some of it to the book.
To follow up on the discussion, looking at the math more precisely: the code from section 6.2 should not converge to the same solution as the code of 6.1, and 6.2 does in fact perform an invalid Monte Carlo integration. I'll try to show this with some math (summarizing everything needed up to that point); it's a good exercise for me and will hopefully convince some people of why the results are (expectedly) different.
When given an integral of a function $f$ over a domain $D$, naive Monte Carlo integration says to uniformly (<- VERY important) sample points of the domain and sum evaluations of the function at those points. The basic intuition is that we find the average of the function over the domain and multiply this average by the size of the domain:
$$ \int_D f(\mathbf{x})d\mathbf{x} \approx \frac{S}{N} \sum_1^N f(x_i) $$
In the above formula the $x_i$ are the points sampled uniformly (<- VERY important) from the domain and $S$ is the surface area of the domain (or its (hyper)volume in higher dimensions). In the code we try to get the (approximate) value of the following integral, which is over some part (a hemisphere) of a sphere:
$$ Color_o(\omega_o) = \int_{\omega_i} A(\omega_i, \omega_o)\, pScatter(\omega_i, \omega_o)\, Color_i(\omega_i, \omega_o)\, d\omega_i $$
Combining the two formulas we should expect:
$$ Color_o(\omega_o) \approx \frac{2\pi}{N} \sum_1^N A(\omega_i, \omega_o)pScatter(\omega_i, \omega_o)Color_i(\omega_i, \omega_o) $$
Notice that the right side looks almost exactly like what is given in the code of section 6.2. The bad news comes from the VERY important need for a UNIFORM distribution of the sampled points over the domain; in 6.2, however, $\omega_i$ is obtained via the unmodified scatter function, meaning a $\cos(\theta)$ distribution, not a uniform one. Hence the results differ so much (in brightness, not just noise).
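The mismatch is easy to check numerically. Below is a standalone sketch (my own demo code and names, not the book's) integrating the Lambertian scattering pdf $\cos(\theta)/\pi$ over the hemisphere, whose exact value is 1. Estimator (a) samples uniformly and applies the $\frac{S}{N}\sum f(x_i)$ formula with $S = 2\pi$; estimator (b) mimics 6.2, generating cosine-weighted directions but still dividing by the uniform pdf $\frac{1}{2\pi}$:

```cpp
#include <cmath>
#include <random>

const double kPi = 3.14159265358979323846;

// (a) Uniform hemisphere sampling: cos(theta) is uniform on [0,1],
// and we multiply the sample average by the domain size S = 2*pi.
double estimate_uniform(int N, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < N; ++i) {
        double cos_theta = uni(rng);            // uniform over the hemisphere
        sum += cos_theta / kPi;                 // f = scattering pdf
    }
    return (2.0 * kPi / N) * sum;               // (S/N) * sum, S = 2*pi
}

// (b) 6.2-style mismatch: cosine-weighted samples, uniform divisor.
double estimate_mismatched(int N, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < N; ++i) {
        double cos_theta = std::sqrt(uni(rng)); // cosine-weighted sample
        double f = cos_theta / kPi;             // f = scattering pdf
        sum += f / (1.0 / (2.0 * kPi));         // divided by the WRONG pdf
    }
    return sum / N;
}
```

Estimator (a) converges to 1; estimator (b) converges to $4/3 \approx 1.33$, which matches the brightening seen in the renders above.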
This could already conclude the explanation, but I'll be a pain for a little longer. Some people might not be convinced by it, since I started from uniform Monte Carlo, obtained a formula very similar to the one in 6.2, and then just pointed at the difference, whereas in 6.2 the change is justified as being non-uniform Monte Carlo.
Let's work through this quickly, using the following change of variables ($ICD$ is, as in the book, the Inverse Cumulative Distribution Function of $p$, and maps $[0,1]\rightarrow D$):
$$ \displaylines{x = ICD(u) = P^{-1}(u) \\ P(x) = u \\ \downarrow \text{differentiate both sides} \downarrow \\ p(x)\,dx = du \\ dx = \frac{du}{p(x)} = \frac{du}{p(ICD(u))}} $$
Plugging this change of variables into the initial integral gives:
$$ \int_D f(\mathbf{x})\,d\mathbf{x} = \int_0^1 \frac{f(ICD(u))}{p(ICD(u))}\,du $$
Then, applying uniform Monte Carlo to this second integral gives the following (once again, the $u_i$'s need to be uniformly distributed):
$$ \int_0^1 \frac{f(ICD(u))}{p(ICD(u))}\,du \approx \frac{1}{N} \sum_1^N \frac{f(ICD(u_i))}{p(ICD(u_i))} $$
From this we can see that, indeed, the numerator (the function to evaluate) should take the same input as the pdf. In 6.2, $p = \frac{1}{2\pi}$, so it should be that $ICD(u)$ is uniform over the hemisphere, but as said in previous comments, the scatter function wasn't changed to provide this behaviour.
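The matched version can also be checked numerically. Here is a standalone sketch (my own demo names, not book code) estimating the hemisphere integral of $\cos(\theta)/\pi$, this time dividing each sample by the pdf of the distribution that actually generated it, per $f(ICD(u_i))/p(ICD(u_i))$:

```cpp
#include <cmath>
#include <random>

const double kPi = 3.14159265358979323846;

// Monte Carlo estimate of the hemisphere integral of cos(theta)/pi,
// where the divisor is the pdf of whichever distribution generated
// the sample, so numerator and denominator take the same input.
double estimate_matched(int N, bool uniform_hemisphere, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < N; ++i) {
        double cos_theta, pdf;
        if (uniform_hemisphere) {
            cos_theta = uni(rng);               // uniform: cos(theta) ~ U[0,1]
            pdf = 1.0 / (2.0 * kPi);            // its pdf over solid angle
        } else {
            cos_theta = std::sqrt(uni(rng));    // cosine-weighted sample
            pdf = cos_theta / kPi;              // its matching pdf
        }
        sum += (cos_theta / kPi) / pdf;         // f / p, inputs now agree
    }
    return sum / N;
}
```

Both variants converge to the exact value 1. The cosine-matched one even has zero variance here, because $f/p$ is constant, which is exactly why a perfect pdf match gives a noise-free estimate for this term.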
I'll conclude this overly long comment with a side suggestion: adding some slightly stricter math (as written here) to the book, even as sidenotes or appendix chapters, could go a long way toward helping some people (like me) consolidate their understanding of the material, and would help more people catch issues like this one in their own work. If you feel the same, I will open a new issue for this and start working on a PR.
It's not very clearly explained IMO, but the pdf value you're multiplying by is the value from the ideal pdf you want to use for the surface, and the one you divide by is the value from the pdf you've actually used to generate the rays (which can e.g. be biased towards lights for importance sampling).
But in 6.2 as it stands at the time of writing (4.0.0-alpha.1, I believe), you're only changing the value you divide by to a uniform value, while still generating rays with the cosine distribution inside the material! The two don't match, and instead of correcting for the distribution being used, it accidentally alters the material to a double-cosine reflectance.
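One way to see the double-cosine claim (a quick calculation of my own, not from the book): with directions drawn from $p_{gen}(\omega) = \cos\theta/\pi$ but each sample divided by $1/(2\pi)$, the estimator's expectation is

$$ \int_{hemisphere} \frac{\cos\theta/\pi}{1/(2\pi)} \cdot \frac{\cos\theta}{\pi}\, d\omega = \int_{hemisphere} \frac{2\cos^2\theta}{\pi}\, d\omega = \frac{2}{\pi} \cdot \frac{2\pi}{3} = \frac{4}{3} $$

so the surface effectively reflects with a $\cos^2\theta$ lobe and returns $4/3$ of the energy a true Lambertian term (which integrates to 1) would, hence the brighter image.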
The change made in 6.3, generating rays over a uniform hemisphere, should be in 6.2 as well, to get the claimed effect of the same result with slower convergence (i.e. more noise).
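For illustration, here is a minimal standalone sketch (my own struct and helper names, mirroring what the book's random_on_hemisphere() does) of generating directions uniformly over the hemisphere around a normal, which is the sampling 6.2's scatter() would need for the $1/(2\pi)$ divisor to be valid:

```cpp
#include <cmath>
#include <random>

struct vec3d { double x, y, z; };

double dot3(const vec3d& a, const vec3d& b) {
    return a.x*b.x + a.y*b.y + a.z*b.z;
}

// Rejection-sample a point in the unit ball, normalize it to get a uniform
// direction on the sphere, then flip it into the hemisphere of the normal.
vec3d random_on_hemisphere(const vec3d& normal, std::mt19937& rng) {
    std::uniform_real_distribution<double> uni(-1.0, 1.0);
    for (;;) {
        vec3d v{uni(rng), uni(rng), uni(rng)};
        double len2 = dot3(v, v);
        if (len2 > 1e-12 && len2 <= 1.0) {
            double len = std::sqrt(len2);
            vec3d unit{v.x/len, v.y/len, v.z/len};
            return dot3(unit, normal) >= 0.0
                 ? unit
                 : vec3d{-unit.x, -unit.y, -unit.z};
        }
    }
}
```

With this in scatter() and pdf set to $1/(2\pi)$, numerator and denominator describe the same distribution, and the renderer should converge to the 6.1 image, just noisier.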
Also, a lot of the code past this point calls the material function that returns a scatter ray in an out parameter, then throws that ray away and immediately generates another one that it actually uses, which is inefficient.