mmp / pbrt-v4

Source code to pbrt, the ray tracer described in the forthcoming 4th edition of the "Physically Based Rendering: From Theory to Implementation" book.
https://pbrt.org
Apache License 2.0
2.89k stars, 454 forks

BDPT: importance should have pixel norm #456

Closed: rainbow-app closed this issue 5 days ago

rainbow-app commented 2 weeks ago

Hello!

UPD: 2024 Nov 12. Three posts here:


I'm writing this against v4 but using v3; I'm sure v4 is affected as well.

The importance for a single pixel is the same as in the book, except that the area in the denominator is the pixel area. A brief derivation is here: https://github.com/mmp/pbrt-v3/issues/336#issuecomment-2391946322 (I tried to explain it to someone, thinking I understood it...). It's easy to understand: it is the pixels that measure, not the sensor. There is no measurement that the whole sensor provides, hence no importance for the sensor.
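In formulas (my sketch, assuming pbrt's pinhole perspective importance, which is $1/(A\cos^4\theta)$ inside the viewing frustum and zero outside, with $A$ the film area projected onto the $z=1$ plane):

$$
W_e^{\text{sensor}}(\omega)=\frac{1}{A_{\text{film}}\cos^4\theta},
\qquad
W_e^{\text{pixel}}(\omega)=\frac{1}{A_{\text{pixel}}\cos^4\theta}
=\frac{n_x n_y}{A_{\text{film}}\cos^4\theta},
$$

where $A_{\text{pixel}}=A_{\text{film}}/(n_x n_y)$ for an $n_x\times n_y$ image.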

A test case where the pixel norm is clearly better: [diagram: wnorm-diagram]

To make pixel norm:
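A minimal sketch of the change, in the style of pbrt-v3's PerspectiveCamera::We in cameras/perspective.cpp (the names A, lensArea, cos2Theta follow that function; treat this as an illustration rather than the exact patch):

```cpp
// Tail of PerspectiveCamera::We (pbrt-v3 style). A is the film area
// projected onto the z = 1 plane; the stock sensor-norm code returns
// 1 / (A * lensArea * cos^4(theta)).
Float nPixels = (Float)film->fullResolution.x * (Float)film->fullResolution.y;

// Pixel norm: the normalizing area becomes A / nPixels, i.e. one pixel's
// area, so the importance is scaled up by the number of pixels.
return Spectrum(nPixels / (A * lensArea * cos2Theta * cos2Theta));
```

(If I read the code right, Pdf_We is a normalized density and is unaffected by this change; how it should enter the MIS weights is the subject of the later posts.)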

Result, coneangle=30, spp=1, sensor norm vs. pixel norm: [images: wnorm-ws (sensor norm), wnorm-wp (pixel norm)]


How to reproduce

Some hard-coded numbers, for resolution 11x11 (the snippet below reproduces them):

- size of the area that maps to 1 pixel: pxsz = tan(60*pi/180/2)/5.5 = 0.1050 (meters)
- x coordinate where we point the light: xpos = pxsz*4 = 0.4199
- x coordinate where the light is: pxsz*4*2 = 0.8398
- distance from the light to the illuminated point: dist = sqrt(xpos^2+1) = 1.0846
- cone angle covering that area (1 pixel): coneangle = pxsz/2*(1/dist)/dist*180/pi = 2.5565
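A standalone check of these numbers (assumes fov = 60 degrees, an 11x11 film, the quad at z = 0, and the camera at z = -1, as in the scene below):

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double Pi = 3.14159265358979323846;
    // Half the image spans tan(fov/2) on the z = 1 plane; 5.5 pixels per half-axis.
    double pxsz = std::tan(60.0 * Pi / 180.0 / 2.0) / 5.5;            // 0.1050 m
    double xpos = pxsz * 4.0;                                         // 0.4199
    double lightX = xpos * 2.0;                                       // 0.8398
    double dist = std::sqrt(xpos * xpos + 1.0);                       // 1.0846
    // Cone angle subtending one pixel at that distance, in degrees.
    double coneangle = pxsz / 2.0 * (1.0 / dist) / dist * 180.0 / Pi; // 2.5565
    std::printf("pxsz=%.4f xpos=%.4f lightX=%.4f dist=%.4f coneangle=%.4f\n",
                pxsz, xpos, lightX, dist, coneangle);
    return 0;
}
```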

Sampler "random" "integer pixelsamples" [1] Integrator "bdpt" "integer maxdepth" [1]

LookAt 0 0 -1 0 0 0 0 1 0 Camera "perspective" "float fov" [60] Film "image" "integer xresolution" [256] "integer yresolution" [256] "string filename" "wnorm.pfm"

WorldBegin

Material "matte" "color Kd" [.5 .5 .5] "float sigma" [0]

Shape "trianglemesh" "point P" [-1 -1 0 -1 1 0 1 1 0 1 -1 0] "integer indices" [ 0 1 2 2 3 0]

LightSource "spot" "point from" [0.8398 0 -1] "point to" [0.4199 0 0] "float coneangle" [30] "color I" [5 5 5]

WorldEnd

rainbow-app commented 2 weeks ago

The pixel norm significantly affects images for maxDepth > 1. Here are images of the scene from my old cropwindow issue (same convert command, downsized), vanilla (= sensor norm) vs. pixel norm: [images: z-ws-r (vanilla), hall-wp-res (pixel norm)]

Direct illumination on the wall is the same; I "zoomed" the values to show only the darks.

UPD: that's at spp=1; if spp is increased, the difference vanishes.

rainbow-app commented 2 weeks ago

Briefly: there is a mistake (importance is sensor-normed), but it happens to solve a later problem that is never mentioned. However, the question still remains of how to add the light image.

Longer:

There is no single ("overall", whole-sensor) importance. We have many importances, one for each pixel, and a pixel's importance has the pixel area in the denominator.

The t=1 strategy does not (usually) produce samples for the given pixel. Hence either a new theory is required that considers the whole sensor, or an ad-hoc rule for how to add the light image (the splats). Veach vaguely describes his ad-hoc solution in 10.3.4.3 (p. 313); it looks like he is using the pixel norm, but he gives no argument.

Consider an example case: fixed pixel area, some bounded scene in the FOV, and resolution going to infinity (fov → 180°). The number of light subpaths grows with the pixel count while the scene occupies a bounded part of the image, so the number of splats per pixel there goes to infinity and the light image (the scene at the center) becomes statistically superb, without noise. However, in Veach's theory each pixel is separate/single, so his MIS weights will not account for the superb noise-free light image. So we had better correct the MIS weights (which were computed with W per pixel). This is the problem that is never mentioned in the text.

My basic argument (also vague): for the ad-hoc rule we are not interested in separate pixel measurements; we sum the whole light image. Splats land everywhere, but we monitor their contribution only within the sensor area. So a reasonable attempt is to change the camera pdf to be a density with respect to the sensor area, for MIS purposes only. It turns out (from some simple formulas) that this ad-hoc change does the above job pretty well. Notice that we do not need a per-sensor "overall" importance, just the camera pdf.
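Written out (my sketch, assuming the perspective camera's directional density has the usual $1/(A\cos^3\theta)$ form): switching the reference area from one pixel to the whole sensor merely rescales the density by the pixel count,

$$
p^{\text{sensor}}(\omega)=\frac{1}{A_{\text{film}}\cos^3\theta}
=\frac{p^{\text{pixel}}(\omega)}{n_x n_y},
$$

and it is this rescaled density that would enter the MIS weights for the t=1 strategy.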

In pbrt the camera pdf is always per whole sensor, so the MIS weights for summing the light and camera images come out (erroneously) correct. The measurements, on the other hand, are needed in two places. (1) When starting a camera subpath we need the ratio W/p (so that beta = 1), and there the erroneous per-sensor area cancels. (2) In splats, where the per-sensor W (which is small) is not divided by its p. However, in Film::WriteImage you do not divide the splats by the number of pixels (or of BDPT samples), the way Veach divides the light image; you simply sum contributions, making W effectively large. This non-division effectively puts the pixel area into the denominator of W, so it cancels the whole-sensor mistake for splats.
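A sketch of that cancellation in my notation, assuming $N=n_x n_y\cdot\text{spp}$ light subpaths and a splat scale of 1/spp at write time (as in pbrt-v3's BDPT): for a fixed pixel, with $f_i$ the MIS-weighted contribution of the $i$-th light subpath to that pixel (zero if its splat lands elsewhere),

$$
\frac{1}{\text{spp}}\sum_{i=1}^{N}W^{\text{sensor}}f_i
\;\approx\; n_x n_y\,\mathbb{E}\big[W^{\text{sensor}}f\big]
=\mathbb{E}\big[W^{\text{pixel}}f\big],
$$

since $W^{\text{pixel}}=n_x n_y\,W^{\text{sensor}}$. Summing without dividing by the pixel count is exactly what turns the sensor norm into the pixel norm for the light image.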

All of this would be much easier to see from formulas, but that would take more time to present.


Above I tried to justify the pbrt mistake (the sensor norm). However, the question still remains: how should the light image be added?

In the 1st post, the pixel-norm image (the 2nd one, fixed) has clearly lower variance than the sensor-norm image (the 1st one, vanilla).

In the 2nd post, the pixel-norm image (the 2nd one, fixed) converges more slowly than vanilla.

Do we want to "do it right" (pixels treated separately, "Veach classic"), ignoring the great accuracy of the light image when it happens to be computed, or do we want to use that statistical accuracy but "do it vaguely"?

(Of course, other ad-hoc rules could be considered, or a proper solution could be sought.)

I'm leaving this open to attract attention, but feel free to close it.

rainbow-app commented 5 days ago

This issue is misplaced (there is no serious problem with the code; it should be filed against the book), and it is vague and has been edited beyond economical repair. I'll create a new, hopefully clearer, one later.