The pixel norm significantly affects images for maxDepth > 1. Here are images of the scene from my old cropwindow issue (same convert command, downsized): vanilla (= sensor norm) and pixel norm.
Direct illumination on the wall is the same; I "zoomed" the values to show only the darks.
UPD: that's at spp = 1; if spp is increased, the difference vanishes.
Briefly: there is a mistake (the importance is sensor-normalized), and it happens to solve a later problem that is never mentioned. However, the question still remains: how should the light image be added?
Longer:
There is no single "overall" whole-sensor importance. We have many importances, one for each pixel, and a pixel's importance has the pixel area in the denominator.
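In my own notation (my assumption, not something pbrt defines): if $A$ is the film area appearing in the textbook importance $W_e$ and pixel $j$ covers the area $A_j = A/N_{\mathrm{pix}}$, then the per-pixel importance is the textbook one with the pixel area substituted, nonzero only for rays through pixel $j$:

$$
W_e^{(j)}(x,\omega) \;=\; \frac{A}{A_j}\, W_e(x,\omega) \;=\; N_{\mathrm{pix}}\, W_e(x,\omega),
$$

i.e. for the pinhole perspective camera $1/(A_j \cos^4\theta)$ instead of $1/(A \cos^4\theta)$ (ignoring the lens term).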
The t=1 strategy usually does not produce samples for the given pixel. Hence either a new theory is required that considers the whole sensor, or an ad-hoc rule for how to add the light image (= the splats). Veach vaguely describes his ad-hoc solution in 10.3.4.3 (p. 313); it looks like he is using the pixel norm, but he gives no argument.
Consider an example case: fixed pixel area, some bounded scene in the FOV, and resolution going to infinity (fov → 180°). The number of splats goes to infinity, so the light image (the scene at the center of the frame) becomes statistically superb, essentially noise-free. However, in Veach's theory each pixel is a separate, single measurement, so his MIS weights will not account for the superb noise-free light image. So we had better correct the MIS weights (which were computed with W per pixel). This is the problem that is never mentioned in the text.
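A back-of-the-envelope way to quantify this (my own rough estimate, just standard Monte Carlo scaling): the frame traces $N_{\mathrm{pix}} \cdot \mathrm{spp}$ light subpaths, and all of them can splat into the fixed central region where the scene sits, while each pixel's $t \neq 1$ strategies still use only that pixel's spp samples:

$$
\text{relative noise of the light image} \;\sim\; \frac{1}{\sqrt{N_{\mathrm{pix}}\,\mathrm{spp}}}
\qquad \text{vs.} \qquad
\text{relative noise of the per-pixel strategies} \;\sim\; \frac{1}{\sqrt{\mathrm{spp}}}.
$$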
A basic argument for the ad-hoc rule (one I came up with, also vague): we are not interested in separate pixel measurements, we sum the whole light image; splats land everywhere, but we monitor their contribution only within the sensor area. So a reasonable attempt is to change the camera pdf density to be with respect to the sensor area, for MIS purposes only. It turns out (from some simple formulas) that this ad-hoc change does the above job pretty well. Notice that we do not need a per-sensor "overall" importance, just the camera pdf.
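Concretely (again my notation, pinhole perspective camera, lens term dropped): the directional densities for sampling the film point through one pixel and through the whole sensor differ only by the area in the denominator, and the ad-hoc rule is to use the sensor-area version inside the MIS weights only:

$$
p^{\mathrm{pixel}}(\omega) \;=\; \frac{1}{A_j \cos^3\theta} \quad\text{(for directions through pixel } j\text{)},
\qquad
p^{\mathrm{MIS}}(\omega) \;=\; \frac{1}{A \cos^3\theta} \;=\; \frac{A_j}{A}\, p^{\mathrm{pixel}}(\omega).
$$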
In pbrt, the camera pdf is always with respect to the whole sensor, so the MIS weights for summing the light and camera images come out (erroneously) correct. The measurements, on the other hand, are needed in two places. (1) When starting a camera subpath we need the ratio W/p (so that beta = 1), and there the erroneous whole-sensor area cancels. (2) In the splats, where the whole-sensor W (which is small) is not divided by its p. However, in Film::WriteImage you do not divide the splats by the number of pixels (or by the number of BDPT samples), the way Veach divides the light image; you simply sum the contributions, which effectively makes W large. This non-division effectively puts the pixel area into the denominator of W, so it cancels the whole-sensor mistake for the splats.
All of this can be seen much more easily from formulas, but presenting them would take more time.
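Still, a very rough sketch of the cancellation (my notation; $C_k$ is the rest of the $k$-th splat's contribution, the sum runs over the splats landing in a given pixel, and $N_{\mathrm{pix}} \cdot \mathrm{spp}$ is the total number of light subpaths in the frame):

$$
\underbrace{\frac{1}{\mathrm{spp}} \sum_k W^{\mathrm{sensor}} C_k}_{\text{pbrt: splats scaled by } 1/\mathrm{spp}}
\;=\;
\frac{1}{N_{\mathrm{pix}}\,\mathrm{spp}} \sum_k \underbrace{\big(N_{\mathrm{pix}}\, W^{\mathrm{sensor}}\big)}_{=\,W^{\mathrm{pixel}}} C_k
\;=\;
\underbrace{\frac{1}{N_{\mathrm{pix}}\,\mathrm{spp}} \sum_k W^{\mathrm{pixel}} C_k}_{\text{Veach: divide by all light subpaths}},
$$

so not dividing by $N_{\mathrm{pix}}$ while using the sensor-normalized W gives the same light image as Veach's pixel-normalized rule.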
Above I tried to justify the pbrt mistake (the sensor norm). However, the question still remains: how should the light image be added?
In the 1st post, the pixel-norm image (2nd, fixed) clearly has lower variance than the sensor-norm image (1st, vanilla).
In the 2nd post, the pixel-norm image (2nd, fixed) converges more slowly than vanilla.
Do we want to "do it right" (pixels treated separately, "Veach classic"), ignoring the high accuracy of the light image whenever it is computed, or do we want to exploit that statistical accuracy but "do it vaguely"?
(Of course, other ad-hoc rules could be considered, or a proper solution could be attempted.)
I'm leaving this open to attract attention, but feel free to close it.
This issue is misplaced (there is no serious problem with the code; it is really about the book), and it is vague and edited beyond economical repair. I'll create a new, hopefully clearer one later.
Hello!
UPD 2024 Nov 12: three posts here:
I'm filing this against v4 but working with v3; I'm sure v4 is affected as well.
The importance for a single pixel is the same as in the textbook, except that the area in the denominator is the pixel area. A brief derivation is here: https://github.com/mmp/pbrt-v3/issues/336#issuecomment-2391946322. (I tried to explain it to someone, thinking I understood it...) It is easy to understand: it is the pixels that measure, not the sensor. There is no measurement the whole sensor provides, so there is no importance for the sensor.
A test case where the pixel norm is clearly better:
To make pixel norm:
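(One way to do it, sketched against pbrt-v3's cameras/perspective.cpp and quoted from memory, so double-check the exact lines: the PerspectiveCamera constructor computes the z=1 image-plane area A that We() and Pdf_We() divide by; dividing A by the pixel count turns the sensor norm into the pixel norm.)

```cpp
// PerspectiveCamera constructor (pbrt-v3, cameras/perspective.cpp), approximately:
// compute image plane bounds at z = 1
Point2i res = film->fullResolution;
Point3f pMin = RasterToCamera(Point3f(0, 0, 0));
Point3f pMax = RasterToCamera(Point3f(res.x, res.y, 0));
pMin /= pMin.z;
pMax /= pMax.z;
A = std::abs((pMax.x - pMin.x) * (pMax.y - pMin.y));

// pixel norm: divide the whole-sensor area by the number of pixels, so that
// We()/Pdf_We() are normalized per pixel instead of per sensor
A /= (Float)res.x * (Float)res.y;
```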
Result, coneangle=30, spp=1, sensor-norm vs. pixel-norm:
How to reproduce
Some hard-coded numbers, for resolution 11x11:
size of the area that maps to 1 pixel: pxsz = tan(60*pi/180/2)/5.5 = 0.1050 (meters)
x coord we point the light at: xpos = pxsz*4 = 0.4199
x coord where the light is: pxsz*4*2 = 0.8398
distance to the illuminated point: dist = sqrt(xpos^2+1) = 1.0846
coneangle to cover that area (1 pixel): coneangle = pxsz/2*(1/dist)/dist*180/pi = 2.5565
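(A tiny standalone check of these numbers; the variable names are just the ones above, nothing from pbrt.)

```cpp
#include <cmath>
#include <cstdio>

int main() {
    const double pi = 3.14159265358979323846;
    // half-width of the view at distance 1 for fov = 60, spread over 5.5 pixels (11x11 film)
    double pxsz = std::tan(60.0 * pi / 180.0 / 2.0) / 5.5;            // 0.1050
    double xpos = pxsz * 4.0;                                          // 0.4199, point we aim the light at
    double xlight = pxsz * 4.0 * 2.0;                                  // 0.8398, light position
    double dist = std::sqrt(xpos * xpos + 1.0);                        // 1.0846
    double coneangle = pxsz / 2.0 * (1.0 / dist) / dist * 180.0 / pi;  // 2.5565
    std::printf("pxsz=%.4f xpos=%.4f xlight=%.4f dist=%.4f coneangle=%.4f\n",
                pxsz, xpos, xlight, dist, coneangle);
    return 0;
}
```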
Sampler "random" "integer pixelsamples" [1]
Integrator "bdpt" "integer maxdepth" [1]
LookAt 0 0 -1  0 0 0  0 1 0
Camera "perspective" "float fov" [60]
Film "image" "integer xresolution" [256] "integer yresolution" [256] "string filename" "wnorm.pfm"
WorldBegin
Material "matte" "color Kd" [.5 .5 .5] "float sigma" [0]
Shape "trianglemesh" "point P" [-1 -1 0 -1 1 0 1 1 0 1 -1 0] "integer indices" [ 0 1 2 2 3 0]
LightSource "spot" "point from" [0.8398 0 -1] "point to" [0.4199 0 0] "float coneangle" [30] "color I" [5 5 5]
WorldEnd