devernay opened this issue 1 year ago
For reference, here are two pytorch implementations of linear_to_srgb:
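The referenced implementations are not quoted here; as a sketch, the standard sRGB transfer function and its inverse (per IEC 61966-2-1) look like the following in NumPy (a PyTorch version is essentially identical with `np` replaced by `torch`; inputs are assumed to be in [0, 1]):

```python
import numpy as np

def linear_to_srgb(x):
    """Standard sRGB opto-electronic transfer function (IEC 61966-2-1).

    Linear below 0.0031308, a 1/2.4 power law above; inputs in [0, 1].
    """
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308,
                    12.92 * x,
                    1.055 * x ** (1 / 2.4) - 0.055)

def srgb_to_linear(y):
    """Inverse of linear_to_srgb, for sRGB-encoded values in [0, 1]."""
    y = np.clip(y, 0.0, 1.0)
    return np.where(y <= 0.04045,
                    y / 12.92,
                    ((y + 0.055) / 1.055) ** 2.4)
```

Both branches are differentiable where they apply, which is what makes it usable inside the rendering loss.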
Have you observed a noticeable difference when using `linear_to_srgb`?
Here are some initial tests:
Here are some rendered videos. The quality looks fairly similar to me.
Main: https://user-images.githubusercontent.com/3310961/203196445-cbac6ec9-c3ea-4fe6-ace4-6e66ac1028b1.mp4
With change: https://user-images.githubusercontent.com/3310961/203196458-d2abf459-3ba2-4020-9e77-e10e9d8770b5.mp4
Yes, I made the same change in the original (vanilla) NeRF code a long time ago, and it didn't change anything either. My guess is that it is compensated by the density field.
In the volume rendering equation, `color_linear * density` is very close to `color_srgb * density_srgb`, where `density_srgb = density * norm(color_linear) / norm(color_srgb)`, especially if the color is grayish and there is not too much view-dependency.
So the end result looks the same and has about the same PSNR, although the physics are wrong and the density field may be slightly off (at least for values other than 0 or 1). That is not a big deal for novel-view rendering, but if the density is wrong, you may be able to see a difference at the mesh extraction stage.
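To make the compensation argument concrete, here is a small numeric sketch (the grayish color and density are hypothetical values, and `linear_to_srgb` is the standard transfer function, not code from this repo):

```python
import numpy as np

def linear_to_srgb(x):
    # Standard sRGB transfer function (IEC 61966-2-1), inputs in [0, 1].
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * x ** (1 / 2.4) - 0.055)

c_lin = np.array([0.50, 0.45, 0.55])   # hypothetical grayish linear color
c_srgb = linear_to_srgb(c_lin)
density = 2.0                          # hypothetical density value

# Rescale the density by the ratio of color norms, as described above.
density_srgb = density * np.linalg.norm(c_lin) / np.linalg.norm(c_srgb)

# The two products agree to within a few percent:
print(c_lin * density)        # ~ [1.000, 0.900, 1.100]
print(c_srgb * density_srgb)  # ~ [1.004, 0.957, 1.047]
```

The agreement is exact for a perfectly gray color and degrades as the color becomes more saturated or view-dependent, which matches the caveat above.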
They made the adjustment in multinerf, probably because it mattered for Raw-NeRF. The code was there in mipnerf too, but was unused.
I'm not sure if I made my point, maybe my explanation is a bit confusing, but TL;DR:
This is a very common problem in compositing (e.g. Photoshop, VFX, etc.), and volume rendering is just another form of compositing.
We should have a blender test set with real semi-transparency to demonstrate these effects.
Figure below is from https://web.archive.org/web/20221029074439/https://blog.johnnovak.net/2016/09/21/what-every-coder-should-know-about-gamma/#colour-blending
**Describe the bug**
Volume rendering (just like any compositing operation) should be done in linear color space, so one can assume that the RGB representation in NeRF should be linear.
Edit: same holds for splatfacto, compositing should be done in linear space.
However, the loss is expressed on sRGB images, which are gamma-compressed.
The original NeRF code didn't care about that detail, but you can see that multinerf now properly handles this (it was necessary for the level of detail that RawNeRF and mip-NeRF are able to handle): https://github.com/google-research/multinerf/blob/30005650e9e6a1d8a0f561aa848ea65d855fc787/internal/models.py#L599
Edit: Zip-NeRF has that too.
The linear_to_srgb function is differentiable, and should be added at the end of RGBRenderer.forward(), around here: https://github.com/nerfstudio-project/nerfstudio/blob/ca69adaaec02f347f650804371623198d30af562/nerfstudio/model_components/renderers.py#L113
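A minimal sketch of what that would look like, as a toy NumPy stand-in (the function names and shapes here are illustrative assumptions, not nerfstudio's actual `RGBRenderer` API):

```python
import numpy as np

def linear_to_srgb(x):
    # Standard sRGB transfer function (IEC 61966-2-1), inputs in [0, 1].
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308, 12.92 * x, 1.055 * x ** (1 / 2.4) - 0.055)

def composite_rgb(rgbs, weights):
    """Toy stand-in for an RGB renderer's forward pass.

    rgbs:    (..., num_samples, 3) per-sample LINEAR colors
    weights: (..., num_samples)    volume rendering weights
    """
    # Composite in linear space, as the physics requires...
    rgb_linear = (weights[..., None] * rgbs).sum(axis=-2)
    # ...then gamma-encode once, at the very end, to match sRGB targets.
    return linear_to_srgb(rgb_linear)
```

The key point is that the conversion happens after the weighted sum over samples, so the compositing itself stays linear.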
This should improve details, especially near object boundaries with Mip-NeRF: a pixel covered 50% by white and 50% by black should not have a value of 127, but 255 * 0.5^(1/2.2) ≈ 186.
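The 50%-coverage figure can be checked in a couple of lines (using the simple gamma-2.2 approximation from the comment above, not the exact piecewise sRGB curve):

```python
# A pixel half covered by white (1.0) and half by black (0.0):
# average in LINEAR light, then gamma-encode for display.
linear_average = 0.5 * 1.0 + 0.5 * 0.0       # = 0.5
encoded = 255 * linear_average ** (1 / 2.2)  # gamma-2.2 approximation
print(round(encoded))                        # 186, not the naive 127
```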