Bin-ze closed this issue 1 year ago
Thank you so much for your great work! I have read the paper, but there are some things I don't understand, I would like to ask:
- How is opacity 𝛼 predicted? For each 3D Gaussian, is it obtained by passing a learnable parameter through an activation function?
There is no prediction: we optimize alpha like every other property of the Gaussians. We simply run gradient descent over all the parameters of all the Gaussians, including alpha, such that the loss is minimized.
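To make "no network, just gradient descent on a parameter" concrete, here is a toy sketch (my own illustration, not the actual codebase): opacity is stored as an unconstrained scalar, a sigmoid keeps it in (0, 1), and plain gradient descent on a made-up loss pushes it toward a target value.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy example: `raw_alpha` is the learnable parameter; sigmoid(raw_alpha)
# is the opacity actually used. No network anywhere.
raw_alpha = 0.0        # sigmoid(0) = 0.5
target = 0.9           # pretend the image loss wants this opacity
lr = 1.0

for _ in range(500):
    alpha = sigmoid(raw_alpha)
    # L = (alpha - target)^2 ; chain rule through the sigmoid:
    grad = 2.0 * (alpha - target) * alpha * (1.0 - alpha)
    raw_alpha -= lr * grad     # plain gradient descent step

print(round(sigmoid(raw_alpha), 3))
```

In the real method the gradient comes from the rendering loss through the rasterizer, but the update rule for alpha is exactly this kind of step.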
- The abstract mentions that the method "preserve[s] desirable properties of continuous volumetric radiance fields for scene optimization". How should I understand this?
This is a rather abstract question
- Section 2.2 of the paper mentions: "The use of MVS-based geometry is a major drawback of most of these methods". What is MVS-based geometry? Is it a sparse point cloud?
MVS usually means any geometry produced after the dense correspondence and outlier clean-up steps of multi-view stereo; this is often a 3D mesh but could also be a dense point cloud. It is a drawback because it is often polluted by many errors.
- Section 2.3 of the paper mentions: "Our rasterization respects visibility order in contrast to their order-independent method." What does visibility order mean?
This just means that, because of alpha blending, order and visibility are respected. This is in contrast to order-independent transparency, which computes a weighted average using inverse depth as a weighting factor to give priority to the front-most points.
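A minimal sketch of what "respecting visibility order" looks like (my own toy example, with scalar colors and three samples already sorted front to back along one ray): each sample's contribution is attenuated by the transmittance left over from everything in front of it.

```python
import numpy as np

# Three overlapping samples on one ray, sorted front to back.
colors = np.array([1.0, 0.0, 0.5])   # scalar "colors" for simplicity
alphas = np.array([0.8, 0.5, 0.9])

# Visibility-ordered alpha blending: front samples occlude those behind.
T = 1.0          # transmittance accumulated so far
blended = 0.0
for c, a in zip(colors, alphas):
    blended += T * a * c
    T *= (1.0 - a)

# An order-independent scheme would instead take a weighted average of all
# samples (e.g. with inverse-depth weights), ignoring occlusion entirely.
print(round(blended, 3))
```

The mostly opaque front sample (alpha 0.8) dominates the result; reordering the samples would change the answer, which is exactly the point.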
- What is fast 𝛼-blending? I am new to computer graphics.
I'm not sure where this is referenced, but I would take a wild guess and say it is nothing too fancy, just a fast implementation of alpha blending.
- Why can the covariance be described by an R matrix and an S matrix?
I would suggest you read the paper more carefully on this point and think about it from a linear algebra perspective. Consider what a covariance matrix is and what properties it has, and convince yourself why R S S^T R^T is always a covariance matrix.
- Section 2.3 of the paper mentions: "computes the color 𝐶 of a pixel by blending N ordered points overlapping the pixel." How should I understand N? Can it be understood as the number of Gaussians overlapping along a ray through the pixel?
Yes
- A 3D Gaussian is used as the basic element, so what is the physical meaning of its output value? Can it be understood as a probability distribution describing the opacity centered on that point?
Yes, roughly, but not exactly, mostly because it won't integrate to 1: we skip the normalisation step with the determinant to allow big opaque Gaussians.
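To see the difference, here is a sketch (my own illustration) of an unnormalised Gaussian as used for splatting: dropping the 1/sqrt((2π)^k det(Σ)) factor means the value at the mean is exactly 1, so a separate opacity 𝛼 can scale it freely, but it is no longer a probability density.

```python
import numpy as np

def gaussian_unnormalised(x, mean, Sigma):
    # exp(-0.5 * (x - mean)^T Sigma^{-1} (x - mean)), with no
    # determinant normalisation: peaks at 1.0 regardless of Sigma.
    d = x - mean
    return np.exp(-0.5 * d @ np.linalg.solve(Sigma, d))

mean = np.zeros(3)
Sigma = np.diag([4.0, 1.0, 0.25])     # a big, anisotropic Gaussian
print(gaussian_unnormalised(mean, mean, Sigma))        # 1.0 at the centre
print(gaussian_unnormalised(np.ones(3), mean, Sigma))  # falls off away from it
```

A normalised density with a large Σ would have a tiny peak value; skipping the normalisation is what allows a big Gaussian to still be fully opaque.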
- Section 5.1 mentions: "Inevitably, geometry may be incorrectly placed due to the ambiguities of 3D to 2D projection." What do the ambiguities refer to here? I think 3D-to-2D projection is a deterministic process; does it refer to errors caused by floating-point precision, or to collinear 3D points?
You are right that 3D to 2D is deterministic; maybe this is not the best phrasing, but it is trying to say that recovering 3D geometry from 2D projections is ambiguous.
- Section 5.1 mentions: "An effective way to moderate the increase in the number of Gaussians is to set the 𝛼 value close to zero every N = 3000 iterations". 𝛼 means opacity, learned by the network; here it is set near zero every 3000 steps. Is this a direct re-initialisation, so that 𝛼 is relearned? I checked the code but can't understand why it does this.
First of all, not everything is a network :) Just because we do gradient descent doesn't mean there is a network somewhere. This is a rather unimportant trick that we saw sometimes helps in two ways. First, when we reset alpha, the optimisation will only increase the alpha of the Gaussians that are actually necessary, which gives us the opportunity to prune the Gaussians that stay at low alpha. Second, when floaters appear, images often get stuck in local minima because rays terminate early on the floaters, so the optimisation has no chance to see that behind them there is a perfectly reconstructed scene. Resetting all alpha values opens a small window of opportunity for the optimisation to converge to a better local minimum by removing the floaters.
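The reset-and-prune idea above can be sketched like this (hypothetical function names and toy values, not the real code): clamp every opacity to a small ceiling, let optimisation push back up only the alphas that actually reduce the loss, then prune whatever stayed low.

```python
import numpy as np

def reset_alpha(alphas, ceiling=0.01):
    # Periodic reset: every opacity is clamped to at most `ceiling`.
    return np.minimum(alphas, ceiling)

def prune_mask(alphas, threshold=0.05):
    # Gaussians whose alpha never recovered get pruned.
    return alphas > threshold

alphas = np.array([0.95, 0.02, 0.7, 0.001])  # toy opacities: two useful,
                                             # two likely floaters/noise
alphas = reset_alpha(alphas)                 # everything drops to <= 0.01

# ...optimisation would now re-raise alpha only where it lowers the loss;
# here we fake that step by restoring the two that were clearly needed:
alphas[[0, 2]] = [0.9, 0.6]

print(prune_mask(alphas))                    # keep 0 and 2, drop 1 and 3
```

The point of the toy is the asymmetry: useful Gaussians recover their opacity within a few iterations, while floaters that only survived by blocking the rays in front of the scene do not, and can then be removed.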
Sorry for asking so many questions, looking forward to your reply!
Best, George
Thank you very much for your wonderful reply. I still have some questions to ask:
Best, binze
Hth, Bernhard
Thank you for your reply.