Closed nishida-naoki closed 1 year ago
This seems to be a numerical stability issue. Perhaps small values have occurred in some places, leading to problems like division by zero.
Strategy One: Remove these points directly from the parquet file. Using pandas should make this straightforward. A more elegant method would be to eliminate these points in the parquet-saving logic. After all, these points affect the inference results whenever they appear, so we should always remove them.
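A minimal sketch of Strategy One, assuming the parquet file loads into a pandas DataFrame whose numeric columns hold the per-point values. The helper name drop_non_finite_rows and the output file name are hypothetical, not part of the repository:

```python
import numpy as np
import pandas as pd


def drop_non_finite_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without rows that hold NaN or +/-inf in any numeric column."""
    numeric = df.select_dtypes(include=[np.number])
    # np.isfinite is False for NaN, +inf and -inf alike.
    keep = np.isfinite(numeric.to_numpy(dtype=np.float64)).all(axis=1)
    return df.loc[keep].reset_index(drop=True)


# Hypothetical usage; the output file name is a placeholder:
# df = pd.read_parquet("refined.parquet")
# drop_non_finite_rows(df).to_parquet("refined_finite.parquet")
```

This only catches values that are already non-finite in the stored file. It does not catch points whose stored values are finite but overflow later during rasterisation.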
Strategy Two: Directly remove points with nan/inf values during training. The advantage of this approach is that it can mitigate the impact of these inf points by training. The current program handles the nan situation here: https://github.com/wanmeihuali/taichi_3d_gaussian_splatting/blob/f7631e327d3e6e995324f1755bc3da25d603a584/taichi_3d_gaussian_splatting/GaussianPointAdaptiveController.py#L205, where it periodically removes points with a nan alpha value. However, it seems I may have forgotten to consider the inf scenario. Could you try using torch.isinf and employ a method similar to the code referenced above to remove points with inf values?
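As a rough illustration of Strategy Two, the check below builds a per-point mask with torch.isfinite, which covers both the nan and the inf case in one call. The (N, K) tensor layout and the name finite_point_mask are assumptions for this sketch; how the controller actually marks points for removal is in the code linked above and is not reproduced here:

```python
import torch


def finite_point_mask(point_features: torch.Tensor) -> torch.Tensor:
    """True for points whose feature row contains neither NaN nor +/-inf.

    point_features is assumed to have shape (N, K), one row per point.
    """
    # torch.isfinite is equivalent to ~(torch.isnan(x) | torch.isinf(x)).
    return torch.isfinite(point_features).all(dim=1)


features = torch.tensor([[0.1, 0.2], [float("inf"), 0.3], [0.4, float("nan")]])
mask = finite_point_mask(features)
kept = features[mask]  # only the first row survives
```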
Strategy Three: Locate areas where division by zero might be occurring. The calculation for alpha grad is here: https://github.com/wanmeihuali/taichi_3d_gaussian_splatting/blob/f7631e327d3e6e995324f1755bc3da25d603a584/taichi_3d_gaussian_splatting/GaussianPointCloudRasterisation.py#L605. I just revisited it and couldn't immediately identify places where division by zero might happen. The adaptive controller doesn't modify alpha either. We may need to do breakpoint debugging to locate this bug. I'll try to look into it tomorrow or when I have time.
Thank you for your quick reply!
I tried Strategy One, but it did not work well. I eliminated suspect rows (e.g. rows containing inf, nan, or values that were too small or too large), but the artifacts remained. I found that features in the normal range are one source of these artifacts, while features with nan or inf can also cause the same kind of artifacts.
I am now trying Strategy Two as you suggested. My workaround is as follows:
It seems that simply adding torch.isinf does not work well, because features in the normal range can also cause artifacts.
The exact line that makes alpha values inf seems to be the following line. The reason for the inf is the argument of ti.exp(), which is determined by the combination of two variables, conic and xy_mean. Neither of them alone seems to be a reliable indicator of the artifacts (e.g. setting a threshold on conic did not work well).
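To illustrate why neither variable is a sufficient indicator on its own, here is a small sketch. It assumes the exponent has the usual Gaussian splatting form -0.5 * d^T C d, with d the pixel offset from xy_mean and C the symmetric 2x2 conic. That form is an assumption and is not copied from the repository; the function names are hypothetical:

```python
import numpy as np


def exp_argument(pixel_xy, xy_mean, conic):
    """-0.5 * d^T C d for d = pixel_xy - xy_mean and C = [[a, b], [b, c]]."""
    dx = pixel_xy[0] - xy_mean[0]
    dy = pixel_xy[1] - xy_mean[1]
    a, b, c = conic
    return -0.5 * (a * dx * dx + c * dy * dy) - b * dx * dy


def overflows_float32(argument):
    """exp() in float32 becomes inf once its argument exceeds log(float32 max), about 88.7."""
    return argument > np.log(np.finfo(np.float32).max)


# A conic that is not positive semi-definite gives a positive exponent.
bad_conic = (-0.01, 0.0, -0.01)
near = exp_argument((10.0, 10.0), (0.0, 0.0), bad_conic)    # about 1, exp is finite
far = exp_argument((100.0, 100.0), (0.0, 0.0), bad_conic)   # about 100, exp overflows
```

Under this assumption, the same conic is harmless near xy_mean and overflows far from it, so a threshold on conic alone would have to be very conservative. Checking the exponent itself (or the resulting alpha) is the more direct test.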
Hi @nishida-naoki, how did you set the threshold on conic? I just noticed that division by zero can happen when computing the conic: https://github.com/wanmeihuali/taichi_3d_gaussian_splatting/blob/f7631e327d3e6e995324f1755bc3da25d603a584/taichi_3d_gaussian_splatting/utils.py#L255 Have you tried to prevent det_cov from being zero? e.g. if (ti.math.abs(det_cov) < 1e-5) det_cov = ti.math.sign(det_cov) * 1e-5.
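The clamp suggested above could look like the following plain-Python sketch (not Taichi kernel code; the function name and the eps default are illustrative). One detail to watch: if the sign function returns 0 for an input of exactly 0, the product with 1e-5 is still 0, so this sketch uses math.copysign instead:

```python
import math


def clamp_det_away_from_zero(det_cov: float, eps: float = 1e-5) -> float:
    """Push a near-zero determinant out to +/-eps while preserving its sign."""
    if abs(det_cov) >= eps:
        return det_cov
    # math.copysign treats +0.0 as positive, so an exact zero becomes +eps
    # rather than staying at zero.
    return math.copysign(eps, det_cov)
```

For example, clamp_det_away_from_zero(0.0) gives 1e-5 and clamp_det_away_from_zero(-1e-9) gives -1e-5, while values with magnitude at or above eps pass through unchanged.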
@wanmeihuali I encountered the same issue. It actually happens quite frequently in the few examples I tried. I tried to limit det_cov, but it didn't work for me.
@nishida-naoki Do you have a fix for your example?
@jb-ye Hi, for better visualization, inserting the following two lines here was enough for me:
+ if abs(gaussian_alpha) >= np.inf:
+ continue
But filtering out invalid gaussians during training does not seem so simple, because abs(gaussian_alpha) >= np.inf does not always mean that inv_cov or other intermediate variables are invalid.
We can close this issue, as the root cause is fixed in the latest PR, #153.
Thanks a lot for your help! @jb-ye @wanmeihuali
@wanmeihuali
I found tile-like artifacts when visualizing some parquet files, as shown in the attached image.
After a short inspection, I found that gaussian_alpha in the following lines takes the value inf at these pixels, which causes a whole tile to be filled with a single color. https://github.com/wanmeihuali/taichi_3d_gaussian_splatting/blob/main/taichi_3d_gaussian_splatting/GaussianPointCloudRasterisation.py#L403-L407
Do you have any idea about this issue, or any suggestions for further inspection?
To reproduce the issue, please download the attached parquet file and run
python3 visualizer.py --parquet_path_list refined.parquet
By the way, thank you for sharing your great project! 😄
parquet.zip