Closed wbstx closed 2 months ago
This line is actually a bug that needs to be addressed. Interestingly, it causes the third scaling factor to become -inf, so its gradient is undefined (None). As a result, the third value never changes during optimization, and the scale after activation stays zero.
Thanks for your quick reply! I just realized that the third component also stays 0 because of the exp activation: exp(-inf) evaluates to 0 even though log(0) is undefined. It is interesting to see things work out in this unexpected way.
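The dynamics described above can be sketched in plain Python (torch's log/exp follow the same IEEE float semantics, so log(0) gives -inf and exp(-inf) gives 0; the gradient is written out analytically here as d/ds exp(s) = exp(s)):

```python
import math

s = float('-inf')        # third scale parameter after initializing with log(0)
activated = math.exp(s)  # exp(-inf) == 0.0: the activated scale is exactly zero
grad = math.exp(s)       # d/ds exp(s) = exp(s), so the gradient at s = -inf is 0.0

assert activated == 0.0
assert grad == 0.0       # zero gradient: the optimizer never moves s, so the scale stays 0
```

So even without a defined gradient path back through log(0), the exp activation pins both the value and the gradient of the third component at zero.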
Thanks for your work! I have some questions about how the third component of the scale is set to zero through optimization. I checked the code and found that you add a line for scale initialization in gaussian_model.py L135:
scales[:, 2] = inverse_sigmoid(torch.tensor(0))
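For reference, assuming inverse_sigmoid uses the definition common in gaussian_model.py implementations, log(x / (1 - x)), evaluating it at 0 yields -inf (torch.log(torch.tensor(0.)) returns -inf rather than raising). A minimal stdlib sketch of that assumed helper:

```python
import math

def inverse_sigmoid(x):
    # Assumed definition of the helper: inverse_sigmoid(x) = log(x / (1 - x)).
    # Python's math.log(0) raises, whereas torch.log follows IEEE semantics
    # and returns -inf, so the x == 0 case is spelled out explicitly here.
    if x == 0.0:
        return float('-inf')
    return math.log(x / (1.0 - x))

assert inverse_sigmoid(0.0) == float('-inf')   # the case hit by scales[:, 2]
assert abs(inverse_sigmoid(0.5)) < 1e-12       # sanity check: logit(0.5) == 0
```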
I have two questions:
I would really appreciate your reply. Thanks!