Closed JIANG-CX closed 1 week ago
Regarding your first question, the overlapping region implies that for one object, we might only have 3 images around it in one dataset, while in another dataset, we might have 300 images of the same object. Although the total sum of edges differs between the two datasets, the distribution should be the same. In our implementation, we divide by the number of pixels to calculate the average complexity. Another approach, however, could be to count the number of overlapping regions.
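The per-pixel normalization can be illustrated with a minimal sketch. This is not the repository's code: the function names and the toy binary edge maps are hypothetical, and it only assumes that complexity is an edge sum divided by a pixel count.

```python
def edge_sum(edge_maps):
    """Total edge response over every image in a dataset (hypothetical helper)."""
    return sum(sum(sum(row) for row in image) for image in edge_maps)


def average_complexity(edge_maps):
    """Edge sum divided by the total number of pixels (hypothetical helper)."""
    pixels = sum(sum(len(row) for row in image) for image in edge_maps)
    return edge_sum(edge_maps) / pixels if pixels else 0.0


# Toy 2x2 edge map standing in for one view of the object (assumed data).
view = [[0, 1],
        [1, 0]]

sparse_dataset = [view] * 3    # 3 images around the object
dense_dataset = [view] * 300   # 300 images of the same object

# The raw sums differ, but the per-pixel average does not.
print(edge_sum(sparse_dataset), edge_sum(dense_dataset))
print(average_complexity(sparse_dataset), average_complexity(dense_dataset))
```

Under these assumptions, the number of images covering an object changes the raw edge sum but not the normalized value.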
Your question is crucial. Our method does not address conflicts between labels. Since training rasterizes one image at a time, no issue arises within a single iteration. However, we cannot resolve inconsistencies between labels across different images. Whereas training different labels separately in different Gaussian Splats could result in an all-black scene, our method is more tolerant: some splats might receive different optimization goals in different iterations, but the training process can still proceed smoothly.
k1 corresponds to the shorter axis, a1, while k2 is for the longer axis, a2.
p_i represents perplexity. We know the perplexity and aim to determine a1 and a2. Since k1 is larger, a1 is the shorter axis, and a2 is the longer axis.
Thanks for your reply.
Since the definitions of sx, sy, and sz are the scale factors along the x, y, and z axes respectively, the x, y, and z coordinates do not have a direct relationship with the longer and shorter axes. So it is unclear why the parameter k1 corresponds to the shorter axis and k2 corresponds to the longer axis.
Additionally, I found the symbol definitions to be somewhat confusing in the paper:
(1) There are three different 'p' variables used: one in Eq. 2 and two in Eq. 3. In your previous reply, you stated that 'p' represents the perplexity, but could you please provide the precise definition of each of these three 'p' variables? I assume they have different meanings, since you use different symbols.
(2) In Eq. 4, what is the definition of the symbol σ?
Thanks.
Regarding your point that "x, y, and z coordinates do not have a direct relationship with the longer and shorter axes": we did consider this problem. In the implementation, however, we found that we can simply treat x as the shortest axis, y as the middle axis, and z as the longest axis. The x, y, and z scales do not represent orientation; the quaternion does. Assuming that the x, y, and z axes are ranked in this sequence therefore does not make our method lose generality.
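For readers who would rather not rely on that convention, one hypothetical alternative is to rank the three scale factors explicitly. The sketch below is not what the implementation described above does; the function name `ranked_axes` and the sample scale values are assumptions made for illustration.

```python
def ranked_axes(scale):
    """Return (shortest, middle, longest) from a (sx, sy, sz) scale triple.

    Hypothetical helper: the ranking depends only on the magnitudes,
    not on which axis label each value came from.
    """
    sx, sy, sz = scale
    shortest, middle, longest = sorted((sx, sy, sz))
    return shortest, middle, longest


# The same ellipsoid described with two different axis labelings (assumed values).
print(ranked_axes((0.3, 0.1, 0.2)))
print(ranked_axes((0.1, 0.2, 0.3)))
```

Both calls give the same ranking, which matches the point that the axis labels themselves carry no orientation information.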
Regarding the "'p' variables used": the different p symbols do have different meanings. The capital P in Eq. 2 is the overall perplexity, and the lowercase p in Eq. 3 is the unit perplexity p_j. Eq. 3 currently contains two different p symbols, but that is a typo, and we will fix it soon.
For your last question, σ denotes the sigmoid function. We use it to bound the loss.
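As a minimal sketch of what bounding with a sigmoid does: whatever the raw value is, the output stays between 0 and 1. The variable name `raw_loss` and the sample inputs are hypothetical; the exact term that σ wraps in Eq. 4 is not reproduced here.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written so that large |x| does not overflow math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Hypothetical unbounded raw values; the sigmoid squashes each one into [0, 1].
for raw_loss in (-1000.0, -10.0, 0.0, 10.0, 1000.0):
    print(raw_loss, sigmoid(raw_loss))
```

Mathematically the sigmoid lies strictly inside (0, 1); in floating point, extreme inputs saturate to exactly 0.0 or 1.0.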
Thanks for your reply. I have no further questions. Great work!
I have some questions about the details in this paper:
Thanks.