facebookresearch / DensePose

A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
http://densepose.org
Other
6.95k stars 1.3k forks source link

Why write the code so complicated when getting pairwise geodesic distances #202

Closed Johnqczhang closed 5 years ago

Johnqczhang commented 5 years ago

I'm reading the code about evaluating estimated body uv with ground-truth. I found it very difficult to understand the code of getting pairwise geodesic distances between ground-truth and estimated mesh points ($DENSEPOSE/detectron/datasets/densepose_cocoeval.py: getDistance()) without any comment. Can somebody tell me why the geodesic distances are computed in this way?

def getDistances(self, cVertsGT, cVerts):    
    ClosestVertsTransformed = self.PDIST_transform[cVerts.astype(int)-1]
    ClosestVertsGTTransformed = self.PDIST_transform[cVertsGT.astype(int)-1]
    #
    ClosestVertsTransformed[cVerts<0] = 0
    ClosestVertsGTTransformed[cVertsGT<0] = 0
    #
    cVertsGT = ClosestVertsGTTransformed
    cVerts = ClosestVertsTransformed
    #
    n = 27554
    dists = []
    for d in range(len(cVertsGT)):
        if cVertsGT[d] > 0:
            if cVerts[d] > 0:
                i = cVertsGT[d] - 1
                j = cVerts[d] - 1
                if j == i:
                    dists.append(0)
                elif j > i:
                    ### this piece of codes could have been written better! ###
                    ccc = i
                    i = j
                    j = ccc
                    i = n-i-1
                    j = n-j-1
                    k = (n*(n-1)/2) - (n-i)*((n-i)-1)/2 + j - i - 1
                    k =  ( n*n - n )/2 -k -1
                    dists.append(self.Pdist_matrix[int(k)][0])
                else:
                    i= n-i-1
                    j= n-j-1
                    k = (n*(n-1)/2) - (n-i)*((n-i)-1)/2 + j - i - 1
                    k =  ( n*n - n )/2 -k -1
                    dists.append(self.Pdist_matrix[int(k)][0])
            else:
                dists.append(np.inf)
    return np.array(dists).squeeze()
vkhalidov commented 5 years ago

@Johnqczhang this is a manual implementation of numpy.triu_indices - a compact representation of symmetric matrices. Used to cache and fetch precomputed geodesic distances

Johnqczhang commented 5 years ago

@vkhalidov Thanks for your quick reply! I found the precomputed geodesic distances are saved in the PDIST_transform (am I right?) which should be a symmetric matrix, however, it is exactly a very long vector because its shape is (379597681, 1). So, I still have several questions which I hope you can give more explanations:

  1. Why does it need to take several steps to find the exact index k (i.e., k = (n*(n-1)/2) - (n-i)*((n-i)-1)/2 + j - i - 1; k = ( n*n - n )/2 -k -1);
  2. What's the meaning of variable i and j (is i represents the closest vertex index of gt_point or the geodesic distance between gt_point and its closest vertex? but why need to -1?), why should i swap with j if j > i;
  3. What's the meaning of variable n (I found 27554 is the largest value in PDIST_transform which the smallest value is 1);
vkhalidov commented 5 years ago

@Johnqczhang Precomputed geodesic distances between n points are stored using compact representation: for i<j the index k at which the distance d_ij is stored, is given by the formula you cite. Since d_ii=0 and d_ij = d_ji, this compact representation is sufficient to restore the whole matrix. As you mentioned, n=27554, so the number of elements in the lower / upper triangle of the matrix is exactly 379597681.

As for the formula, it can be simplified to k = offset(i, j) = i - j + j * (j + 1) / 2, where i<j are 0-based vertex indices.

To better understand the enumeration strategy of matrix elements, I suggest you to output k for all i and j, as given by the formula.

Johnqczhang commented 5 years ago

@vkhalidov Now I have a much clearer understanding, really appreciate your help!

First, it was my mistake that I wrongly viewed PDIST_transform as the matrix (which in fact should be Pdist_matrix) in my previous reply. PDIST_transform saves indices for getting values from the matrix.

Then, n=27554 is the number of points from which the geodesic distance between each two points are precomputed and stored in Pdist_matrix, so n is also the dimension of the matrix (of size n x n). The number of elements in the lower/upper triangle of the matrix is n * (n + 1) / 2 = 379625235. However, because there are n elements on the diagonal which are all zeros, so 379597681 non-zero values are finally stored in the compact representation of the matrix.

From the implementation of this function, it looks like Pdist_matrix stores the lower triangle of the matrix except the diagonal elements which are all zeros. The order of elements stored in Pdist_matrix is [d_10, d_20, d_21, d_30, d_31, d_32, ..., d_(n-1)(n-2)]. Therefore, for the distance d_ij (i, j are 0-based index):

One last question, it looks like variables cVerts and cVertsGT record indices from PDIST_transform which is 1-based, which is why it needs to minus 1 when gets i and j (same reason for the first 2 lines of this function). cVerts and cVertsGT come from the previous function findAllClosestVerts, which finds closest vertex indices from Part_ClosestVertInds for points in each body part. So, can you tell me what's the difference between the indices stored in PDIST_transform and the indices stored in Part_ClosestVertInds? BTW, I think it's better to rename these two variables as cVertInds and cVertIndsGT.