kcyt / SPIFu

Official Implementation of SPIFu (NeurIPS 2022 )

Which one corresponds to the coordinates information? #5

Closed: Charlulote closed this issue 1 year ago

Charlulote commented 1 year ago

In lines 468-470 of Thuman_dataset.py, the feature array seems to be built from the constant self.coordinate_matrix, which is initialized in the constructor as a grid of rays. So which one corresponds to the coordinate information mentioned in the paper?

# (__init__)
if self.opt.use_unrolled_smpl:
    # compute coordinate_matrix

    XY, step_size = np.linspace(-1,1, self.opt.output_resolution_smpl_map+1, endpoint=True, retstep=True) # XY has shape of (self.opt.output_resolution_smpl_map+1,) 
    XY = XY + 0.5*step_size
    XY = XY[:-1] # remove last element. Shape of (self.opt.output_resolution_smpl_map,) 

    XY = np.meshgrid(XY,XY) # A list of 2 coordinate matrices
    XY = np.array(XY).T  # Shape of [self.opt.output_resolution_smpl_map, self.opt.output_resolution_smpl_map, 2], where the '2' is y coordinate first then x.

    # Shape of [self.opt.output_resolution_smpl_map, self.opt.output_resolution_smpl_map, NUM_OF_FACES , 2]
    self.coordinate_matrix = np.repeat(XY[:,:,None,:], repeats=NUM_OF_FACES , axis=2 )
...
# (__getitem__)
features_array = None
if self.opt.use_coordinates_smpl:
    features_array = np.concatenate([self.coordinate_matrix, smpl_depth_values], axis=3)

kcyt commented 1 year ago

Because we are working with coordinates in 3D camera space, the x and y coordinates at each pixel location are the same for every input RGB image. For example, if we define the top-left corner of an image as (-1, -1), then the top-left corner is (-1, -1) in all images, and the same holds for every other location on the image. The z coordinate, however, differs from image to image, because the orientation of the SMPL-X mesh corresponding to each image is different. For example, the z coordinate at the bottom-right corner could be 0.5 for image 1 but 0.8 for image 2.
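
To make the point above concrete, here is a minimal sketch, not the repository's actual code path. It rebuilds the x/y grid the same way as the constructor snippet quoted earlier and concatenates it with per-image depth values. The small RES and NUM_OF_FACES values, the helper name build_coordinate_matrix, the random depth arrays, and the assumed depth shape of (RES, RES, NUM_OF_FACES, 1) are all illustrative assumptions.

```python
import numpy as np

RES = 8            # hypothetical stand-in for opt.output_resolution_smpl_map
NUM_OF_FACES = 4   # hypothetical small face count, for illustration only


def build_coordinate_matrix(res, num_faces):
    """Rebuild the x/y pixel-centre grid as in the quoted constructor code."""
    xy, step = np.linspace(-1, 1, res + 1, endpoint=True, retstep=True)
    xy = (xy + 0.5 * step)[:-1]              # pixel centres, shape (res,)
    grid = np.array(np.meshgrid(xy, xy)).T   # shape (res, res, 2)
    # Repeat the same grid for every face: (res, res, num_faces, 2)
    return np.repeat(grid[:, :, None, :], repeats=num_faces, axis=2)


# The x/y part depends only on the resolution, so it is shared by all images.
coords = build_coordinate_matrix(RES, NUM_OF_FACES)

# Random placeholders for smpl_depth_values of two different images
# (assumed shape: (RES, RES, NUM_OF_FACES, 1)).
rng = np.random.default_rng(0)
depth_img1 = rng.uniform(-1, 1, size=(RES, RES, NUM_OF_FACES, 1))
depth_img2 = rng.uniform(-1, 1, size=(RES, RES, NUM_OF_FACES, 1))

# Same concatenation as in the quoted __getitem__ snippet.
feat1 = np.concatenate([coords, depth_img1], axis=3)
feat2 = np.concatenate([coords, depth_img2], axis=3)

same_xy = np.array_equal(feat1[..., :2], feat2[..., :2])   # x/y channels
same_z = np.array_equal(feat1[..., 2], feat2[..., 2])      # z channel
print(feat1.shape, same_xy, same_z)
```

In this sketch the first two channels of the feature array are identical across images, and only the third (depth) channel changes, which is the behaviour described in the reply above.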

Charlulote commented 1 year ago

Well, this leaves me puzzled, as your explanation doesn't seem to align with my understanding of the paper's description. However, it may be unnecessary to delve deeper into this issue. Anyway, thanks.