facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

Odd Behavior during Optimize Via Depth Map #1699

Closed bfialkoff closed 9 months ago

bfialkoff commented 9 months ago

I don't know if this qualifies as an issue. I am trying to optimize the parameters that define a mesh by minimizing the depth map loss.

Similar to this issue.

Basically: $\underset{\text{weights}}{\arg\min}\;\lVert \text{gt\_depth\_map} - \text{render\_depth}(\text{get\_mesh}(\text{weights})) \rVert$. The idea is that I can express a new mesh as a linear combination of PCA components that accurately describe my shape space (like an extremely simple version of SMPL) and compare its rendered depth to a ground-truth depth map. I note this because the mesh is linear in the parameters, so there shouldn't be any issues with differentiation. For some reason I can't figure out, the loss grows and grows, and the optimization completely fails.
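Written out, with $\bar{V}$ the (flattened) vertices of the average mesh and $B$ the matrix of stacked PCA components:

$$V(w) = \bar{V} + wB, \qquad \frac{\partial V}{\partial w} = B,$$

so the vertices are an affine function of the weights and the Jacobian is the constant basis matrix. Here is the relevant code: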

def get_mesh(self, weights: torch.Tensor, scale=1.):
    weights = torch.atleast_2d(weights)
    num_meshes = weights.shape[0]
    shape_space = self.vector_space_basis.type(weights.dtype)
    assert shape_space.shape[0] == weights.shape[1]

    # Linear combination of the PCA components, reshaped to (N, V, 3) vertex offsets.
    displacement_vertices = torch.mm(weights, shape_space).view(num_meshes, -1, 3)
    new_vertices = self.average_mesh_torch.verts_packed().unsqueeze(0) + displacement_vertices
    out_mesh = Meshes(
        verts=new_vertices * scale,
        faces=torch.cat(num_meshes * [self.average_mesh_torch.faces_packed().unsqueeze(0)]),
    )
    return out_mesh
import numpy as np
import torch
from tqdm import tqdm
from pytorch3d.structures import Meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras,
    MeshRasterizer,
    MeshRenderer,
    RasterizationSettings,
    look_at_view_transform,
)
from pytorch3d.renderer.mesh.shader import SoftDepthShader

R, T = look_at_view_transform(3, 0, 0)
cameras = FoVPerspectiveCameras(R=R, T=T)
sigma = 1e-4
raster_settings = RasterizationSettings(
    image_size=64,
    # Standard soft-rasterization blur radius from the PyTorch3D tutorials.
    blur_radius=np.log(1. / 1e-4 - 1.) * sigma,
    faces_per_pixel=25,
)
rasterizer = MeshRasterizer(cameras=cameras, raster_settings=raster_settings)

renderer_depth = MeshRenderer(
    rasterizer=rasterizer,
    shader=SoftDepthShader(cameras=cameras))
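As a sanity check on the renderer itself (a sketch; this assumes `gt_depth_map` was produced by this same renderer from the ground-truth weights `gt_weights` and `dummy_scale`, which appear below):

with torch.no_grad():
    gt_mesh = get_mesh(gt_weights, scale=dummy_scale)
    rerendered = renderer_depth(gt_mesh)
    print((rerendered - gt_depth_map).abs().max())  # expect something near 0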
weights = torch.full_like(gt_weights, fill_value=0.5, device=device, requires_grad=True)
optimizer = torch.optim.SGD([weights], lr=0.5, momentum=0.9)
loss_fun = torch.nn.SmoothL1Loss()
loop = tqdm(range(500))
for i in loop:
    optimizer.zero_grad()
    pred_mesh = get_mesh(weights, scale=dummy_scale)
    pred_depth_map = renderer_depth(pred_mesh)
    loss = loss_fun(gt_depth_map, pred_depth_map)
    loop.set_description('total_loss = {:.6f}, {}'.format(loss, weights))
    loss.backward()
    optimizer.step()

# gt_weights = [-2.0795, -1.5183, -0.3542, -0.5209, -0.3234]
# 0/500 total_loss = 2.578646,  [0., 0., 0., 0., 0.]
# 1/500 total_loss = 59.460693, [ -59.9297,  -57.5586,   56.3358,   81.5205, -232.6749]
# 2/500 total_loss = 58.091270, [-113.8686, -109.3969,  107.0190,  154.9005, -442.0728]
# 3/500 total_loss = 56.877048, [-162.4156, -156.0623,  152.6280,  220.9357, -630.5349]
# 4/500 total_loss = 55.864033, [-206.1097, -198.0710,  193.6692,  280.3570, -800.1563]
# 5/500 total_loss = 54.976013, [-245.4359, -235.8885,  230.5983,  333.8248, -952.8216]
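Looking at that log, the very first update already moves the weights by tens to hundreds of units, so the raw gradients must be enormous relative to lr=0.5 with momentum 0.9. One way to see this directly is to log the gradient norm inside the loop (a minimal sketch; the clipping line is one possible mitigation, not part of the run above):

    loss.backward()
    print(i, loss.item(), weights.grad.norm().item())         # raw gradient scale per step
    torch.nn.utils.clip_grad_norm_([weights], max_norm=1.0)   # optional: bound the step size
    optimizer.step()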

I tried all sorts of things like reducing/increasing the learning rate, changing optimizers, and scaling the weights. Nothing works; no matter what I do, the loss increases.

After confirming that there aren't any obvious bugs in the code, that the renderers are configured correctly, and that my depth maps look normal, I ran two tests to check whether this methodology can work at all.

First, I set up a similar (albeit not equivalent) problem: a function that receives (length, width, height) and uses them to define a pyramid, and I used the same approach to find the pyramid parameters whose depth map minimizes the loss. This worked like a charm.

def create_pyramid(params, textures=None):
    params = params.float()

    # Template pyramid scaled per-axis by (length, width, height).
    # No requires_grad needed on the template: gradients flow through `params`.
    vertices = params * torch.tensor([
        [-1, -1, 0],
        [1, -1, 0],
        [1, 1, 0],
        [-1, 1, 0],
        [0, 0, 1],
    ], dtype=params.dtype)

    faces = torch.tensor([
        [0, 1, 4],  # side
        [1, 2, 4],  # side
        [2, 3, 4],  # side
        [3, 0, 4],  # side
        [0, 1, 2],  # base
        [0, 2, 3],  # base
    ], dtype=torch.int64)
    pyramid_mesh = Meshes(verts=[vertices], faces=[faces], textures=textures)
    return pyramid_mesh

weights = torch.full((3,), fill_value=1., device=device, requires_grad=True)
optimizer = torch.optim.SGD([weights], lr=.05, momentum=0.9)
loss_fun = torch.nn.SmoothL1Loss()
loop = tqdm(range(5000))
for i in loop:
    optimizer.zero_grad()
    pred_mesh = create_pyramid(weights, textures=gt_mesh.textures)
    pred_depth_map = renderer_depth(pred_mesh)
    loss = loss_fun(gt_depth_map, pred_depth_map)
    loop.set_description('total_loss = %.6f' % loss)
    loss.backward()
    optimizer.step()

The other thing I did was solve the original problem with scipy's least_squares. I expected this to fail completely, but although it's not a perfect solution, it surprisingly far outperforms the PyTorch optimization above.

from scipy.optimize import least_squares

def cost(weights, gt_depth_map, renderer, scale):
    # least_squares passes `weights` as a numpy array; convert before building the mesh.
    pred_mesh = get_mesh(torch.from_numpy(weights).float(), scale=scale)
    pred_depth_map = renderer(pred_mesh).numpy()[0, ..., 0]
    return np.abs(pred_depth_map - gt_depth_map).reshape(-1)

res = least_squares(cost, np.zeros(n_components), args=(gt_depth_map, renderer_depth, dummy_scale), loss='soft_l1')
print(gt_weights, res.x)
# [-2.079 -1.518 -0.354 -0.521 -0.323]
# [-2.134 -0.575 -0.222 -0.330 -0.322]
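Since least_squares estimates its Jacobian with finite differences while PyTorch differentiates through the soft rasterizer, comparing the two gradients at the same point seems like a natural check. A rough sketch (`numeric_grad` is an ad-hoc helper written here for illustration, using central differences on the scalar loss):

def numeric_grad(w, eps=1e-3):
    # Central-difference estimate of d(loss)/d(w_j) for each component.
    g = torch.zeros_like(w)
    for j in range(w.numel()):
        e = torch.zeros_like(w)
        e[j] = eps
        with torch.no_grad():
            lp = loss_fun(gt_depth_map, renderer_depth(get_mesh(w + e, scale=dummy_scale)))
            lm = loss_fun(gt_depth_map, renderer_depth(get_mesh(w - e, scale=dummy_scale)))
        g[j] = (lp - lm) / (2 * eps)
    return g

w = torch.zeros(n_components, requires_grad=True)
loss = loss_fun(gt_depth_map, renderer_depth(get_mesh(w, scale=dummy_scale)))
loss.backward()
print(w.grad)                    # autograd gradient through the soft rasterizer
print(numeric_grad(w.detach()))  # finite-difference estimate; should roughly agree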

That's the reason I'm posting here. I've been through all the examples and read a whole lot of documentation, and I can't shake the feeling that there is something I'm missing.

bottler commented 9 months ago

I'm afraid this is a modelling question we can't help with. It may be that the code is not doing what you think it does, or that the gradient is simply not enough.