Unable to reproduce the results shown in the bundle_adjustment tutorial

facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

https://pytorch3d.org/

Unable to reproduce the results shown in the bundle_adjustment tutorial #1071

Open LoicFerrot opened 2 years ago

LoicFerrot commented 2 years ago

🐛 Bugs / Unexpected behaviors

Unable to reproduce the results shown in the plots of the bundle adjustment tutorial, with the script unmodified.

Instructions To Reproduce the Issue:

The exact command(s) you ran: Go to bundle adjustment tutorial, click on run in google colab, run all cells.

What you observed: The imprecise result in the screenshot, which should be compared with the "expected result" from the notebook introduction.

To make sure this wasn't just bad luck with the random seed, I commented out the line torch.manual_seed(42) in the second code cell and ran the last cell more than 20 times. Each run gave a different result, and only one matched the ground truth perfectly. In the other cases the final estimate was qualitatively as bad as in the screenshot above. Maybe I also had bad luck with my trials, but I wanted to let you know about this discrepancy between expected and actual behavior.

Proposed correction: Using 4000 iterations instead of 2000 and increasing the learning rate from 0.1 to 0.4 helps reach the correct solution; with the initial parameters, the plots show that the poses had not finished converging at 2000 iterations. The correct solution is still not reached consistently, however (e.g. running once with manual seed = 0 and then running a second time doesn't converge completely).
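
To put a number on "not consistently reached", the repeated-runs experiment described above could be scripted along the following lines. This is only a sketch, not part of the tutorial: `convergence_rate`, `run_once`, `toy_run`, and the tolerance `tol=1e-3` are all hypothetical names and values. In the real notebook, `run_once(seed)` would call torch.manual_seed(seed), run the last cell with the chosen learning rate and iteration count, and return the final camera_distance as a float. Here a toy stand-in is used so that the snippet runs on its own.

```python
import random
from typing import Callable, Iterable


def convergence_rate(
    run_once: Callable[[int], float],
    seeds: Iterable[int],
    tol: float = 1e-3,  # assumed threshold for "converged"; not from the tutorial
) -> float:
    """Return the fraction of seeds whose final loss is below `tol`.

    `run_once(seed)` is a hypothetical wrapper that seeds the RNG, runs the
    optimization once, and returns the final loss as a float.
    """
    finals = [run_once(seed) for seed in seeds]
    return sum(loss < tol for loss in finals) / len(finals)


def toy_run(seed: int) -> float:
    # Stand-in for the real optimization, present only to make the sketch
    # runnable: a seeded coin flip decides whether the "run" converged.
    rng = random.Random(seed)
    return 1e-6 if rng.random() < 0.5 else 0.5


rate = convergence_rate(toy_run, seeds=range(20))
print(f"converged in {rate:.0%} of runs")
```

Comparing this rate for (lr=0.1, 2000 iterations) against (lr=0.4, 4000 iterations) would show whether the proposed change actually improves reliability or only helps for particular seeds.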

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 2 years ago

This issue was closed because it has been stalled for 5 days with no activity.

IamShubhamGupto commented 1 year ago

I ran the notebook as-is by clicking the Run on Colab button.

Here's the stack trace:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-32ab0bf44a42> in <cell line: 27>()
     59     if it % 200==0 or it==n_iter-1:
     60         status = 'iteration=%3d; camera_distance=%1.3e' % (it, camera_distance)
---> 61         plot_camera_scene(cameras_absolute, cameras_absolute_gt, status)
     62 
     63 print('Optimization finished.')

/content/camera_visualization.py in plot_camera_scene(cameras, cameras_gt, status)
     34     """
     35     fig = plt.figure()
---> 36     ax = fig.gca(projection="3d")
     37     ax.clear()
     38     ax.set_title(status)

TypeError: FigureBase.gca() got an unexpected keyword argument 'projection'

corresponding cell:

# initialize the absolute log-rotations/translations with random entries
log_R_absolute_init = torch.randn(N, 3, dtype=torch.float32, device=device)
T_absolute_init = torch.randn(N, 3, dtype=torch.float32, device=device)

# furthermore, we know that the first camera is a trivial one 
#    (see the description above)
log_R_absolute_init[0, :] = 0.
T_absolute_init[0, :] = 0.

# instantiate a copy of the initialization of log_R / T
log_R_absolute = log_R_absolute_init.clone().detach()
log_R_absolute.requires_grad = True
T_absolute = T_absolute_init.clone().detach()
T_absolute.requires_grad = True

# the mask that specifies which cameras are going to be optimized
#     (since we know the first camera is already correct, 
#      we only optimize over the 2nd-to-last cameras)
camera_mask = torch.ones(N, 1, dtype=torch.float32, device=device)
camera_mask[0] = 0.

# init the optimizer
optimizer = torch.optim.SGD([log_R_absolute, T_absolute], lr=.1, momentum=0.9)

# run the optimization
n_iter = 2000  # fix the number of iterations
for it in range(n_iter):
    # re-init the optimizer gradients
    optimizer.zero_grad()

    # compute the absolute camera rotations as 
    # an exponential map of the logarithms (=axis-angles)
    # of the absolute rotations
    R_absolute = so3_exp_map(log_R_absolute * camera_mask)

    # get the current absolute cameras
    cameras_absolute = SfMPerspectiveCameras(
        R = R_absolute,
        T = T_absolute * camera_mask,
        device = device,
    )

    # compute the relative cameras as a composition of the absolute cameras
    cameras_relative_composed = \
        get_relative_camera(cameras_absolute, relative_edges)

    # compare the composed cameras with the ground truth relative cameras
    # camera_distance corresponds to $d$ from the description
    camera_distance = \
        calc_camera_distance(cameras_relative_composed, cameras_relative)

    # our loss function is the camera_distance
    camera_distance.backward()

    # apply the gradients
    optimizer.step()

    # plot and print status message
    if it % 200==0 or it==n_iter-1:
        status = 'iteration=%3d; camera_distance=%1.3e' % (it, camera_distance)
        plot_camera_scene(cameras_absolute, cameras_absolute_gt, status)

print('Optimization finished.')

bottler commented 1 year ago

@IamShubhamGupto Thanks for reporting. You are seeing a separate, more fundamental problem, that the plotting code doesn't work at all with newer matplotlib. I've opened https://github.com/facebookresearch/pytorch3d/issues/1554 to track that problem. I hope to fix it soon. I'm leaving this issue open to track the original problem, which is the numerical solution not working very well.
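
For anyone hitting the TypeError above in the meantime, a possible local workaround is sketched below. It assumes a Matplotlib version that no longer accepts keyword arguments in `gca()`, and it is not necessarily the fix that will be adopted for issue #1554. The helper name `make_3d_axes` is hypothetical. The idea is to create the 3D axes with `fig.add_subplot(111, projection="3d")` instead of `fig.gca(projection="3d")`.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def make_3d_axes(fig):
    """Return 3D axes on `fig` without passing keyword arguments to gca()."""
    # fig.gca(projection="3d") raises TypeError on newer Matplotlib;
    # add_subplot still accepts the projection keyword.
    return fig.add_subplot(111, projection="3d")


fig = plt.figure()
ax = make_3d_axes(fig)
ax.clear()
ax.set_title("iteration=0")
```

In a Colab notebook the `matplotlib.use("Agg")` line can be dropped; the only change needed in camera_visualization.py would be the line that creates `ax`.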