chensong1995 / HybridPose

HybridPose: 6D Object Pose Estimation under Hybrid Representation (CVPR 2020)
MIT License

/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt v = np.matrix(np.diag(1. / np.sqrt(v))) #81

Closed by monajalal 9 months ago

monajalal commented 10 months ago

Is this ok or should it be resolved?

/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/anaconda3/envs/hybridpose/lib/python3.10/site-packages/torch/nn/functional.py:1967: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/home/mona/HybridPose/lib/ransac_voting_gpu_layer/ransac_voting_gpu.py:546: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at ../aten/src/ATen/native/IndexingUtils.h:27.)
  direct = vertex[bi].masked_select(torch.unsqueeze(torch.unsqueeze(cur_mask, 2), 3))  # [tn,vn,2]
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/anaconda3/envs/hybridpose/lib/python3.10/site-packages/torch/nn/functional.py:1967: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
/home/mona/HybridPose/lib/ransac_voting_gpu_layer/ransac_voting_gpu.py:546: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at ../aten/src/ATen/native/IndexingUtils.h:27.)
  direct = vertex[bi].masked_select(torch.unsqueeze(torch.unsqueeze(cur_mask, 2), 3))  # [tn,vn,2]
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
/home/mona/HybridPose/./trainers/coretrainer.py:403: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))

chensong1995 commented 10 months ago

Hi Mona,

Thanks for your question! Can you set a breakpoint and take a look at the value of v?

monajalal commented 10 months ago

It takes a really long time to reach that point, but I will do it shortly. In the meantime, I will also open a separate, related issue.
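
One way to avoid waiting at a manual breakpoint is to have NumPy raise an exception the moment the invalid square root occurs. The following is a minimal sketch, not code from the HybridPose repository: the helper name `inv_sqrt_diag_checked` is hypothetical, and the input values are made up for illustration.

```python
import numpy as np

def inv_sqrt_diag_checked(v):
    """Return diag(1 / sqrt(v)), raising instead of warning on bad input.

    Hypothetical helper for illustration; not part of HybridPose.
    """
    v = np.asarray(v, dtype=np.float32)
    # Promote "invalid value" (sqrt of a negative number) and
    # divide-by-zero from RuntimeWarning to FloatingPointError.
    with np.errstate(invalid="raise", divide="raise"):
        return np.diag(1.0 / np.sqrt(v))

try:
    # The second entry is negative, so np.sqrt raises here.
    inv_sqrt_diag_checked([4.0, -1.0])
except FloatingPointError as exc:
    print("caught:", exc)
```

Wrapping only the suspect line in `np.errstate` keeps the rest of the training loop unaffected, and the resulting traceback (or `python -m pdb` post-mortem) stops exactly where `v` first contains a negative entry.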

monajalal commented 10 months ago

I am still working on getting you the results, but I also noticed these warnings, which look troublesome as well:

Epoch: [19][16/19]      Time: 0.289 (0.258)     Sym: 8.0534 (8.4049)    Mask: 0.0785 (0.0957)   Pts: 0.0528 (0.0496)    Graph: 10.5922 (11.1342)        Total: 2.4711 (2.5460)
Epoch: [19][17/19]      Time: 0.215 (0.255)     Sym: 8.4763 (8.4089)    Mask: 0.1002 (0.0959)   Pts: 0.0487 (0.0496)    Graph: 11.7995 (11.1711)        Total: 2.6145 (2.5498)
Epoch: [19][18/19]      Time: 0.097 (0.247)     Sym: 10.1351 (8.4645)   Mask: 0.1134 (0.0965)   Pts: 0.0475 (0.0495)    Graph: 12.6410 (11.2185)        Total: 2.8660 (2.5600)
Testing...
/home/mona/HybridPose/lib/ransac_voting_gpu_layer/ransac_voting_gpu.py:546: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (Triggered internally at ../aten/src/ATen/native/IndexingUtils.h:27.)
  direct = vertex[bi].masked_select(torch.unsqueeze(torch.unsqueeze(cur_mask, 2), 3))  # [tn,vn,2]
/home/mona/HybridPose/./trainers/coretrainer.py:230: RuntimeWarning: Mean of empty slice.
  edge_x = graph_pred[i_edge, 0][mask_pred == 1.].mean()
/home/mona/anaconda3/envs/hybridpose/lib/python3.10/site-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in divide
  ret = ret.dtype.type(ret / rcount)
/home/mona/HybridPose/./trainers/coretrainer.py:231: RuntimeWarning: Mean of empty slice.
  edge_y = graph_pred[i_edge, 1][mask_pred == 1.].mean()
/home/mona/HybridPose/./trainers/coretrainer.py:233: RuntimeWarning: invalid value encountered in cast
  end = np.int16(np.round(pts2d_gt[start_idx] + edge))
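
The "Mean of empty slice" warnings above suggest that `mask_pred == 1.` selected no pixels, so `.mean()` returned NaN, which then could not be cast to `int16`. A minimal sketch of guarding against that case follows. The helper name `masked_mean_or_none` is hypothetical, and whether skipping the edge is the right behavior for HybridPose is an assumption, not something confirmed in this thread.

```python
import numpy as np

def masked_mean_or_none(values, mask):
    """Mean of `values` where `mask == 1`, or None if nothing is selected.

    Hypothetical helper for illustration; not part of HybridPose.
    """
    selected = values[mask == 1.0]
    if selected.size == 0:
        # An empty selection would make .mean() return NaN with a warning.
        return None
    return float(selected.mean())

values = np.array([[1.0, 2.0], [3.0, 4.0]])
empty_mask = np.zeros((2, 2))
print(masked_mean_or_none(values, empty_mask))
```

A caller could then skip drawing the edge when the result is `None` instead of passing NaN on to `np.int16`.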

monajalal commented 10 months ago

Epoch: [199][16/19]     Time: 0.451 (0.312)     Sym: 4.0442 (3.8591)    Mask: 0.0055 (0.0088)   Pts: 0.0150 (0.0162)    Graph: 1.7577 (2.1512)  Total: 0.7361 (0.7721)
Epoch: [199][17/19]     Time: 0.275 (0.310)     Sym: 4.7510 (3.9087)    Mask: 0.0066 (0.0087)   Pts: 0.0143 (0.0161)    Graph: 2.4858 (2.1698)  Total: 0.8730 (0.7777)
Epoch: [199][18/19]     Time: 0.171 (0.303)     Sym: 2.5082 (3.8635)    Mask: 0.0070 (0.0086)   Pts: 0.0145 (0.0161)    Graph: 1.2050 (2.1387)  Total: 0.5235 (0.7695)
Testing...
Loss: 0.6235
Successfully saved model into saved_weights/linemod/ape/checkpoints/0.001/199
value of v is:  [6.1583250e+18 7.8760837e+09]
> /home/mona/HybridPose/trainers/coretrainer.py(411)fill_intermediate_predictions()
-> if np.any(v < 0):
(Pdb) v
array([6.1583250e+18, 7.8760837e+09], dtype=float32)
(Pdb) n
> /home/mona/HybridPose/trainers/coretrainer.py(415)fill_intermediate_predictions()
-> if np.any(np.isnan(v)) or np.any(np.isinf(v)):
(Pdb) n
> /home/mona/HybridPose/trainers/coretrainer.py(417)fill_intermediate_predictions()
-> v = np.matrix(np.diag(1. / np.sqrt(v)))
(Pdb) np.sqrt(v)
array([2.4815972e+09, 8.8747305e+04], dtype=float32)
(Pdb) v
array([6.1583250e+18, 7.8760837e+09], dtype=float32)
(Pdb) n
> /home/mona/HybridPose/trainers/coretrainer.py(418)fill_intermediate_predictions()
-> point_inv_half_var[i] = u * v * u.transpose()
(Pdb) v
matrix([[4.0296630e-10, 0.0000000e+00],
        [0.0000000e+00, 1.1267948e-05]], dtype=float32)

@chensong1995 is this helpful?

monajalal commented 10 months ago

My bad, here is the pdb session from the pdb.set_trace() that fires when np.any(v < 0) is true:

Epoch: [199][15/19]     Time: 0.289 (0.295)     Sym: 4.3407 (3.7749)    Mask: 0.0074 (0.0089)   Pts: 0.0170 (0.0160)    Graph: 1.9972 (2.0208)  Total: 0.8114 (0.7480)
Epoch: [199][16/19]     Time: 0.448 (0.304)     Sym: 3.9754 (3.7867)    Mask: 0.0057 (0.0087)   Pts: 0.0160 (0.0160)    Graph: 1.5828 (1.9950)  Total: 0.7215 (0.7465)
Epoch: [199][17/19]     Time: 0.260 (0.302)     Sym: 4.7488 (3.8401)    Mask: 0.0072 (0.0086)   Pts: 0.0143 (0.0159)    Graph: 2.4331 (2.0194)  Total: 0.8682 (0.7532)
Epoch: [199][18/19]     Time: 0.172 (0.295)     Sym: 2.4919 (3.7966)    Mask: 0.0077 (0.0086)   Pts: 0.0130 (0.0158)    Graph: 1.3690 (1.9984)  Total: 0.5236 (0.7458)
Testing...
Loss: 0.6320
Successfully saved model into saved_weights/linemod/ape/checkpoints/0.001/199
value of v is:  [ 7.4060767e+18 -1.6713240e+11]
> /home/mona/HybridPose/trainers/coretrainer.py(413)fill_intermediate_predictions()
-> print("Covariance matrix has negative eigenvalues")
(Pdb) l
408                 print('value of v is: ', v)
409  
410                 # Check eigenvalues
411                 if np.any(v < 0):
412                     pdb.set_trace()
413  ->                 print("Covariance matrix has negative eigenvalues")
414  
415                 # Check eigenvectors
416                 if np.any(np.isnan(v)) or np.any(np.isinf(v)):
417                     print("Eigenvectors have NaNs or Infs")
418                 v = np.matrix(np.diag(1. / np.sqrt(v)))
(Pdb) n
Covariance matrix has negative eigenvalues
> /home/mona/HybridPose/trainers/coretrainer.py(416)fill_intermediate_predictions()
-> if np.any(np.isnan(v)) or np.any(np.isinf(v)):
(Pdb) n
> /home/mona/HybridPose/trainers/coretrainer.py(418)fill_intermediate_predictions()
-> v = np.matrix(np.diag(1. / np.sqrt(v)))
(Pdb) n
/home/mona/HybridPose/./trainers/coretrainer.py:418: RuntimeWarning: invalid value encountered in sqrt
  v = np.matrix(np.diag(1. / np.sqrt(v)))
> /home/mona/HybridPose/trainers/coretrainer.py(419)fill_intermediate_predictions()
-> point_inv_half_var[i] = u * v * u.transpose()
(Pdb) v
matrix([[3.6745645e-10, 0.0000000e+00],
        [0.0000000e+00,           nan]], dtype=float32)

As you can see, we end up with a NaN because v contains a negative value.

When this happens:

(Pdb) cov
matrix([[6.5462130e+18, 2.3725201e+18],
        [2.3725201e+18, 8.5986365e+17]], dtype=float32)

As you can see, there are no inf or NaN values in the cov matrix.

With this information, could you suggest a solution? Here the covariance matrix is not positive definite, hence the negative eigenvalue. I am not sure what can be done in this situation.
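
For context, the negative eigenvalue reported above (about -1.67e11) is roughly 2e-8 of the largest one (about 7.4e18), which is below float32 machine epsilon (about 1.2e-7). It may therefore be rounding noise from a nearly rank-deficient covariance rather than a truly indefinite matrix. One common workaround, sketched below, is to decompose in float64 and clamp the eigenvalues to a small positive floor before taking the inverse square root. This is not the maintainer's fix: the function name `inv_half_cov` and the `rel_eps` parameter are hypothetical, and the decomposition call actually used in coretrainer.py is not shown in this thread.

```python
import numpy as np

def inv_half_cov(cov, rel_eps=1e-6):
    """Approximate cov^(-1/2) with eigenvalues clamped to a positive floor.

    Hypothetical helper for illustration; `rel_eps` is an assumed tolerance.
    """
    cov = np.asarray(cov, dtype=np.float64)
    cov = 0.5 * (cov + cov.T)          # enforce exact symmetry
    w, u = np.linalg.eigh(cov)         # real eigenvalues, ascending order
    # Floor relative to the largest eigenvalue; fall back to a tiny
    # positive number if every eigenvalue is non-positive.
    floor = rel_eps * max(w.max(), np.finfo(np.float64).tiny)
    w = np.clip(w, floor, None)
    return u @ np.diag(1.0 / np.sqrt(w)) @ u.T

# Covariance matrix copied from the pdb session above.
cov = np.array([[6.5462130e+18, 2.3725201e+18],
                [2.3725201e+18, 8.5986365e+17]], dtype=np.float32)
m = inv_half_cov(cov)
print(np.isfinite(m).all())
```

Clamping changes the numerical result along the near-degenerate direction, so whether it is acceptable depends on how point_inv_half_var is used downstream. Such large variances may also point to an upstream problem with the predictions.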

chensong1995 commented 10 months ago

Hi Mona,

Can you confirm that you have downloaded the pre-trained weights as instructed by the README and set the load_dir (here) to the downloaded path?

monajalal commented 10 months ago

Unfortunately, this problem happens even with freshly downloaded weights that I verified with md5sum.

chensong1995 commented 10 months ago

Hi Mona,

Thanks for your question! Can you run three iterations of trainer.test (here) and see if the visualizations look good to you?