Closed shawLyu closed 4 years ago
Hi @shawLyu, thanks for pointing it out, that's an interesting question. The main difference I've found between the two is that D3VO also estimates the brightness transformation parameters between frames. This may have an impact on the resulting uncertainty maps.
Hi @mattpoggi, thanks for your reply. I will run this experiment next.
I forgot to mention that, according to the D3VO paper, "DepthNet also predicts the depth map D_{t^s} of the right image I_{t^s}". This can also make a difference.
Hi @mattpoggi
Thanks for your innovative work. I had the same confusion before, but after conducting many experiments, I found there might be a potential issue in the implementation (not sure about it as both mono-uncertainty and D3VO did not release their code).
In my opinion, when computing the log loss, the shape of to_optimise should be the same as the shape of the uncertainty uncer. In the current code they differ:
(Pdb) to_optimise.shape
torch.Size([8, 192, 640])
(Pdb) uncer.shape
torch.Size([8, 1, 192, 640])
(Pdb) (to_optimise / uncer + torch.log(uncer)).shape
torch.Size([8, 8, 192, 640])
However, even though the shapes do not match, the operation is still legal because of broadcasting, as shown above, and it could lead to results like yours. D3VO, on the other hand, computes the loss with matching shapes, and its results look totally different. Note that the following networks use pure monodepth2 and differ only in the shape of the uncertainty; no extra tricks (brightness transformation, right disparity prediction, or augmentation) are used.
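To make the shape argument concrete, here is a minimal sketch of the broadcasting behaviour. It is not the code of either repository: the tensors are random stand-ins, the sizes are shrunk from (8, 192, 640) to keep it quick, and the helper name log_loss is hypothetical. Only the expression `to_optimise / uncer + torch.log(uncer)` comes from the debugger session above.

```python
import torch

# Small stand-in sizes (batch, height, width); the thread uses (8, 192, 640).
B, H, W = 4, 6, 10

# Random stand-ins for the tensors inspected in the debugger session.
to_optimise = torch.rand(B, H, W)        # per-pixel photometric error
uncer = torch.rand(B, 1, H, W) + 0.1     # predicted uncertainty, kept positive


def log_loss(residual, sigma):
    """Hypothetical helper: per-pixel residual / sigma + log(sigma)."""
    return residual / sigma + torch.log(sigma)


# Mismatched ranks: (B, H, W) is treated as (1, B, H, W) and broadcast
# against (B, 1, H, W), so every error map meets every uncertainty map.
mismatched = log_loss(to_optimise, uncer)

# Matching shapes: drop the channel axis so each pixel's error is paired
# only with the uncertainty predicted for that same pixel.
matched = log_loss(to_optimise, uncer.squeeze(1))

print(mismatched.shape)  # torch.Size([4, 4, 6, 10])
print(matched.shape)     # torch.Size([4, 6, 10])
```

In the broadcast result, entry [i, j] combines the error of image j with the uncertainty of image i. Only the diagonal entries [i, i] coincide with the matched version, so averaging the full tensor also mixes in pairings between different images of the batch. Whether this fully explains the difference in the uncertainty maps is still an assumption.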
Please let me know if I have any misunderstanding about your paper, thank you.
Hi, thanks for your great work. I noticed that two monocular depth estimation (MDE) works at CVPR 2020 use an uncertainty loss; the other one is D3VO. Both use the same uncertainty loss (the log section in your paper), but they produce totally different uncertainty maps. I can reproduce uncertainty maps like yours, so I'd like to ask if you know the reason for the difference. Looking forward to your reply. Thanks.