Closed by everythoughthelps 11 months ago
Do you mean this paper? https://openaccess.thecvf.com/content_cvpr_2017/papers/Zhou_Unsupervised_Learning_of_CVPR_2017_paper.pdf
Equation 2 says
$$p_s \sim K \hat{T}_{t \rightarrow s} \hat{D}_t(p_t) K^{-1} p_t$$
where $\hat{D}_t$ is the estimated target depth, so there is no reference depth here.
Or maybe you are referring to another equation?
Yes, exactly this paper. I didn't express myself clearly. If we use $r$ and $t$ to index the images in equation 2, it becomes $$p_t = K T_{r \rightarrow t} D_r(p_r) K^{-1} p_r$$ which is closer to your code. What confuses me is this: according to this equation, you are supposed to use the ref_depth $D_r$ to generate the fake tgt_img, right? But you use tgt_depth, which is the output of dispnet(tgt_img), to generate the fake tgt_img.
One reason I can think of is that (number of images in the sequence - 1) ref_depths would be needed to generate the fake tgt_img, which takes a lot of time, so you use tgt_depth to approximate the ref_depths and save time. Is that right?
I think your confusion is that this equation describes how we can reconstruct the target image by getting colors from the reference image, and not the other way around.
p_s does indeed refer to coordinates in the reference image, but it tells us where to pick the color for the pixel that will be at coordinate p_t, which is in the target image.
This is also the reason why this operation is called inverse warp and not simply warp. The target depth is used to reconstruct the target image, even though we already know the target image, since we used it to get the depth.
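To make the direction of the lookup concrete, here is a minimal pure-Python sketch of inverse warping. It is not the repository's implementation: the helper names (`matvec`, `project_to_ref`, `inverse_warp`), the toy intrinsics, and the nearest-neighbor sampling are all assumptions made for illustration (the paper uses differentiable bilinear sampling). The loop runs over target pixels, uses the target depth to find p_s, and reads the color from the reference image.

```python
def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def project_to_ref(u, v, depth_t, K, K_inv, R, t):
    """Equation 2 for one target pixel: p_s ~ K (R * D_t(p_t) * K^-1 p_t + t)."""
    ray = matvec(K_inv, [u, v, 1.0])        # back-project the target pixel to a ray
    point_t = [depth_t * c for c in ray]    # 3D point in the target camera frame
    rotated = matvec(R, point_t)
    point_s = [rotated[i] + t[i] for i in range(3)]  # same point in the reference frame
    proj = matvec(K, point_s)
    return proj[0] / proj[2], proj[1] / proj[2]      # pixel coordinates p_s


def inverse_warp(ref_img, tgt_depth, K, K_inv, R, t):
    """Rebuild the target image by sampling colors from the reference image."""
    h, w = len(ref_img), len(ref_img[0])
    out = [[None] * w for _ in range(h)]    # None marks pixels that fall outside ref_img
    for v in range(h):                      # iterate over TARGET pixel coordinates p_t
        for u in range(w):
            us, vs = project_to_ref(u, v, tgt_depth[v][u], K, K_inv, R, t)
            ui, vi = round(us), round(vs)   # nearest neighbor (assumption, not bilinear)
            if 0 <= ui < w and 0 <= vi < h:
                out[v][u] = ref_img[vi][ui]
    return out


# Toy setup (all values hypothetical): focal length 2, principal point at the origin.
K = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
K_inv = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no rotation
t = [1.0, 0.0, 0.0]                                      # target-to-reference translation

ref_img = [[10, 20, 30], [40, 50, 60]]
tgt_depth = [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0]]

warped = inverse_warp(ref_img, tgt_depth, K, K_inv, R, t)
identity = inverse_warp(ref_img, tgt_depth, K, K_inv, R, [0.0, 0.0, 0.0])
```

With constant depth 2 and a unit translation along x, every target pixel (u, v) maps to (u + 1, v) in the reference image, so the reconstruction is the reference image shifted by one pixel. Note that the reference depth never appears: only the depth at each target pixel is needed to decide where to sample.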
I get it! Thanks for your quick reply!
According to equation 2 in the paper, we need the depth of the reference image (ref_depth) to synthesize the target image. However, your code uses tgt_depth instead of ref_depth, which confuses me. Could you kindly clarify this? I would greatly appreciate your response. Thank you!