Closed TengFeiHan0 closed 4 years ago
Of course, it is possible to replace the iDispNet with a monocular depth estimation network, but I believe the performance will drop a lot, since a single monocular image cannot provide enough information to predict accurate depth. Moreover, it is possible to train the Mask R-CNN and the depth estimation network jointly by reusing the features from the RPN. But I think the performance would drop a little compared to training the depth estimation network separately, as discussed in Sec. 3.4 of our paper.
Some SOTA monocular depth estimation methods achieve results comparable to stereo methods. As for the second problem, could we use an anchor-free algorithm, for example FCOS? We could use multi-level features from the backbone to predict the disparity. I'm not sure whether this makes sense; it's just a guess.
Besides, where can I download the split_set files? If possible, would you mind sharing them with me? @f-sky
You can replace the modules in the Disp R-CNN framework as long as their functions are maintained, for example by using an anchor-free 2D detector and pooling its features to build the cost volume.
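To make the "pool the features to build the cost volume" step concrete, here is a minimal sketch of a concatenation-style cost volume, a common construction in stereo matching. It is not taken from the Disp R-CNN code: the function name `build_cost_volume`, the `(C, H, W)` layout, and the assumption that the left and right features have already been pooled to the same per-object size are all hypothetical choices made for illustration.

```python
import numpy as np

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-style cost volume (illustrative, not the repo's code).

    left_feat, right_feat: pooled feature maps of shape (C, H, W),
        assumed to come from any 2D detector (anchor-based or anchor-free).
    max_disp: number of candidate disparities to evaluate.
    Returns an array of shape (2C, max_disp, H, W).
    """
    c, h, w = left_feat.shape
    volume = np.zeros((2 * c, max_disp, h, w), dtype=left_feat.dtype)
    for d in range(max_disp):
        if d == 0:
            volume[:c, d] = left_feat
            volume[c:, d] = right_feat
        else:
            # A left pixel at column x is paired with the right pixel at x - d.
            volume[:c, d, :, d:] = left_feat[:, :, d:]
            volume[c:, d, :, d:] = right_feat[:, :, :-d]
    return volume
```

A downstream network (for example, a stack of 3D convolutions) would then regress disparity from this volume. Nothing in the construction depends on how the features were detected and pooled, which is why the detector can be swapped.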
I have added the split set to this repo.
Thanks for your great work, which inspires me a lot. I wonder whether it is feasible to use Mask R-CNN and a monocular depth estimation network to reproduce your results. Furthermore, could we consider training the two models jointly? Would you mind giving me some suggestions? @f-sky