zju3dv / disprcnn

Code release for Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation (CVPR 2020, TPAMI 2021)
Apache License 2.0

Is it possible to modify your model to monocular version? #14

Closed TengFeiHan0 closed 4 years ago

TengFeiHan0 commented 4 years ago

Thanks for your great work, which has inspired me a lot. I wonder whether it is feasible to use Mask R-CNN and a monocular depth estimation network to reproduce your results. Furthermore, could we consider training the two models jointly? Would you mind giving me some suggestions? @f-sky

ootts commented 4 years ago

Of course, it is possible to replace iDispNet with a monocular depth estimation network, but I believe the performance will drop a lot, since a single monocular image cannot provide enough information to predict accurate depth. Moreover, it is possible to train Mask R-CNN and the depth estimation network jointly by reusing the features from the RPN. But I think the performance would drop a little compared to training the depth estimation network separately, as discussed in Sec. 3.4 of our paper.
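
For anyone trying this swap: the rest of a stereo pipeline consumes disparity, so a monocular depth map has to be converted first using the standard relation disparity = focal length × baseline / depth for a rectified pinhole stereo pair. Below is a minimal sketch of that conversion. The function name `depth_to_disparity` and the numbers in the usage note are illustrative assumptions, not code or calibration values from this repo.

```python
import numpy as np

def depth_to_disparity(depth_m, focal_px, baseline_m, eps=1e-6):
    """Convert metric depth to disparity in pixels for a rectified stereo pair.

    Hypothetical helper, not part of the Disp R-CNN code base.
    depth_m:    array of depths in meters (values <= eps are treated as invalid)
    focal_px:   focal length in pixels
    baseline_m: distance between the two camera centers in meters
    """
    depth = np.asarray(depth_m, dtype=np.float64)
    disparity = np.zeros_like(depth)       # invalid pixels stay at 0
    valid = depth > eps                    # avoid division by zero
    disparity[valid] = focal_px * baseline_m / depth[valid]
    return disparity
```

For example, with an assumed focal length of 700 px and a 0.5 m baseline, a point 10 m away maps to a disparity of 35 px. Note that the conversion only changes the representation; it does not recover the accuracy lost by estimating depth from a single image.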

TengFeiHan0 commented 4 years ago

Some SOTA monocular depth estimation methods achieve results comparable to stereo methods. As for the second problem, could we use an anchor-free algorithm, for example FCOS? We could use multi-level features from the backbone to predict the disparity. I'm not sure whether this makes sense; it's just a guess.

TengFeiHan0 commented 4 years ago

Besides, where can I download split_set? If possible, would you mind sharing these files with me? @f-sky

ootts commented 4 years ago

You can replace the modules in the Disp R-CNN framework as long as their functions are maintained, for example by using an anchor-free 2D detector and pooling its features to build the cost volume.
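
To make "build the cost volume from pooled features" concrete, here is a rough sketch of a correlation-style cost volume over a pair of left/right RoI feature maps. It is only one common way to form a cost volume and is an assumption for illustration; the function name `correlation_cost_volume` is hypothetical, and the volume used by iDispNet in this repo may be constructed differently.

```python
import numpy as np

def correlation_cost_volume(left_feat, right_feat, max_disp):
    """Correlation cost volume from two (C, H, W) feature maps.

    Hypothetical helper for illustration only.
    Entry [d, y, x] is the channel-averaged product of the left feature at
    column x and the right feature at column x - d. Columns with x < d have
    no match in the right map and are left at 0.
    """
    left = np.asarray(left_feat, dtype=np.float64)
    right = np.asarray(right_feat, dtype=np.float64)
    _, h, w = left.shape
    volume = np.zeros((max_disp, h, w), dtype=np.float64)
    for d in range(max_disp):
        if d == 0:
            volume[d] = (left * right).mean(axis=0)
        else:
            # Shift the right map by d columns before comparing.
            volume[d, :, d:] = (left[:, :, d:] * right[:, :, :-d]).mean(axis=0)
    return volume
```

The feature maps could come from any 2D detector's RoI pooling (anchor-based or anchor-free), which is why the detector can be swapped as long as it still yields aligned left/right features. Taking the argmax over the first axis gives a coarse per-pixel disparity estimate; learned pipelines usually regress disparity from the volume with further layers instead.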

ootts commented 4 years ago

I have added the split set to this repo.