TRI-ML / packnet-sfm

TRI-ML Monocular Depth Estimation Repository
https://tri-ml.github.io/packnet-sfm/
MIT License
1.24k stars 243 forks

Training of VelSupModel #91

Closed surfii3z closed 3 years ago

surfii3z commented 3 years ago

Hi all,

tl;dr Do we need to provide ground truth pose instead of just velocity magnitude in VelSupModel?


I have been experimenting with this repo and was impressed with the depth result.

Now I plan to train the scale-aware depth network on my custom dataset.

I have a question regarding the training of VelSupModel.

From this code,

https://github.com/TRI-ML/packnet-sfm/blob/dfbdc27202075b500577c64d3f0d6c8438b86cfd/packnet_sfm/losses/velocity_loss.py#L10-L42

the velocity loss appears to be calculated by comparing the pose predicted by the network with the GT pose. This differs from the paper, which calculates it from the difference between the magnitude of the predicted translation and the velocity.

Has anyone tried to train this VelSupModel? If so, what information did you use for the supervision?

VitorGuizilini-TRI commented 3 years ago

Hi, thank you for the interest in our repository! You don't need to provide ground-truth pose, only instantaneous velocity. We provide the full 4x4 matrix because that is available, but we don't use it. Our pred_trans and gt_trans only take the last column, which contains translation, and that's what is being used to calculate the loss.
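
To illustrate the point above, here is a minimal sketch in plain Python. It is not the repository's code. It assumes that the loss compares only the magnitudes of the predicted and reference translations (as described in the paper), and that poses are 4x4 nested lists. The names translation_magnitude, velocity_loss_sketch, and pose_from_speed are hypothetical helpers introduced only for this example.

```python
import math


def translation_magnitude(pose):
    """Euclidean norm of the translation: first three rows of the last column."""
    return math.sqrt(sum(pose[row][3] ** 2 for row in range(3)))


def velocity_loss_sketch(pred_poses, ref_poses):
    """Mean absolute difference between predicted and reference translation magnitudes.

    Assumption: only magnitudes are compared, so rotation and translation
    direction of the reference poses do not affect the result.
    """
    diffs = [
        abs(translation_magnitude(pred) - translation_magnitude(ref))
        for pred, ref in zip(pred_poses, ref_poses)
    ]
    return sum(diffs) / len(diffs)


def pose_from_speed(speed, dt):
    """Hypothetical helper: build a 4x4 matrix from a scalar speed and a time gap.

    The rotation is identity and the translation is placed along z. The
    direction is arbitrary here because only its length (speed * dt) is used.
    """
    return [
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, speed * dt],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Example: two predicted poses with translations (3, 4, 0) and (0, 0, 2).
pred_poses = [
    [[1.0, 0.0, 0.0, 3.0], [0.0, 1.0, 0.0, 4.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 2.0], [0.0, 0.0, 0.0, 1.0]],
]
# Reference built from speed only: 10 m/s over 0.5 s, and 3 m/s over 1 s.
ref_poses = [pose_from_speed(10.0, 0.5), pose_from_speed(3.0, 1.0)]

loss = velocity_loss_sketch(pred_poses, ref_poses)
```

Under these assumptions, a dataset that only has speed readings could fill the translation column with any vector of length speed times the time gap between frames. Whether the data loader in this repository accepts such a matrix for a custom dataset is something to verify against the code linked above.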

surfii3z commented 3 years ago

Thanks for your reply. I totally missed that part.