ClementPinard / SfmLearner-Pytorch

Pytorch version of SfmLearner from Tinghui Zhou et al.
MIT License

What happens if I use 3 or more frames? #130

Closed: yangbinchao closed this 2 years ago

yangbinchao commented 3 years ago

Thank you for your excellent work. Recently I have been thinking about pose estimation, for which the paper uses 5 frames for training and testing. What happens if I use 3 or more frames? I look forward to your detailed answer, thanks!

ClementPinard commented 3 years ago

Hi, everything works the same way; the code is supposed to be flexible with respect to the number of frames in a snippet. The rule of thumb is that the fewer frames there are, the more easily the algorithm converges. This is because the pixel displacement between frames is very small, so the photometric loss is always meaningful. However, it also makes both networks less precise, because the parallax is very small as well. Finally, there are diminishing returns in adding more frames to a snippet, because at some point the pixel displacement is so large that many pixels are not visible in both frames. There were some tests with 7-frame snippets on KITTI, but it was only marginally better than 5 frames and much slower.
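To make the trade-off concrete, here is a minimal sketch that is not taken from the repository. It assumes a snippet is centered on its target frame and that each frame shifts the image horizontally by a constant number of pixels, which is a deliberately crude model. The names `reference_offsets` and `unseen_fraction`, the 40-pixel shift, and the 416-pixel width are all hypothetical.

```python
def reference_offsets(sequence_length):
    """Offsets of the reference frames relative to the target frame.

    Assumption: the target is the middle frame of an odd-length snippet.
    """
    if sequence_length < 3 or sequence_length % 2 == 0:
        raise ValueError("expected an odd sequence length >= 3")
    half = (sequence_length - 1) // 2
    return [k for k in range(-half, half + 1) if k != 0]


def unseen_fraction(offset, shift_per_frame, image_width):
    """Toy model: share of target pixels that leave the reference frame.

    Assumption: pure horizontal shift of `shift_per_frame` pixels per frame.
    """
    return min(1.0, abs(offset) * shift_per_frame / image_width)


if __name__ == "__main__":
    # Illustrative numbers only: 40 px of shift per frame, 416 px wide images.
    for n in (3, 5, 7):
        offsets = reference_offsets(n)
        worst = max(unseen_fraction(k, 40, 416) for k in offsets)
        print(n, offsets, round(worst, 3))
```

Under this model, the farthest reference frame loses a share of pixels that grows linearly with the snippet length, which is one way to picture why longer snippets eventually stop helping.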

As such, it all depends on your dataset: how much displacement and how much parallax you usually get between two frames. In general, once your training is stable enough, you should try higher values until you see no further improvement.
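The "increase until you see no improvement" advice could be sketched as a simple search. This is only an illustration: `pick_sequence_length` and the `evaluate` callback are hypothetical, `evaluate(n)` is assumed to train with n-frame snippets and return a validation error where lower is better, and the error values in the demo are placeholders, not measured results.

```python
def pick_sequence_length(evaluate, candidates=(3, 5, 7), min_gain=0.01):
    """Return the last snippet length that still improved the error.

    `evaluate(n)` is a hypothetical callback returning a validation error
    (lower is better) for training with n-frame snippets.
    """
    best_n = candidates[0]
    best_err = evaluate(best_n)
    for n in candidates[1:]:
        err = evaluate(n)
        # Stop as soon as a longer snippet gains less than `min_gain`.
        if best_err - err < min_gain:
            break
        best_n, best_err = n, err
    return best_n


if __name__ == "__main__":
    # Placeholder errors for demonstration only, not real measurements.
    placeholder_errors = {3: 0.30, 5: 0.20, 7: 0.198}
    print(pick_sequence_length(placeholder_errors.__getitem__))
```

Starting from the shortest snippet matches the suggestion to get a stable training first and only then try higher values.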