gkiavash opened 1 year ago
In order to test whether giving an initial location can reduce runtime while preserving the quality of the point cloud, first, a new dataset with the same images as #8 was chosen. Then, following the COLMAP documentation, an initial position was given to each image. At first, the positions were taken from the reconstruction in #8 with some manual perturbations, to check that the approach works. Then, other visual odometry applications were used to compute the initial positions.
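To make the "initial position per image" step concrete, here is a minimal sketch of writing the poses in the `images.txt` text format that the COLMAP documentation describes for models with known poses. The helper name and the example pose are hypothetical; COLMAP itself expects a world-to-camera rotation quaternion (QW QX QY QZ) and translation (TX TY TZ), followed by a line of 2D points, which may be left empty when the points are to be triangulated later.

```python
# Sketch: seed each frame with an initial pose in COLMAP's images.txt format.
# write_images_txt is a hypothetical helper, not part of COLMAP.

def write_images_txt(path, poses):
    """poses: list of (image_id, qw, qx, qy, qz, tx, ty, tz, camera_id, name)."""
    with open(path, "w") as f:
        f.write("# IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME\n")
        for image_id, qw, qx, qy, qz, tx, ty, tz, cam_id, name in poses:
            f.write(f"{image_id} {qw} {qx} {qy} {qz} {tx} {ty} {tz} {cam_id} {name}\n")
            f.write("\n")  # empty 2D-point line: points are triangulated later

if __name__ == "__main__":
    # Identity orientation at the origin for the first frame (example values).
    poses = [(1, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1, "frame_0001.jpg")]
    write_images_txt("images.txt", poses)
```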
It is observed that the total runtime decreased from 2 hours to 6 minutes, and a point cloud of the same quality was obtained. The perturbed camera poses were refined correctly, and the number of bundle adjustment steps decreased significantly.
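For reproducibility, one way to apply such a manual perturbation is to compose each pose quaternion with a small extra rotation. This is a self-contained sketch, not code from the experiment; the 5-degree yaw offset is an arbitrary example value.

```python
import math

def quat_mul(a, b):
    # Hamilton product of quaternions in (w, x, y, z) order.
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def perturb(q, axis, angle_rad):
    # Rotate pose quaternion q by a small rotation about a unit axis,
    # simulating a noisy initial pose that bundle adjustment must refine.
    s = math.sin(angle_rad / 2.0)
    dq = (math.cos(angle_rad / 2.0), axis[0]*s, axis[1]*s, axis[2]*s)
    return quat_mul(dq, q)

q0 = (1.0, 0.0, 0.0, 0.0)                            # identity orientation
q1 = perturb(q0, (0.0, 0.0, 1.0), math.radians(5))   # example 5-degree yaw offset
```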
There are several approaches to visual odometry. I was able to obtain good results from two of them:
I tried to build ORB-SLAM2 & 3 and VSO, but their installations were too complex and full of versioning conflicts (my tries). I did find some Docker images with ORB-SLAM3 preinstalled, but I still couldn't get results from them. I am still looking for better visual SLAM implementations.
Having initial guesses for the camera poses can reduce the execution time significantly. I have also been working on adding frames sequentially (here). I succeeded in forcing the pipeline to register only the next frame at each step, but the result does not have enough quality yet. I am working on combining the initial poses with sequential frame registration.
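The "register only the next frame" idea can be sketched with a toy selection rule. Default incremental SfM picks the candidate image that sees the most already-triangulated points; for sequential video frames, the rule is replaced by "always take the lowest unregistered frame index". Everything below is illustrative pure Python, not COLMAP's API.

```python
# Toy sketch of forced next-frame-only registration order.

def pick_next_image(registered, candidates):
    # Forced-sequential rule: the lowest-index frame not yet registered.
    unregistered = sorted(set(candidates) - set(registered))
    return unregistered[0] if unregistered else None

frames = list(range(5))   # frame indices of a short video clip
registered, order = [], []
while True:
    nxt = pick_next_image(registered, frames)
    if nxt is None:
        break             # all frames registered
    registered.append(nxt)
    order.append(nxt)
# order is strictly the video order: [0, 1, 2, 3, 4]
```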
For the demo, the same dataset described in #8 is used. First, I extracted camera poses with COLMAP itself by reducing the number of features; this took 15 minutes. Then I used those camera poses for the main high-quality reconstruction, which took only 4 minutes.
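A plausible shape for this two-pass demo is sketched below as the list of COLMAP CLI invocations it would produce. The subcommands (`feature_extractor`, `sequential_matcher`, `mapper`, `point_triangulator`) and the `SiftExtraction.max_num_features` option are real COLMAP CLI entry points, but the paths, the feature cap of 1024, and the two-database layout are assumptions for illustration; the thread does not state the exact settings used.

```python
# Sketch: fast low-feature pass for poses, then triangulation of
# full-quality features against those poses. Commands are built as
# lists (e.g. for subprocess.run) but not executed here.

def two_pass_commands(images="frames/", work="work/"):
    fast = [
        ["colmap", "feature_extractor",
         "--database_path", work + "fast.db", "--image_path", images,
         "--SiftExtraction.max_num_features", "1024"],  # fewer features -> faster
        ["colmap", "sequential_matcher", "--database_path", work + "fast.db"],
        ["colmap", "mapper",
         "--database_path", work + "fast.db", "--image_path", images,
         "--output_path", work + "fast_sparse"],
    ]
    quality = [
        ["colmap", "feature_extractor",
         "--database_path", work + "hq.db", "--image_path", images],
        ["colmap", "sequential_matcher", "--database_path", work + "hq.db"],
        # Triangulate high-quality points against the fast-pass poses.
        ["colmap", "point_triangulator",
         "--database_path", work + "hq.db", "--image_path", images,
         "--input_path", work + "fast_sparse/0",
         "--output_path", work + "hq_sparse"],
    ]
    return fast + quality
```

In a real run the image identifiers would have to agree between the two databases (or a single shared database would be used), which is a detail this sketch glosses over.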
Here are some screenshots of the final point cloud which can be compared to https://github.com/gkiavash/Master-Thesis-Structure-from-Motion/issues/8#issuecomment-1374847826
As you can see, the poses are wrong: the camera heading is tilted to the left when it should point forward.
Introduction
The basic idea is to speed up the SfM pipeline in COLMAP.
Currently, high-quality structure-from-motion pipelines are far from real-time usage. For example, in our previous experiment (#8), the dataset contains 175 distorted frames at 3 fps, i.e., about 58 seconds of video, and it took 2 hours to compute the final point cloud and camera poses. It takes much longer when the images are distorted and the camera parameters are given.
The main problem
After investigating the execution time and the COLMAP logs, I realized that the most time-consuming parts are finding the initial pair and registering each next image. Since the images are sequential video frames, the order in which to register the next images is known. Also, with an initial guess for the next frame's position, bundle adjustment starts with less error and converges faster.
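The convergence claim can be illustrated with a toy experiment: Newton's method (the core step inside bundle adjustment's nonlinear least squares) needs far fewer iterations when started near the solution. The residual below is synthetic, not a real reprojection error, and the starting points are arbitrary examples.

```python
import math

def newton_iters(x0, target=2.0, tol=1e-8, max_iter=100):
    # Count Newton iterations to drive a synthetic residual
    # r(x) = exp(x) - exp(target) to zero, starting from x0.
    x, iters = x0, 0
    while abs(x - target) > tol and iters < max_iter:
        r = math.exp(x) - math.exp(target)   # residual
        j = math.exp(x)                      # Jacobian (derivative)
        x -= r / j                           # Newton / Gauss-Newton step
        iters += 1
    return iters

far = newton_iters(0.0)    # poor initial guess: many iterations
near = newton_iters(1.9)   # good initial guess (e.g. odometry prior): few iterations
```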